Data quality issues such as missing values and outliers are often interdependent, which makes preprocessing time-consuming and can lead to suboptimal performance in knowledge discovery tasks. This package supports preprocessing decisions by visualizing interdependent data quality issues by means of feature construction. Users can define their own domain-specific constructed features that express the quality of a data point, such as the number of missing values in the point, or use the nine default features. The results can be explored with plot methods, and the feature-constructed data can be retrieved with get methods.
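
The idea of a constructed feature can be illustrated with a minimal base R sketch. It does not use the preproviz API: the data frame `df`, the helper `robust_z`, and the feature names `missing_count` and `max_abs_z` are all hypothetical, chosen only to show how a per-point quality score might be derived from raw data.

```r
# Toy data with missing values and one extreme value (made-up numbers).
df <- data.frame(
  a = c(1.0, 2.0, NA, 4.0, 100.0),
  b = c(NA, 1.5, NA, 2.5, 3.0),
  c = c(0.5, 0.7, 0.6, NA, 0.8)
)

# Constructed feature 1: number of missing values in each data point (row).
missing_count <- rowSums(is.na(df))

# Hypothetical helper: absolute robust z-score of each value within its
# column, using the median and the median absolute deviation.
robust_z <- function(x) {
  abs(x - median(x, na.rm = TRUE)) / mad(x, na.rm = TRUE)
}

# Constructed feature 2: the largest robust z-score found in each row,
# a rough indication of how outlying the data point is.
z <- sapply(df, robust_z)
max_abs_z <- apply(z, 1, max, na.rm = TRUE)

# One row per data point, one column per constructed quality feature.
constructed <- data.frame(missing_count, max_abs_z)
print(constructed)
```

Plotting such features against each other is one way to see whether, for example, points with many missing values also tend to be outlying. The package's own default features and plot methods are described in the manual and vignette.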

Documentation

Manual: preproviz.pdf
Vignette: Preproviz

Maintainer: Markus Vattulainen <markus.vattulainen at gmail.com>

Author(s): Markus Vattulainen

Install package and any missing dependencies by running this line in your R console:

install.packages("preproviz")

Depends R (>= 3.2.2)
Imports caret, DMwR, randomForest, ClustOfVar, reshape2, ggplot2, ggdendro, gridExtra, methods, utils, stats
Suggests testthat, rmarkdown, knitr, preprocomb
Enhances
Linking to
Reverse depends
Reverse imports
Reverse suggests preprocomb, preprosim
Reverse enhances
Reverse linking to

Package preproviz
Materials
URL https://github.com/mvattulainen/preproviz
Task Views
Version 0.2.0
Published 2016-07-09
License GPL-2
BugReports https://github.com/mvattulainen/preproviz/issues
SystemRequirements
NeedsCompilation no
Citation
CRAN checks preproviz check results
Package source preproviz_0.2.0.tar.gz