bigReg: Generalized Linear Models (GLM) for Large Data Sets
Allows the user to fit GLMs on very large data sets. Data can be created
using the data_frame() function and appended to the object with
object$append(data); data_frame and data_matrix objects are available that
allow the user to store large data on disk. The data is stored as doubles in
binary format; any character columns are converted to factors and then stored
as numeric (binary) data, while a look-up table is stored in a separate
.meta_data file in the same folder. The data is stored in blocks, and the GLM
regression algorithm is modified to carry out a MapReduce-like procedure to
fit the model. The functions bglm(), summary() and bglm_predict() are
available for creating and post-processing models. The library requires
Armadillo to be installed on your system. It probably won't work on Windows,
since multi-core processing is done using mclapply(), which forks R on
Unix/Linux-type operating systems.
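
The block-wise, MapReduce-like fitting idea described above can be sketched in base R. This is not bigReg's implementation or API: block_stats(), add_stats() and fit_blockwise_glm() are hypothetical names invented for this illustration, the blocks are held in memory rather than read from disk, and the algorithm shown is ordinary iteratively reweighted least squares (IRLS). In each iteration, every block contributes its share of X'WX and X'Wz (the map step), the contributions are summed (the reduce step), and the summed system is solved for the coefficients.

```r
library(parallel)

# Map step (hypothetical helper): one block's contribution to the weighted
# normal equations X'WX and X'Wz at the current coefficients.
block_stats <- function(block, beta, family) {
  X   <- block$X
  y   <- block$y
  eta <- drop(X %*% beta)            # linear predictor
  mu  <- family$linkinv(eta)         # fitted mean
  gp  <- family$mu.eta(eta)          # d mu / d eta
  w   <- gp^2 / family$variance(mu)  # IRLS working weights
  z   <- eta + (y - mu) / gp         # working response
  list(XtWX = crossprod(X, w * X), XtWz = crossprod(X, w * z))
}

# Reduce step (hypothetical helper): add the contributions of two blocks.
add_stats <- function(a, b) {
  list(XtWX = a$XtWX + b$XtWX, XtWz = a$XtWz + b$XtWz)
}

# IRLS over blocks. Starting from beta = 0 is an assumption that suits
# log and logit links; it is not a safe start for every family.
fit_blockwise_glm <- function(blocks, family, cores = 1L,
                              tol = 1e-10, maxit = 50L) {
  beta <- rep(0, ncol(blocks[[1]]$X))
  for (iter in seq_len(maxit)) {
    parts <- mclapply(blocks, block_stats, beta = beta, family = family,
                      mc.cores = cores)
    total <- Reduce(add_stats, parts)
    beta_new <- drop(solve(total$XtWX, total$XtWz))
    converged <- max(abs(beta_new - beta)) < tol
    beta <- beta_new
    if (converged) break
  }
  beta
}

# Simulated Poisson data split into four blocks.
set.seed(1)
n <- 2000
x <- rnorm(n)
X <- cbind(`(Intercept)` = 1, x = x)
y <- rpois(n, exp(0.5 + 0.3 * x))
idx <- split(seq_len(n), rep(1:4, length.out = n))
blocks <- lapply(idx, function(i) list(X = X[i, , drop = FALSE], y = y[i]))

beta_block <- fit_blockwise_glm(blocks, poisson())
beta_ref   <- coef(glm(y ~ x, family = poisson()))
print(cbind(blockwise = beta_block, glm = beta_ref))
```

With cores = 1L, mclapply() runs serially, so the sketch also runs on Windows; on Unix-like systems a larger value forks one worker per core. Only the small p-by-p and p-by-1 summaries travel between the map and reduce steps, which is why this kind of scheme suits data that does not fit in memory.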
Depends: R (≥ 3.2.0), Rcpp (≥ 0.12.3), parallel, methods, stats, uuid (≥ 0.1-2), MASS (≥ 7.3-39)
LinkingTo: Rcpp, RcppArmadillo (≥ 0.5.200.1.0)
Maintainer: Chibisi Chima-Okereke <chibisi at active-analytics.com>
License: GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]