dccvalidator

dccvalidator is a package and Shiny app to perform data validation and QA/QC. It’s used in the AMP-AD and PsychENCODE consortia to validate data prior to data releases.

Installation

You can install dccvalidator with the following command:

devtools::install_github("Sage-Bionetworks/dccvalidator")

Many functions in dccvalidator use reticulate and the Synapse Python client. See the reticulate documentation for information on how to set R to use a specific version of Python if you don’t want to use the default Python installation on your machine. Whichever Python installation you choose should have synapseclient installed.
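
As a minimal sketch of that setup, the snippet below points reticulate at a particular Python and then checks that synapseclient can be imported from it. The path is a placeholder (an assumption, not something this project prescribes); substitute the interpreter you actually want, and run this before any other reticulate call in the session.

```r
# Tell reticulate which Python to use. The path below is a placeholder
# (assumption); replace it with the interpreter on your machine.
# required = TRUE makes reticulate raise an error instead of silently
# falling back to another Python if this one cannot be used.
reticulate::use_python("/usr/local/bin/python3", required = TRUE)

# Check that the selected Python can import the Synapse client.
# Returns TRUE if the module is available, FALSE otherwise.
reticulate::py_module_available("synapseclient")
```

If the second call returns FALSE, install synapseclient into that same Python installation and restart R before trying again.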

Because dccvalidator uses reticulate, it is not compatible with the synapser package.

Check data

dccvalidator provides functions for checking common data quality issues.

Data submission validation

This package contains a Shiny app to validate manifests and metadata for AMP-AD studies. It uses the dccvalidator package to check for common data quality issues and gives data contributors real-time feedback on errors that need to be fixed. The reporting UI is heavily inspired by the MetaDIG project’s metadata quality reports.

The application also allows users to submit documentation of their study, a description of the methods used, etc.

See the customizing dccvalidator vignette for information on how to spin up a customized version of the application.

Local development

dccvalidator uses pre-commit hooks to check for common issues, such as code style (which should conform to tidyverse style), code parsability, and up-to-date .Rd documentation. To use the hooks, you will need to install pre-commit. If you are on a Mac, I recommend using Homebrew:

brew install pre-commit

Then, within this git repo, run:

pre-commit install

When you commit your changes, pre-commit will run the checks described above, and the commit will fail if the checks do not pass. If you are experiencing issues with the checks and want to commit your work and deal with them later, you can run git commit --no-verify to skip all checks. Alternatively, you can skip individual hooks by their ID (as listed in the file .pre-commit-config.yaml), e.g. SKIP=roxygenize git commit -m "foo".


Please note that the dccvalidator project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.