Developer Notes
R Package
For developing the pkg locally, you can clone the repo, open
gpgr.Rproj
in RStudio, then work with
devtools
. You’ll also need to install roxytest to get the tests
and documentation working.
Command | Comment |
---|---|
devtools::document() |
document the pkg (Cmd+Shift+D (needs shortcut
change)) |
devtools::load_all() |
load the pkg (Cmd+Shift+L ) |
devtools::check() |
check the pkg (Cmd+Shift+E ) |
devtools::install() |
install the pkg (Cmd+Shift+B ) |
R CMD INSTALL --no-multiarch --with-keep.source git/gpgr |
install the pkg into the first library of .libPaths()
(useful when using in conda env) |
Data Version Control
Instead of committing big data files into git
, we can
store the files remotely (e.g. Google Drive, AWS S3) and just keep track
of the data updates in git
using DVC.
The R package uses the data files when running the code examples and
tests, and when rendering the vignettes. A simple dvc pull
will pull the data from the remote storage and allow these processes to
take place.
Command | Comments |
---|---|
dvc init |
initialise dvc |
dvc remote add -d storage gdrive://<...> |
add GDrive remote |
dvc check-ignore * |
check what is dvc-ignored |
dvc list https://github.com/umccr/gpgr inst/extdata |
list dvc’ed data |
dvc add path/to/folder |
adds folder (and its contents) to dvc |
dvc add path/to/file.txt |
adds file.txt to dvc |
dvc push |
pushes dvc data to remote |
dvc pull |
pulls dvc data from remote |
GitHub Actions
- For DVC, make sure you install
dvc
and the corresponding remote package e.g.dvc-gdrive
,dvc-s3
etc., or else you’ll get something completely puzzling like:
WARNING: No file hash info found for 'FOO.txt'. It won't be created.
ERROR: failed to pull data from the cloud - Checkout failed for following targets:
Is your cache up to date?
<https://error.dvc.org/missing-files>