Developer Notes
R Package
For developing the pkg locally, you can clone the repo, open
gpgr.Rproj in RStudio, then work with
devtools. You’ll also need to install roxytest to get the tests
and documentation working.
| Command | Comment |
|---|---|
| `devtools::document()` | document the pkg (Cmd+Shift+D; needs shortcut change) |
| `devtools::load_all()` | load the pkg (Cmd+Shift+L) |
| `devtools::check()` | check the pkg (Cmd+Shift+E) |
| `devtools::install()` | install the pkg (Cmd+Shift+B) |
| `R CMD INSTALL --no-multiarch --with-keep.source git/gpgr` | install the pkg into the first library of `.libPaths()` (useful when working in a conda env) |
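The devtools steps in the table can also be run from a terminal, which may be handy in a conda env without RStudio. The following is a minimal sketch, assuming `Rscript` is on the `PATH` and devtools is installed; the `dev_step` function name and the `DRY_RUN` variable are hypothetical helpers introduced here, not part of gpgr.

```shell
# Run one devtools step from the package root, without RStudio.
# dev_step and DRY_RUN are illustrative names, not part of gpgr.
# Set DRY_RUN=1 to print the command instead of running it.
dev_step() {
    case "$1" in
        document|check|install) ;;
        *) echo "usage: dev_step document|check|install" >&2; return 2 ;;
    esac
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "Rscript -e 'devtools::$1()'"
    else
        Rscript -e "devtools::$1()"
    fi
}
```

For example, `DRY_RUN=1 dev_step check` prints the command that `dev_step check` would run.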
Data Version Control
Instead of committing big data files into git, we can
store the files remotely (e.g. Google Drive, AWS S3) and just keep track
of the data updates in git using DVC.
The R package uses the data files when running the code examples and
tests, and when rendering the vignettes. A simple dvc pull
will pull the data from the remote storage and allow these processes to
take place.
| Command | Comments |
|---|---|
| `dvc init` | initialise dvc |
| `dvc remote add -d storage gdrive://<...>` | add GDrive remote |
| `dvc check-ignore *` | check what is dvc-ignored |
| `dvc list https://github.com/umccr/gpgr inst/extdata` | list dvc'ed data |
| `dvc add path/to/folder` | adds folder (and its contents) to dvc |
| `dvc add path/to/file.txt` | adds file.txt to dvc |
| `dvc push` | pushes dvc data to remote |
| `dvc pull` | pulls dvc data from remote |
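Before running examples, tests or vignettes, it can help to check whether the dvc'ed data has actually been pulled. The following is a rough sketch, assuming the default DVC layout in which `dvc add path/to/file.txt` writes a sibling pointer file `path/to/file.txt.dvc`; the `missing_dvc_data` function name is hypothetical.

```shell
# List DVC-tracked targets whose data is not present locally.
# Assumes the default layout: the pointer file `foo.dvc` sits next to `foo`.
# missing_dvc_data is an illustrative name, not a dvc command.
missing_dvc_data() {
    root="${1:-.}"
    find "$root" -type f -name '*.dvc' | sort | while IFS= read -r pointer; do
        target="${pointer%.dvc}"    # strip the .dvc suffix to get the data path
        [ -e "$target" ] || printf '%s\n' "$target"
    done
}
```

If `missing_dvc_data inst/extdata` prints anything, a `dvc pull` is probably needed before the tests can run.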
GitHub Actions
- For DVC, make sure you install `dvc` and the corresponding remote package, e.g. `dvc-gdrive`, `dvc-s3` etc., or else you'll get something completely puzzling like:
WARNING: No file hash info found for 'FOO.txt'. It won't be created.
ERROR: failed to pull data from the cloud - Checkout failed for following targets:
Is your cache up to date?
<https://error.dvc.org/missing-files>
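
One way to avoid the puzzling message above is to fail early in the workflow with a clearer one. The following is a minimal sketch, assuming `dvc` and its remote package were installed with pip for the `python3` on the `PATH`; the `check_dvc_setup` function name is hypothetical, and a conda-based install may need a different check.

```shell
# Fail early if dvc or its remote package is missing, rather than
# letting `dvc pull` fail later with "No file hash info found".
# $1 is the remote package name, e.g. dvc-gdrive or dvc-s3.
# check_dvc_setup is an illustrative name, not a dvc command.
check_dvc_setup() {
    plugin="$1"
    if ! command -v dvc >/dev/null 2>&1; then
        echo "dvc is not installed" >&2
        return 1
    fi
    # Assumption: the plugin was installed with pip into this python3.
    if ! python3 -m pip show "$plugin" >/dev/null 2>&1; then
        echo "$plugin is not installed" >&2
        return 1
    fi
}
```

A workflow step could then run `check_dvc_setup dvc-gdrive && dvc pull`, so a missing package is reported by name.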