![GitHub release (latest SemVer)](https://img.shields.io/github/v/release/carissalow/rapids?style=plastic)
[![Snakemake](https://img.shields.io/badge/snakemake-≥5.7.1-brightgreen.svg?style=flat)](https://snakemake.readthedocs.io)
[![Documentation Status](https://github.com/carissalow/rapids/workflows/docs/badge.svg)](https://www.rapids.science/)
![tests](https://github.com/carissalow/rapids/workflows/tests/badge.svg)
[![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-v2.0%20adopted-ff69b4.svg)](code_of_conduct.md)

# RAPIDS

**R**eproducible **A**nalysis **Pi**peline for **D**ata **S**treams

For more information refer to our [documentation](http://www.rapids.science)

By [MoSHI](https://www.moshi.pitt.edu/), [University of Pittsburgh](https://www.pitt.edu/)

## Installation

For RAPIDS installation refer to the [documentation](https://www.rapids.science/1.8/setup/installation/)

### For the installation of the Docker version

1. Follow the [instructions](https://www.rapids.science/1.8/setup/installation/) to set up RAPIDS via Docker (from scratch).

2. Delete the current contents of the /rapids/ folder while in a container session.
    ```
    cd ..
    rm -rf rapids/{*,.*}
    cd rapids
    ```

3. Clone the RAPIDS workspace from Git and check out a specific branch.
    ```
    git clone "https://repo.ijs.si/junoslukan/rapids.git" .
    # Append the name of the branch you need.
    git checkout
    ```

4. Install the missing `libpq-dev` dependency with bash.
    ```
    apt-get update -y
    apt-get install -y libpq-dev
    ```

5. Restore the R venv. Type R to enter the interactive R session and then run:
    ```
    renv::restore()
    ```

6. Install the cr-features module from https://repo.ijs.si/matjazbostic/calculatingfeatures.git (branch master). Then follow the "cr-features module" section below.

7. Install all required packages from `environment.yml`. The `--prune` flag also removes conda packages that are not listed in the environment file.
    ```
    conda env update --file environment.yml --prune
    ```

8. To update your R or Python venvs, run:
    ```
    # R, in an interactive session:
    renv::snapshot()

    # Python:
    conda env export --no-builds | sed 's/^.*libgfortran.*$/  - libgfortran/' | sed 's/^.*mkl=.*$/  - mkl/' > environment.yml
    ```

### cr-features module

This RAPIDS extension uses the cr-features library, accessible [here](https://repo.ijs.si/matjazbostic/calculatingfeatures).

To use the cr-features library:

- Follow the installation instructions in the [README.md](https://repo.ijs.si/matjazbostic/calculatingfeatures/-/blob/master/README.md).

- Copy the built calculatingfeatures folder into the RAPIDS workspace.

- Install the cr-features package with:
    ```
    pip install path/to/the/calculatingfeatures/folder
    # For example, if the folder was copied to the main parent directory:
    pip install ./calculatingfeatures
    ```

The cr-features package has to be rebuilt and reinstalled every time to get the newest version. Alternatively, use the newest version of the Docker image.

## Updating RAPIDS

To update RAPIDS, first pull and merge [origin](https://github.com/carissalow/rapids), such as with:

```commandline
git fetch --progress "origin" refs/heads/master
git merge --no-ff origin/master
```

Next, update the conda and R virtual environments.

```bash
R -e 'renv::restore(repos = c(CRAN = "https://packagemanager.rstudio.com/all/__linux__/focal/latest"))'
```

## Custom configuration

### Credentials

As mentioned under [Database in RAPIDS documentation](https://www.rapids.science/1.6/snippets/database/), a `credentials.yaml` file is needed to connect to a database.
It should contain:

```yaml
PSQL_STRAW:
  database: staw
  host: 212.235.208.113
  password: password
  port: 5432
  user: staw_db
```

where `password` needs to be specified as well.

## Possible installation issues

### Missing dependencies for RPostgres

When installing the `RPostgres` R package (used to connect to the PostgreSQL database), the following error might occur:

```text
------------------------- ANTICONF ERROR ---------------------------
Configuration failed because libpq was not found. Try installing:
 * deb: libpq-dev (Debian, Ubuntu, etc)
 * rpm: postgresql-devel (Fedora, EPEL)
 * rpm: postgreql8-devel, psstgresql92-devel, postgresql93-devel, or postgresql94-devel (Amazon Linux)
 * csw: postgresql_dev (Solaris)
 * brew: libpq (OSX)
If libpq is already installed, check that either:
(i)  'pkg-config' is in your PATH AND PKG_CONFIG_PATH contains a libpq.pc file; or
(ii) 'pg_config' is in your PATH.
If neither can detect , you can set INCLUDE_DIR and LIB_DIR manually via:
R CMD INSTALL --configure-vars='INCLUDE_DIR=... LIB_DIR=...'
--------------------------[ ERROR MESSAGE ]----------------------------
:1:10: fatal error: libpq-fe.h: No such file or directory
compilation terminated.
```

The package requires `libpq` to compile from source, so install the library that matches your system from the list in the error message.

### Timezone environment variable for tidyverse (relevant for WSL2)

One of the R packages, `tidyverse`, might need access to the `TZ` environment variable during installation.
On Ubuntu 20.04 on WSL2 this triggers the following error:

```text
> install.packages('tidyverse')

ERROR: configuration failed for package ‘xml2’
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to create bus connection: Host is down
Warning in system("timedatectl", intern = TRUE) :
  running command 'timedatectl' had status 1
Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]) :
  namespace ‘xml2’ 1.3.1 is already loaded, but >= 1.3.2 is required
Calls: ... namespaceImportFrom -> asNamespace -> loadNamespace
Execution halted
ERROR: lazy loading failed for package ‘tidyverse’
```

This happens because WSL2 does not use the `timedatectl` service, which provides this variable.

```bash
~$ timedatectl
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to create bus connection: Host is down
```

and later

```bash
Warning message:
In system("timedatectl", intern = TRUE) :
  running command 'timedatectl' had status 1
Execution halted
```

This can be amended by setting the environment variable manually before attempting to install `tidyverse`:

```bash
export TZ='Europe/Ljubljana'
```

Note: if this is needed to avoid runtime issues, you need to either define this environment variable in each new terminal window or (better) define it in your `~/.bashrc` or `~/.bash_profile`.

## Possible runtime issues

### Unix end of line characters

Upon running RAPIDS, an error might occur:

```bash
/usr/bin/env: ‘python3\r’: No such file or directory
```

This is due to Windows-style end of line characters.
To amend this, I added a `.gitattributes` file to force `git` to check out `rapids` using Unix EOL characters.
If this still fails, `dos2unix` can be used to convert them.

### System has not been booted with systemd as init system (PID 1)

See [the installation issue above](#timezone-environment-variable-for-tidyverse-relevant-for-wsl2).
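
If the `TZ` workaround is also needed at runtime, defining the variable in `~/.bashrc` can be sketched as below. This is only a minimal sketch and not part of RAPIDS: `persist_tz` is a hypothetical helper name, and the default time zone and startup file are assumptions taken from the example in the installation issue.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of RAPIDS): append an `export TZ=...` line
# to a shell startup file, unless that file already exports TZ.
persist_tz() {
    local tz="${1:-Europe/Ljubljana}"    # assumed default, as in the example above
    local rc_file="${2:-$HOME/.bashrc}"  # assumed default startup file

    # Make sure the file exists so that grep does not fail on a fresh system.
    touch "$rc_file"

    if grep -q '^export TZ=' "$rc_file"; then
        echo "TZ is already set in $rc_file"
    else
        printf "export TZ='%s'\n" "$tz" >> "$rc_file"
        echo "Added TZ=$tz to $rc_file"
    fi
}
```

A possible use would be `persist_tz 'Europe/Ljubljana' ~/.bashrc`, followed by `source ~/.bashrc` in the current terminal. The check for an existing `export TZ=` line keeps repeated calls from adding duplicate entries.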