![GitHub release (latest SemVer)](https://img.shields.io/github/v/release/carissalow/rapids?style=plastic)
[![Snakemake](https://img.shields.io/badge/snakemake-≥5.7.1-brightgreen.svg?style=flat)](https://snakemake.readthedocs.io)
[![Documentation Status](https://github.com/carissalow/rapids/workflows/docs/badge.svg)](https://www.rapids.science/)
![tests](https://github.com/carissalow/rapids/workflows/tests/badge.svg)
[![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-v2.0%20adopted-ff69b4.svg)](code_of_conduct.md)

# RAPIDS
**R**eproducible **A**nalysis **Pi**peline for **D**ata **S**treams

For more information refer to our [documentation](http://www.rapids.science)

By [MoSHI](https://www.moshi.pitt.edu/), [University of Pittsburgh](https://www.pitt.edu/)

## Installation
For RAPIDS installation, refer to the [documentation](https://www.rapids.science/1.8/setup/installation/).

### For the installation of the Docker version
1. Follow the [instructions](https://www.rapids.science/1.8/setup/installation/) to set up RAPIDS via Docker (from scratch).
2. Inside a container session, delete the current contents of the `/rapids/` folder.
```
cd ..
rm -rf rapids/{*,.*}
cd rapids
```
3. Clone the RAPIDS workspace from Git and check out a specific branch.
```
git clone "https://repo.ijs.si/junoslukan/rapids.git" .
git checkout <branch_name>
```
4. Install the missing `libpq-dev` dependency with bash.
```
apt-get update -y
apt-get install -y libpq-dev
```
5. Restore the R virtual environment. Type `R` to start an interactive R session and then run:
```
renv::restore()
```
6. Install the cr-features module from https://repo.ijs.si/matjazbostic/calculatingfeatures.git (branch `master`), then follow the "cr-features module" section below.
7. Install all required packages from `environment.yml`. The `--prune` option also removes conda packages that are not listed in the environment file.
```
conda env update --file environment.yml --prune
```
8. If you wish to update your R or Python virtual environments:
```
# R, in an interactive session:
renv::snapshot()
# Python:
conda env export --no-builds | sed 's/^.*libgfortran.*$/  - libgfortran/' | sed 's/^.*mkl=.*$/  - mkl/' > environment.yml
```
### cr-features module
This RAPIDS extension uses the cr-features library, available [here](https://repo.ijs.si/matjazbostic/calculatingfeatures).

To use the cr-features library:

- Follow the installation instructions in the [README.md](https://repo.ijs.si/matjazbostic/calculatingfeatures/-/blob/master/README.md).
- Copy the built `calculatingfeatures` folder into the RAPIDS workspace.
- Install the cr-features package with:
```
pip install path/to/the/calculatingfeatures/folder
# e.g., if the folder was copied to the main parent directory:
pip install ./calculatingfeatures
```
The cr-features package has to be rebuilt and reinstalled every time to get the newest version. Alternatively, the newest version of the Docker image must be used.

## Updating RAPIDS
To update RAPIDS, first pull and merge [origin](https://github.com/carissalow/rapids), for example with:
```commandline
git fetch --progress "origin" refs/heads/master
git merge --no-ff origin/master
```
Next, update the conda and R virtual environments.
```bash
R -e 'renv::restore(repos = c(CRAN = "https://packagemanager.rstudio.com/all/__linux__/focal/latest"))'
```
## Custom configuration
### Credentials
As mentioned under [Database in RAPIDS documentation](https://www.rapids.science/1.6/snippets/database/), a `credentials.yaml` file is needed to connect to a database.
It should contain:
```yaml
PSQL_STRAW:
  database: staw
  host: 212.235.208.113
  password: password
  port: 5432
  user: staw_db
```
where the actual `password` needs to be filled in as well.
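
As a quick sanity check before running the pipeline, the file can be scanned for the expected keys. The following is a minimal sketch and not part of RAPIDS: the function name `check_credentials` is hypothetical, and it uses a plain `grep` pattern match instead of a real YAML parser, so it only catches missing or empty entries.

```bash
# Hypothetical helper: verify that every expected key is present and non-empty.
# Assumes the indented "key: value" layout shown above; this is not a YAML parser.
check_credentials() {
    local file="${1:-credentials.yaml}" key missing=0
    for key in database host password port user; do
        # Match an indented "key:" followed by at least one non-blank character.
        if ! grep -Eq "^[[:space:]]+${key}:[[:space:]]*[^[:space:]]" "$file"; then
            echo "missing or empty key: ${key}" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling `check_credentials credentials.yaml` returns a non-zero status and names the offending keys if any entry is absent or left blank.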
## Possible installation issues
### Missing dependencies for RPostgres
When installing the `RPostgres` R package (used to connect to the PostgreSQL database), the following error might occur:
```text
------------------------- ANTICONF ERROR ---------------------------
Configuration failed because libpq was not found. Try installing:
* deb: libpq-dev (Debian, Ubuntu, etc)
* rpm: postgresql-devel (Fedora, EPEL)
* rpm: postgreql8-devel, psstgresql92-devel, postgresql93-devel, or postgresql94-devel (Amazon Linux)
* csw: postgresql_dev (Solaris)
* brew: libpq (OSX)
If libpq is already installed, check that either:
(i) 'pkg-config' is in your PATH AND PKG_CONFIG_PATH contains a libpq.pc file; or
(ii) 'pg_config' is in your PATH.
If neither can detect , you can set INCLUDE_DIR
and LIB_DIR manually via:
R CMD INSTALL --configure-vars='INCLUDE_DIR=... LIB_DIR=...'
--------------------------[ ERROR MESSAGE ]----------------------------
<stdin>:1:10: fatal error: libpq-fe.h: No such file or directory
compilation terminated.
```
The library requires `libpq` for compiling from source, so install accordingly.
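
To see whether `libpq` is already discoverable, the two conditions listed in the error message can be tested directly. This is a minimal sketch; the function name `libpq_available` is hypothetical, and the suggested `apt-get` command assumes a Debian or Ubuntu system such as the Docker image used above.

```bash
# Hypothetical helper: succeed if libpq can be located in either of the two
# ways named in the error message.
libpq_available() {
    # (i) pkg-config is on PATH and knows about libpq
    if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists libpq; then
        return 0
    fi
    # (ii) pg_config is on PATH
    command -v pg_config >/dev/null 2>&1
}

if libpq_available; then
    echo "libpq found"
else
    echo "libpq not found; on Debian/Ubuntu try: apt-get install -y libpq-dev"
fi
```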
### Timezone environment variable for tidyverse (relevant for WSL2)
One of the R packages, `tidyverse`, might need access to the `TZ` environment variable during the installation.
On Ubuntu 20.04 on WSL2 this triggers the following error:
```text
> install.packages('tidyverse')
ERROR: configuration failed for package ‘xml2’
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to create bus connection: Host is down
Warning in system("timedatectl", intern = TRUE) :
running command 'timedatectl' had status 1
Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]) :
namespace ‘xml2’ 1.3.1 is already loaded, but >= 1.3.2 is required
Calls: <Anonymous> ... namespaceImportFrom -> asNamespace -> loadNamespace
Execution halted
ERROR: lazy loading failed for package ‘tidyverse’
```
This happens because WSL2 does not use the `timedatectl` service, which provides this variable.
```bash
~$ timedatectl
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to create bus connection: Host is down
```
and later
```bash
Warning message:
In system("timedatectl", intern = TRUE) :
running command 'timedatectl' had status 1
Execution halted
```
This can be amended by setting the environment variable manually before attempting to install `tidyverse`:
```bash
export TZ='Europe/Ljubljana'
```
Note: if this is needed to avoid runtime issues, you need to either define this environment variable in each new terminal window or (better) define it in your `~/.bashrc` or `~/.bash_profile`.

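One way to make the setting persistent without duplicating it on repeated runs is sketched below. The function name `persist_tz` is hypothetical; it appends the `export` line to the given startup file (by default `~/.bashrc`) only if that exact line is not already there.

```bash
# Hypothetical helper: add "export TZ='<zone>'" to a shell startup file once.
persist_tz() {
    local tz="$1" rcfile="${2:-$HOME/.bashrc}"
    local line="export TZ='$tz'"
    # -x matches whole lines and -F treats the text literally, so the line is
    # appended only when an identical one is not already present.
    grep -qxF "$line" "$rcfile" 2>/dev/null || printf '%s\n' "$line" >> "$rcfile"
}
```

For example, `persist_tz 'Europe/Ljubljana'` writes the line to `~/.bashrc`; the setting takes effect in newly opened terminals.
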
## Possible runtime issues
### Unix end of line characters
When running RAPIDS, the following error might occur:
```bash
/usr/bin/env: ‘python3\r’: No such file or directory
```
This is due to Windows style end of line characters.
To amend this, I added a `.gitattributes` file to force `git` to check out `rapids` using Unix EOL characters.
If this still fails, `dos2unix` can be used to change them.
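
Where `dos2unix` is not installed, the same conversion can be approximated with standard tools. This is a minimal sketch under the assumption that the affected files are plain-text scripts, where carriage returns only appear at line ends; the function name `crlf_to_lf` is hypothetical.

```bash
# Hypothetical helper: rewrite a text file in place with Unix (LF) line endings.
crlf_to_lf() {
    local file="$1" tmp
    tmp="$(mktemp)" || return 1
    # Drop every carriage return, then write the result back into the original
    # file so that its permissions (e.g. the executable bit) are kept.
    tr -d '\r' < "$file" > "$tmp" && cat "$tmp" > "$file"
    local status=$?
    rm -f "$tmp"
    return "$status"
}
```

It is called with the path of the script named in the error message, e.g. `crlf_to_lf path/to/script.py`.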
### System has not been booted with systemd as init system (PID 1)
See [the installation issue above](#timezone-environment-variable-for-tidyverse-relevant-for-wsl2).