![GitHub release (latest SemVer)](https://img.shields.io/github/v/release/carissalow/rapids?style=plastic)
[![Snakemake](https://img.shields.io/badge/snakemake-≥5.7.1-brightgreen.svg?style=flat)](https://snakemake.readthedocs.io)
[![Documentation Status](https://github.com/carissalow/rapids/workflows/docs/badge.svg)](https://www.rapids.science/)
![tests](https://github.com/carissalow/rapids/workflows/tests/badge.svg)
[![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-v2.0%20adopted-ff69b4.svg)](code_of_conduct.md)

# RAPIDS

**R**eproducible **A**nalysis **Pi**peline for **D**ata **S**treams

For more information refer to our [documentation](http://www.rapids.science)

By [MoSHI](https://www.moshi.pitt.edu/), [University of Pittsburgh](https://www.pitt.edu/)

## Installation

For RAPIDS installation, refer to the [documentation](https://www.rapids.science/1.8/setup/installation/).

### For the installation of the Docker version

1. Follow the [instructions](https://www.rapids.science/1.8/setup/installation/) to set up RAPIDS via Docker (from scratch).

2. Delete the current contents of the `/rapids/` folder while in a container session.
```
cd ..
rm -rf rapids/{*,.*}
cd rapids
```

3. Clone the RAPIDS workspace from Git and check out a specific branch.
```
git clone "https://repo.ijs.si/junoslukan/rapids.git" .
git checkout <branch_name>
```

4. Install the missing `libpq-dev` dependency with bash.
```
apt-get update -y
apt-get install -y libpq-dev
```

5. Restore the R virtual environment.
Type `R` to start an interactive R session and then run:
```
renv::restore()
```

6. Install the cr-features module from https://repo.ijs.si/matjazbostic/calculatingfeatures.git (branch `master`).
Then follow the "cr-features module" section below.

7. Install all required packages from `environment.yml`. The `--prune` option also removes conda packages that are not listed in the environment file.
```
conda env update --file environment.yml --prune
```

8. If you wish to update your R or Python virtual environments:
```
# R, in an interactive session:
renv::snapshot()
# Python, in a shell:
conda env export --no-builds | sed 's/^.*libgfortran.*$/ - libgfortran/' | sed 's/^.*mkl=.*$/ - mkl/' > environment.yml
```

### cr-features module

This RAPIDS extension uses the cr-features library, accessible [here](https://repo.ijs.si/matjazbostic/calculatingfeatures).

To use the cr-features library:

- Follow the installation instructions in the [README.md](https://repo.ijs.si/matjazbostic/calculatingfeatures/-/blob/master/README.md).

- Copy the built calculatingfeatures folder into the RAPIDS workspace.

- Install the cr-features package by:
```
pip install path/to/the/calculatingfeatures/folder
# e.g., if the folder is copied to the main parent directory:
pip install ./calculatingfeatures
```

The cr-features package has to be built and installed every time to get the newest version.
Alternatively, the newest version of the Docker image must be used.

## Updating RAPIDS

To update RAPIDS, first pull and merge [origin](https://github.com/carissalow/rapids), such as with:

```commandline
git fetch --progress "origin" refs/heads/master
git merge --no-ff origin/master
```

Next, update the conda and R virtual environments. The command below restores the R environment; for conda, use the `conda env update` command from the Docker installation steps.

```bash
R -e 'renv::restore(repos = c(CRAN = "https://packagemanager.rstudio.com/all/__linux__/focal/latest"))'
```

## Custom configuration
### Credentials

As mentioned under [Database in RAPIDS documentation](https://www.rapids.science/1.6/snippets/database/), a `credentials.yaml` file is needed to connect to a database.
It should contain:

```yaml
PSQL_STRAW:
  database: staw
  host: 212.235.208.113
  password: password
  port: 5432
  user: staw_db
```

where the actual `password` needs to be specified as well.

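Before running the pipeline, it can help to confirm that the file contains all the keys listed above. The following is a minimal sketch, not part of RAPIDS: the function name `check_credentials` is hypothetical, and it only checks that each key appears on a line of its own rather than parsing the YAML.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of RAPIDS): verify that a credentials file
# names the PSQL_STRAW group and each of its expected keys.
# This is a plain text check; it does not parse or validate the YAML.
check_credentials() {
    local file="$1" key missing=0
    if [ ! -f "$file" ]; then
        echo "missing file: $file" >&2
        return 1
    fi
    if ! grep -q '^PSQL_STRAW:' "$file"; then
        echo "missing group: PSQL_STRAW" >&2
        missing=1
    fi
    for key in database host password port user; do
        # Each key is expected at the start of a line, optionally indented.
        if ! grep -Eq "^[[:space:]]*${key}:" "$file"; then
            echo "missing key: $key" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_credentials credentials.yaml && echo "credentials.yaml looks complete"` prints the message only when every key is present.
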
## Possible installation issues
### Missing dependencies for RPostgres

When installing the `RPostgres` R package (used to connect to the PostgreSQL database), the following error might occur:

```text
------------------------- ANTICONF ERROR ---------------------------
Configuration failed because libpq was not found. Try installing:
* deb: libpq-dev (Debian, Ubuntu, etc)
* rpm: postgresql-devel (Fedora, EPEL)
* rpm: postgreql8-devel, psstgresql92-devel, postgresql93-devel, or postgresql94-devel (Amazon Linux)
* csw: postgresql_dev (Solaris)
* brew: libpq (OSX)
If libpq is already installed, check that either:
(i) 'pkg-config' is in your PATH AND PKG_CONFIG_PATH contains a libpq.pc file; or
(ii) 'pg_config' is in your PATH.
If neither can detect , you can set INCLUDE_DIR
and LIB_DIR manually via:
R CMD INSTALL --configure-vars='INCLUDE_DIR=... LIB_DIR=...'
--------------------------[ ERROR MESSAGE ]----------------------------
<stdin>:1:10: fatal error: libpq-fe.h: No such file or directory
compilation terminated.
```

`RPostgres` requires `libpq` to compile from source, so install the package listed in the message for your platform (for example, `libpq-dev` on Debian or Ubuntu).

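To see whether `libpq` is already discoverable before retrying the installation, the two checks named in the error message can be run by hand. The sketch below wraps them in a hypothetical function, `has_libpq`, which is not part of RAPIDS or RPostgres. It assumes only that `pkg-config` and `pg_config` behave as they do on a typical Linux system.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report whether libpq can be located the way the
# error message suggests, through pkg-config or through pg_config.
has_libpq() {
    if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists libpq; then
        echo "libpq found via pkg-config"
        return 0
    fi
    if command -v pg_config >/dev/null 2>&1; then
        echo "libpq found via pg_config ($(pg_config --includedir))"
        return 0
    fi
    echo "libpq not found; install libpq-dev (Debian, Ubuntu) or the equivalent" >&2
    return 1
}
```

Running `has_libpq` after installing `libpq-dev` should report one of the two detection routes. If it still fails, `INCLUDE_DIR` and `LIB_DIR` can be set manually, as the error message describes.
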
### Timezone environment variable for tidyverse (relevant for WSL2)

One of the R packages, `tidyverse`, might need access to the `TZ` environment variable during the installation.
On Ubuntu 20.04 on WSL2 this triggers the following error:

```text
> install.packages('tidyverse')

ERROR: configuration failed for package ‘xml2’
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to create bus connection: Host is down
Warning in system("timedatectl", intern = TRUE) :
running command 'timedatectl' had status 1
Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]) :
namespace ‘xml2’ 1.3.1 is already loaded, but >= 1.3.2 is required
Calls: <Anonymous> ... namespaceImportFrom -> asNamespace -> loadNamespace
Execution halted
ERROR: lazy loading failed for package ‘tidyverse’
```

This happens because WSL2 does not use the `timedatectl` service, which provides this variable.

```bash
~$ timedatectl
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to create bus connection: Host is down
```

and later:

```text
Warning message:
In system("timedatectl", intern = TRUE) :
running command 'timedatectl' had status 1
Execution halted
```

This can be amended by setting the environment variable manually before attempting to install `tidyverse`:

```bash
export TZ='Europe/Ljubljana'
```

Note: if this is needed to avoid runtime issues, you need to either define this environment variable in each new terminal window or (better) define it in your `~/.bashrc` or `~/.bash_profile`.

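One way to make the setting permanent is to append the `export` line to the startup file once. The following is a minimal sketch under stated assumptions: `persist_tz` is a hypothetical helper name, and the default target of `~/.bashrc` is a choice made here for illustration. The function adds the line only if that exact line is not already present.

```bash
#!/usr/bin/env bash
# Hypothetical helper: append an `export TZ=...` line to a shell startup file,
# but only if that exact line is not already there.
persist_tz() {
    local tz="$1" rcfile="${2:-$HOME/.bashrc}"
    local line="export TZ='${tz}'"
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    if ! grep -qxF -- "$line" "$rcfile" 2>/dev/null; then
        printf '%s\n' "$line" >> "$rcfile"
    fi
}
```

For example, `persist_tz 'Europe/Ljubljana'` followed by `source ~/.bashrc` makes the variable available in the current and all future terminals. Calling it again does not add a duplicate line.
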
## Possible runtime issues
### Unix end of line characters

Upon running RAPIDS, the following error might occur:

```text
/usr/bin/env: ‘python3\r’: No such file or directory
```

This is due to Windows-style end-of-line characters.
To amend this, I added a `.gitattributes` file to force `git` to check out `rapids` using Unix EOL characters.
If this still fails, `dos2unix` can be used to change them.

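If `dos2unix` is not installed, the same conversion can be approximated with standard tools. The sketch below is illustrative only: `find_crlf` and `strip_cr` are hypothetical helper names, and `strip_cr` removes every carriage return in the file, not only those at line ends, so it should be used on text files only.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list text files under a directory that contain a CR.
# -I skips binary files, -l prints only the names of matching files.
find_crlf() {
    grep -rIl $'\r' "${1:-.}"
}

# Hypothetical helper: strip carriage returns from a file in place.
# Note: this removes every CR byte, not only those at line ends.
strip_cr() {
    local file="$1" tmp status
    tmp="$(mktemp)" || return 1
    # Write back with `cat` (not `mv`) so the file keeps its permissions.
    tr -d '\r' < "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

For example, `find_crlf src` shows which files under `src` are affected, and `strip_cr path/to/script.py` converts a single file while keeping its executable bit.
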
### System has not been booted with systemd as init system (PID 1)

See [the installation issue above](#timezone-environment-variable-for-tidyverse-relevant-for-wsl2).