# STRAW2analysis

All analysis for the STRAW project.

To install:

1. Create a conda virtual environment from the `environment.yml` file.

```shell
cd config
conda env create --file environment.yml
conda activate straw2analysis
```

If you have already created this environment, you can update it using:

```shell
conda deactivate
conda env update --file environment.yml
conda activate straw2analysis
```

To use this environment in the Jupyter notebooks under `./exploration/`,
you can select it under Kernel > Change kernel after running:

```shell
ipython kernel install --user --name=straw2analysis
```
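
To confirm that the kernel was registered, the list of installed kernel specs can be inspected. The snippet below is a minimal sketch and not part of the original setup. It assumes that `jupyter` is available on the `PATH` of the activated environment, and the variable names are only illustrative.

```shell
# Name passed to `ipython kernel install --name=...` above.
kernel_name="straw2analysis"

if ! command -v jupyter >/dev/null 2>&1; then
    kernel_status="jupyter not found"
elif jupyter kernelspec list 2>/dev/null | grep -q "$kernel_name"; then
    kernel_status="registered"
else
    kernel_status="not registered"
fi
echo "kernel '$kernel_name': $kernel_status"
```

If the status is `not registered`, rerunning the install command inside the activated `straw2analysis` environment is a reasonable first step.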

2. Provide a file called `.env`, which is read by `python-dotenv`. Place it in the top folder of the application.
It should have the form:

```
DB_PASSWORD=database-password
```
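
As a minimal sketch, the file can also be created from the shell. This assumes the commands are run from the top folder of the repository. The `chmod` call is an optional precaution that is not part of the original instructions, and `database-password` is a placeholder to be replaced.

```shell
# Hypothetical helper; run from the top folder of the repository.
env_file=".env"

# Do not overwrite an existing file.
if [ ! -e "$env_file" ]; then
    # Placeholder value: replace it with the real database password.
    printf 'DB_PASSWORD=%s\n' 'database-password' > "$env_file"
    chmod 600 "$env_file"   # optional: readable and writable by the owner only
fi

# Print the number of lines that define DB_PASSWORD.
grep -c '^DB_PASSWORD=' "$env_file"
```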

# RAPIDS

To install RAPIDS, follow the [instructions on their webpage](https://www.rapids.science/1.6/setup/installation/).

Here, I include additional information related to the installation and specific to the STRAW2analysis project.
The installation was tested on Windows using Ubuntu 20.04 on Windows Subsystem for Linux ([WSL2](https://docs.microsoft.com/en-us/windows/wsl/install)).

## Custom configuration
### Credentials

As mentioned under [Database in RAPIDS documentation](https://www.rapids.science/1.6/snippets/database/), a `credentials.yaml` file is needed to connect to a database.
It should contain:

```yaml
PSQL_STRAW:
  database: staw
  host: 212.235.208.113
  password: password
  port: 5432
  user: staw_db
```

where the `password` placeholder must be replaced with the actual database password.
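
A quick way to catch a missing entry is to check that all five keys appear in the file. The function below is a hypothetical helper, not part of RAPIDS. It assumes the keys are indented under the `PSQL_STRAW` group as shown above, and that it is run from the folder containing `credentials.yaml`.

```shell
# Hypothetical helper: report which of the expected keys are absent from a file.
check_credentials() {
    cred_file="$1"
    missing=""
    for key in database host password port user; do
        # Each key is expected on its own indented line below the group name.
        grep -Eq "^[[:space:]]+${key}:" "$cred_file" || missing="$missing $key"
    done
    if [ -z "$missing" ]; then
        echo "all keys present"
    else
        echo "missing keys:$missing"
        return 1
    fi
}

# Usage: only run the check if the file exists in the current folder.
if [ -f credentials.yaml ]; then
    check_credentials credentials.yaml
fi
```

This only checks that the keys exist. It does not validate the values or test the database connection.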

## Possible installation issues
### Missing dependencies for RPostgres

When installing the `RPostgres` R package (used to connect to the PostgreSQL database), the following error might occur:

```text
------------------------- ANTICONF ERROR ---------------------------
Configuration failed because libpq was not found. Try installing:
* deb: libpq-dev (Debian, Ubuntu, etc)
* rpm: postgresql-devel (Fedora, EPEL)
* rpm: postgreql8-devel, psstgresql92-devel, postgresql93-devel, or postgresql94-devel (Amazon Linux)
* csw: postgresql_dev (Solaris)
* brew: libpq (OSX)
If libpq is already installed, check that either:
(i) 'pkg-config' is in your PATH AND PKG_CONFIG_PATH contains a libpq.pc file; or
(ii) 'pg_config' is in your PATH.
If neither can detect , you can set INCLUDE_DIR
and LIB_DIR manually via:
R CMD INSTALL --configure-vars='INCLUDE_DIR=... LIB_DIR=...'
--------------------------[ ERROR MESSAGE ]----------------------------
<stdin>:1:10: fatal error: libpq-fe.h: No such file or directory
compilation terminated.
```

The package requires `libpq` to compile from source, so install the package listed for your system in the message above (for example, `libpq-dev` on Ubuntu).
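
Before retrying the installation, it can help to check whether `libpq` is discoverable through either of the two routes named in the error message. The following is a minimal sketch; the `libpq_status` variable is only illustrative.

```shell
# Check the two detection routes named in the error message.
if command -v pg_config >/dev/null 2>&1; then
    libpq_status="found via pg_config (headers in $(pg_config --includedir))"
elif command -v pkg-config >/dev/null 2>&1 && pkg-config --exists libpq; then
    libpq_status="found via pkg-config"
else
    libpq_status="not found"
    # On Ubuntu (the tested setup), the package named in the error message
    # can be installed with:
    #   sudo apt-get install libpq-dev
fi
echo "libpq: $libpq_status"
```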

### Timezone environment variable for tidyverse (relevant for WSL2)

One of the R packages, `tidyverse`, might need access to the `TZ` environment variable during the installation.
On Ubuntu 20.04 on WSL2 this triggers the following error:

```text
> install.packages('tidyverse')

ERROR: configuration failed for package ‘xml2’
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to create bus connection: Host is down
Warning in system("timedatectl", intern = TRUE) :
running command 'timedatectl' had status 1
Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]) :
namespace ‘xml2’ 1.3.1 is already loaded, but >= 1.3.2 is required
Calls: <Anonymous> ... namespaceImportFrom -> asNamespace -> loadNamespace
Execution halted
ERROR: lazy loading failed for package ‘tidyverse’
```

This happens because WSL2 is not booted with systemd in this setup, so the `timedatectl` command, which R calls to determine the time zone when `TZ` is not set, fails:

```bash
~$ timedatectl
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to create bus connection: Host is down
```

and, later during the installation:

```bash
Warning message:
In system("timedatectl", intern = TRUE) :
running command 'timedatectl' had status 1
Execution halted
```

This can be amended by setting the environment variable manually before attempting to install `tidyverse`:

```bash
export TZ='Europe/Ljubljana'
```
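
A slightly more defensive variant, sketched below, keeps an existing `TZ` value and then confirms that child processes (such as R) can see it. The `Rscript` check is optional and only runs if R is installed.

```shell
# Keep an existing TZ value; otherwise fall back to the zone used above.
export TZ="${TZ:-Europe/Ljubljana}"

# Exported variables are inherited by child processes, which is how R sees the value.
sh -c 'echo "TZ seen by child process: $TZ"'

# Optional: ask R directly, if it is installed.
if command -v Rscript >/dev/null 2>&1; then
    Rscript -e 'cat(Sys.getenv("TZ"), "\n")'
fi
```

The `export` only lasts for the current shell session. To make it persistent, the same line could be added to a shell startup file such as `~/.bashrc`.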

## Possible runtime issues
### Unix end of line characters

Upon running RAPIDS, an error might occur:

```bash
/usr/bin/env: ‘python3\r’: No such file or directory
```

This is due to Windows-style (CRLF) end-of-line characters.
To amend this, I added a `.gitattributes` file to force `git` to check out `rapids` using Unix EOL characters.
If this still fails, `dos2unix` can be used to change them.
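
The conversion can be sketched as follows. The example works on a throwaway file, so nothing in the repository is touched. The `sed` fallback is my addition for systems where `dos2unix` is not installed, and it assumes GNU `sed` (as on Ubuntu).

```shell
# Work on a throwaway file so that nothing in the repository is modified.
tmp_file="$(mktemp)"
printf '#!/usr/bin/env python3\r\nprint("ok")\r\n' > "$tmp_file"

# Count the lines that end in a carriage return (Windows-style EOL).
cr="$(printf '\r')"
crlf_before="$(grep -c "${cr}\$" "$tmp_file" || true)"

# Convert in place: prefer dos2unix, fall back to GNU sed if it is missing.
if command -v dos2unix >/dev/null 2>&1; then
    dos2unix "$tmp_file"
else
    sed -i 's/\r$//' "$tmp_file"
fi

crlf_after="$(grep -c "${cr}\$" "$tmp_file" || true)"
echo "CRLF lines before: $crlf_before, after: $crlf_after"
rm -f "$tmp_file"
```

For a real file, replace `$tmp_file` with the path of the script named in the error message.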

### System has not been booted with systemd as init system (PID 1)

See [the installation issue above](#Timezone-environment-variable-for-tidyverse-(relevant-for-WSL2)).

## Update RAPIDS

To update RAPIDS, first fetch and merge [origin](https://github.com/carissalow/rapids), for example with:

```commandline
git fetch --progress "origin" refs/heads/master
git merge --no-ff origin/master
```

Next, update the conda and R virtual environments. The R environment can be restored with:

```bash
R -e 'renv::restore(repos = c(CRAN = "https://packagemanager.rstudio.com/all/__linux__/focal/latest"))'
```
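
For the conda side, a sketch along the following lines may work. It reuses the `conda env update` command shown in the STRAW2analysis installation steps. The environment name `rapids` and the presence of `environment.yml` in the current folder are assumptions, so adjust both to your setup.

```shell
# Assumptions: run from the RAPIDS root folder; the environment name is hypothetical.
env_name="rapids"

if ! command -v conda >/dev/null 2>&1; then
    conda_status="conda not found"
elif [ ! -f environment.yml ]; then
    conda_status="environment.yml not found in $(pwd)"
elif conda env update --name "$env_name" --file environment.yml; then
    conda_status="updated"
else
    conda_status="update failed"
fi
echo "conda environment '$env_name': $conda_status"
```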