# STRAW2analysis
All analysis for the STRAW project.
To install:
1. Create a conda virtual environment from the `environment.yml` file.
```shell
cd config
conda env create --file environment.yml
conda activate straw2analysis
```
If you have already created this environment, you can update it using:
```shell
conda deactivate
conda env update --file environment.yml
conda activate straw2analysis
```
To use this environment in the Jupyter notebooks under `./exploration/`,
you can select it under Kernel > Change kernel after running:
```shell
ipython kernel install --user --name=straw2analysis
```
2. Provide a file called `.env`, to be read by `python-dotenv`. Place it in the top folder of the application; it should have the form:
```
DB_PASSWORD=database-password
```
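The file can also be created from the command line. This is a minimal sketch that assumes it is run from the top folder of the repository. The `chmod` step is a suggested precaution, not something the project requires, and `database-password` remains a placeholder to be replaced with the real value.

```shell
# Create .env only if it does not exist yet, so an existing file is not overwritten.
if [ ! -e .env ]; then
    printf 'DB_PASSWORD=%s\n' 'database-password' > .env
fi

# Suggested precaution: make the file readable and writable by the current user only.
chmod 600 .env
```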
# RAPIDS
To install RAPIDS, follow the [instructions on their webpage](https://www.rapids.science/1.6/setup/installation/).
Here, I include additional information related to the installation and specific to the STRAW2analysis project.
The installation was tested on Windows using Ubuntu 20.04 on Windows Subsystem for Linux ([WSL2](https://docs.microsoft.com/en-us/windows/wsl/install)).
## Custom configuration
### Credentials
As mentioned under [Database in the RAPIDS documentation](https://www.rapids.science/1.6/snippets/database/), a `credentials.yaml` file is needed to connect to a database.
It should contain:
```yaml
PSQL_STRAW:
  database: staw
  host: 212.235.208.113
  password: password
  port: 5432
  user: staw_db
```
where `password` must be replaced with the actual database password.
## Possible installation issues
### Missing dependencies for RPostgres
When installing the `RPostgres` R package (used to connect to the PostgreSQL database), the following error might occur:
```text
------------------------- ANTICONF ERROR ---------------------------
Configuration failed because libpq was not found. Try installing:
* deb: libpq-dev (Debian, Ubuntu, etc)
* rpm: postgresql-devel (Fedora, EPEL)
* rpm: postgreql8-devel, psstgresql92-devel, postgresql93-devel, or postgresql94-devel (Amazon Linux)
* csw: postgresql_dev (Solaris)
* brew: libpq (OSX)
If libpq is already installed, check that either:
(i) 'pkg-config' is in your PATH AND PKG_CONFIG_PATH contains a libpq.pc file; or
(ii) 'pg_config' is in your PATH.
If neither can detect , you can set INCLUDE_DIR
and LIB_DIR manually via:
R CMD INSTALL --configure-vars='INCLUDE_DIR=... LIB_DIR=...'
--------------------------[ ERROR MESSAGE ]----------------------------
<stdin>:1:10: fatal error: libpq-fe.h: No such file or directory
compilation terminated.
```
`RPostgres` needs the `libpq` development files to compile from source, so install the package that the message lists for your system (`libpq-dev` on Ubuntu).
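Before retrying the installation, it can help to check whether `libpq` is discoverable through either of the two routes the error message names. The following is a minimal sketch and not part of the RAPIDS setup. The variable name `libpq_status` is only illustrative, and the suggested `apt-get` command assumes a Debian-based system such as Ubuntu on WSL2.

```shell
# Check the two detection routes named in the error message:
# 'pg_config' in PATH, or 'pkg-config' knowing about libpq.
if command -v pg_config >/dev/null 2>&1; then
    libpq_status="found via pg_config"
elif command -v pkg-config >/dev/null 2>&1 && pkg-config --exists libpq; then
    libpq_status="found via pkg-config"
else
    # Assumption: a Debian-based system (package name taken from the error message).
    libpq_status="not found; try: sudo apt-get install libpq-dev"
fi
echo "libpq: $libpq_status"
```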
### Timezone environment variable for tidyverse (relevant for WSL2)
One of the R packages, `tidyverse`, might need access to the `TZ` environment variable during the installation.
On Ubuntu 20.04 on WSL2 this triggers the following error:
```text
> install.packages('tidyverse')
ERROR: configuration failed for package ‘xml2’
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to create bus connection: Host is down
Warning in system("timedatectl", intern = TRUE) :
running command 'timedatectl' had status 1
Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]) :
namespace ‘xml2’ 1.3.1 is already loaded, but >= 1.3.2 is required
Calls: <Anonymous> ... namespaceImportFrom -> asNamespace -> loadNamespace
Execution halted
ERROR: lazy loading failed for package ‘tidyverse’
```
This happens because WSL2 is not booted with systemd, so the `timedatectl` command, which R calls to determine the time zone when `TZ` is not set, cannot run:
```bash
~$ timedatectl
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to create bus connection: Host is down
```
This can be amended by setting the environment variable manually before attempting to install `tidyverse`:
```bash
export TZ='Europe/Ljubljana'
```
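Note that the variable has to be exported for R to see it, because R runs as a child process of the shell. A minimal sketch of setting and checking it follows. The `sh -c` call merely stands in for any child process, and `seen_by_child` is an illustrative name.

```shell
# Export TZ so that child processes (such as R) inherit it.
export TZ='Europe/Ljubljana'

# A child shell stands in for R here and reports the value it inherited.
seen_by_child="$(sh -c 'printf "%s" "$TZ"')"
echo "TZ seen by child process: $seen_by_child"
```

To keep the setting across sessions, the `export` line can be appended to `~/.bashrc`.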
## Possible runtime issues
### Unix end of line characters
Upon running RAPIDS, the following error might occur:
```bash
/usr/bin/env: ‘python3\r’: No such file or directory
```
This is due to Windows-style end-of-line characters (CRLF) in the checked-out files.
To amend this, I added a `.gitattributes` file to force `git` to check out `rapids` using Unix EOL characters.
If this still fails, `dos2unix` can be used to convert them.
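To see what the conversion does, the sketch below builds a throwaway file with Windows line endings, counts the affected lines, and strips the carriage returns with `tr`, which has the same effect as `dos2unix` on such a file. The temporary file and the variable names are illustrative only; on a real checkout, the conversion would be applied to the affected RAPIDS scripts.

```shell
# Create a throwaway script with Windows (CRLF) line endings for demonstration.
demo_file="$(mktemp)"
printf '#!/usr/bin/env python3\r\nprint("ok")\r\n' > "$demo_file"

# Count lines containing a carriage return (non-zero means CRLF is present).
cr="$(printf '\r')"
crlf_before="$(grep -c "$cr" "$demo_file" || true)"

# Strip the carriage returns; `dos2unix "$demo_file"` does the same if installed.
tr -d '\r' < "$demo_file" > "$demo_file.unix" && mv "$demo_file.unix" "$demo_file"

crlf_after="$(grep -c "$cr" "$demo_file" || true)"
echo "CRLF lines before: $crlf_before, after: $crlf_after"
```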