All analysis for the STRAW project.
Create a conda virtual environment from the environment.yml file in the config folder:
cd config
conda env create --file environment.yml
conda activate straw2analysis
If you have already created this environment, you can update it using:
conda deactivate
conda env update --file environment.yml
conda activate straw2analysis
To use this environment in the Jupyter notebooks under
./exploration/, you can select it under Kernel > Change kernel after running:
ipython kernel install --user --name=straw2analysis
Provide a file called .env to be used by python-dotenv, which should be placed in the top folder of the application and should have the form:
To install RAPIDS, follow the instructions on their webpage.
Here, I include additional information related to the installation and specific to the STRAW2analysis project. The installation was tested on Windows using Ubuntu 20.04 on Windows Subsystem for Linux (WSL2).
As mentioned under Database in the RAPIDS documentation, a credentials.yaml file is needed to connect to a database.
It should contain:
PSQL_STRAW:
  database: staw
  host: 220.127.116.11
  password: password
  port: 5432
  user: staw_db
Here, the password needs to be specified as well.
Possible installation issues
Missing dependencies for RPostgres
When installing the RPostgres R package (used to connect to the PostgreSQL database), an error might occur:
------------------------- ANTICONF ERROR ---------------------------
Configuration failed because libpq was not found. Try installing:
 * deb: libpq-dev (Debian, Ubuntu, etc)
 * rpm: postgresql-devel (Fedora, EPEL)
 * rpm: postgreql8-devel, psstgresql92-devel, postgresql93-devel, or postgresql94-devel (Amazon Linux)
 * csw: postgresql_dev (Solaris)
 * brew: libpq (OSX)
If libpq is already installed, check that either:
(i) 'pkg-config' is in your PATH AND PKG_CONFIG_PATH contains a libpq.pc file; or
(ii) 'pg_config' is in your PATH.
If neither can detect , you can set INCLUDE_DIR and LIB_DIR manually via:
R CMD INSTALL --configure-vars='INCLUDE_DIR=... LIB_DIR=...'
--------------------------[ ERROR MESSAGE ]----------------------------
<stdin>:1:10: fatal error: libpq-fe.h: No such file or directory
compilation terminated.
RPostgres requires libpq to compile from source, so install the package listed in the error message for your system (libpq-dev on Ubuntu).
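A quick way to see whether libpq is visible before retrying the installation is to run the same two checks that the error message describes. The following is a minimal sketch, not part of the project: the function name libpq_available and the variable libpq_status are made up for illustration, and the suggested package name is the one the error message lists for Debian and Ubuntu.

```shell
# Return success if libpq can be located through either route
# named in the ANTICONF error message: pkg-config or pg_config.
libpq_available() {
    if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists libpq; then
        return 0
    fi
    command -v pg_config >/dev/null 2>&1
}

# libpq_status is a hypothetical variable used only to report the result.
if libpq_available; then
    libpq_status="libpq found"
else
    libpq_status="libpq not found; on Ubuntu, try: sudo apt install libpq-dev"
fi
echo "$libpq_status"
```

If the check reports that libpq is missing, install the package and then repeat the RPostgres installation in R.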
Timezone environment variable for tidyverse (relevant for WSL2)
One of the R packages, tidyverse, might need access to the TZ environment variable during the installation.
On Ubuntu 20.04 on WSL2 this triggers the following error:
> install.packages('tidyverse')
ERROR: configuration failed for package ‘xml2’
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to create bus connection: Host is down
Warning in system("timedatectl", intern = TRUE) :
  running command 'timedatectl' had status 1
Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]) :
  namespace ‘xml2’ 1.3.1 is already loaded, but >= 1.3.2 is required
Calls: <Anonymous> ... namespaceImportFrom -> asNamespace -> loadNamespace
Execution halted
ERROR: lazy loading failed for package ‘tidyverse’
This happens because WSL2 does not use the
timedatectl service, which provides this variable.
~$ timedatectl
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to create bus connection: Host is down

Warning message:
In system("timedatectl", intern = TRUE) :
  running command 'timedatectl' had status 1
Execution halted
This can be amended by setting the TZ environment variable manually before attempting to install tidyverse.
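As a sketch of what setting the variable might look like in the shell session that later starts R: the value Europe/Ljubljana is only an assumed example, so substitute the tz database name that applies to you.

```shell
# Set the timezone for this shell session.
# 'Europe/Ljubljana' is an example value, not a project requirement.
export TZ='Europe/Ljubljana'

# Confirm that the variable is set before starting R.
echo "TZ=$TZ"

# Then, from this same session, start R and run:
#   install.packages('tidyverse')
```

The variable set this way only lasts for the current session. To keep it, the export line can be added to a shell startup file such as ~/.bashrc.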
Possible runtime issues
Unix end of line characters
Upon running rapids, an error might occur:
/usr/bin/env: ‘python3\r’: No such file or directory
This is due to Windows style end of line characters.
To amend this, I added a .gitattributes file to force git to check out rapids using Unix EOL characters.
If this still fails,
dos2unix can be used to change them.
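The conversion could look roughly like the sketch below. The helper name to_unix_eol and the temporary demo file are made up for illustration. The sed fallback is a swapped-in alternative for machines without dos2unix and assumes GNU sed, as shipped with Ubuntu.

```shell
# Convert one file to Unix line endings in place.
# Uses dos2unix when it is installed; otherwise falls back to GNU sed.
to_unix_eol() {
    if command -v dos2unix >/dev/null 2>&1; then
        dos2unix "$1"
    else
        sed -i 's/\r$//' "$1"
    fi
}

# Demonstration on a throwaway file with Windows line endings.
demo=$(mktemp)
printf '#!/usr/bin/env python3\r\nprint("ok")\r\n' > "$demo"
to_unix_eol "$demo"

# The shebang line no longer ends in a carriage return.
head -n 1 "$demo"
```

In practice, the helper would be pointed at whichever script the error message names instead of the demo file.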
System has not been booted with systemd as init system (PID 1)
To update RAPIDS, first pull and merge origin, such as with:
git fetch --progress "origin" refs/heads/master
git merge --no-ff origin/master
Next, update the conda and R virtual environments.
R -e 'renv::restore(repos = c(CRAN = "https://packagemanager.rstudio.com/all/__linux__/focal/latest"))'