stress_at_work_analysis/README.md

162 lines
5.0 KiB
Markdown
Raw Normal View History

2020-12-23 16:18:03 +01:00
# STRAW2analysis
2020-12-23 17:16:11 +01:00
All analysis for the STRAW project.
To install:
1. Create a conda virtual environment from the `environment.yml` file.
```shell
cd config
conda env create --file environment.yml
conda activate straw2analysis
```
If you have already created this environment, you can update it using:
```shell
conda deactivate
conda env update --file environment.yml
conda activate straw2analysis
```
To use this environment in the Jupyter notebooks under `./exploration/`,
you can select it under Kernel > Change kernel after running:
```shell
ipython kernel install --user --name=straw2analysis
```
2. Provide a file called `.env` to be used by `python-dotenv` which should be placed in the top folder of the application
and should have the form:
```
DB_PASSWORD=database-password
2021-11-18 17:22:08 +01:00
```
# RAPIDS
To install RAPIDS, follow the [instructions on their webpage](https://www.rapids.science/1.6/setup/installation/).
Here, I include additional information related to the installation and specific to the STRAW2analysis project.
The installation was tested on Windows using Ubuntu 20.04 on Windows Subsystem for Linux ([WSL2](https://docs.microsoft.com/en-us/windows/wsl/install)).
## Custom configuration
### Credentials
As mentioned under [Database in RAPIDS documentation](https://www.rapids.science/1.6/snippets/database/), a `credentials.yaml` file is needed to connect to a database.
It should contain:
```yaml
PSQL_STRAW:
database: staw
host: 212.235.208.113
password: password
port: 5432
user: staw_db
```
where`password` needs to be specified as well.
## Possible installation issues
### Missing dependencies for RPostgres
To install `RPostgres` R package (used to connect to the PostgreSQL database), an error might occur:
```text
------------------------- ANTICONF ERROR ---------------------------
Configuration failed because libpq was not found. Try installing:
* deb: libpq-dev (Debian, Ubuntu, etc)
* rpm: postgresql-devel (Fedora, EPEL)
* rpm: postgreql8-devel, psstgresql92-devel, postgresql93-devel, or postgresql94-devel (Amazon Linux)
* csw: postgresql_dev (Solaris)
* brew: libpq (OSX)
If libpq is already installed, check that either:
(i) 'pkg-config' is in your PATH AND PKG_CONFIG_PATH contains a libpq.pc file; or
(ii) 'pg_config' is in your PATH.
If neither can detect , you can set INCLUDE_DIR
and LIB_DIR manually via:
R CMD INSTALL --configure-vars='INCLUDE_DIR=... LIB_DIR=...'
--------------------------[ ERROR MESSAGE ]----------------------------
<stdin>:1:10: fatal error: libpq-fe.h: No such file or directory
compilation terminated.
```
The library requires `libpq` for compiling from source, so install accordingly.
### Timezone environment variable for tidyverse (relevant for WSL2)
One of the R packages, `tidyverse` might need access to the `TZ` environment variable during the installation.
On Ubuntu 20.04 on WSL2 this triggers the following error:
```text
> install.packages('tidyverse')
ERROR: configuration failed for package xml2
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to create bus connection: Host is down
Warning in system("timedatectl", intern = TRUE) :
running command 'timedatectl' had status 1
Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]) :
namespace xml2 1.3.1 is already loaded, but >= 1.3.2 is required
Calls: <Anonymous> ... namespaceImportFrom -> asNamespace -> loadNamespace
Execution halted
ERROR: lazy loading failed for package tidyverse
```
This happens because WSL2 does not use the `timedatectl` service, which provides this variable.
```bash
~$ timedatectl
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to create bus connection: Host is down
```
2021-11-29 18:37:50 +01:00
and later
```bash
Warning message:
In system("timedatectl", intern = TRUE) :
running command 'timedatectl' had status 1
Execution halted
```
2021-11-18 17:22:08 +01:00
This can be amended by setting the environment variable manually before attempting to install `tidyverse`:
```bash
export TZ='Europe/Ljubljana'
2021-11-18 17:22:08 +01:00
```
## Possible runtime issues
### Unix end of line characters
Upon running rapids, an error might occur:
```bash
/usr/bin/env: python3\r: No such file or directory
```
This is due to Windows style end of line characters.
To amend this, I added a `.gitattributes` files to force `git` to checkout `rapids` using Unix EOL characters.
If this still fails, `dos2unix` can be used to change them.
2021-11-29 18:37:50 +01:00
### System has not been booted with systemd as init system (PID 1)
2021-12-15 20:21:59 +01:00
See [the installation issue above](#Timezone-environment-variable-for-tidyverse-(relevant-for-WSL2)).
## Update RAPIDS
To update RAPIDS, first pull and merge [origin]( https://github.com/carissalow/rapids), such as with:
```commandline
git fetch --progress "origin" refs/heads/master
git merge --no-ff origin/master
```
Next, update the conda and R virtual environment.
```bash
R -e 'renv::restore(repos = c(CRAN = "https://packagemanager.rstudio.com/all/__linux__/focal/latest"))'
```