STRAW2analysis

All analysis for the STRAW project.

To install:

  1. Create a conda virtual environment from the environment.yml file.

    cd config
    conda env create --file environment.yml
    conda activate straw2analysis
    

    If you have already created this environment, you can update it using:

    conda deactivate
    conda env update --file environment.yml
    conda activate straw2analysis
    

    To use this environment in the Jupyter notebooks under ./exploration/, you can select it under Kernel > Change kernel after running:

    ipython kernel install --user --name=straw2analysis
    
  2. Provide a file called .env in the top folder of the application, to be read by python-dotenv. It should have the form:

    DB_PASSWORD=database-password
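
    To confirm that the file is in place and contains the expected key, the value can be read back. The sketch below is a deliberately simplified stand-in that uses only the standard library. It is not python-dotenv's parser and only handles plain KEY=value lines. The function name `read_env_value` is hypothetical and not part of this project.

```python
from pathlib import Path
from typing import Optional


def read_env_value(env_path: Path, key: str) -> Optional[str]:
    """Return the value for `key` from a simple KEY=value file, or None."""
    if not env_path.is_file():
        return None
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            # Drop optional surrounding quotes around the value.
            return value.strip().strip("'\"")
    return None


if __name__ == "__main__":
    # Assumes the script is run from the top folder of the application.
    found = read_env_value(Path(".env"), "DB_PASSWORD") is not None
    print("DB_PASSWORD is set" if found else "DB_PASSWORD is missing")
```

    The example only reports whether the key is present, so the password itself is never printed.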
    

RAPIDS

To install RAPIDS, follow the instructions on their webpage.

Here, I include additional information related to the installation and specific to the STRAW2analysis project. The installation was tested on Windows using Ubuntu 20.04 on Windows Subsystem for Linux (WSL2).

Custom configuration

Credentials

As mentioned under Database in the RAPIDS documentation, a credentials.yaml file is needed to connect to a database. It should contain:

PSQL_STRAW:
  database: staw
  host: 212.235.208.113
  password: password
  port: 5432
  user: staw_db

where password needs to be replaced with the actual database password.

Possible installation issues

Missing dependencies for RPostgres

When installing the RPostgres R package (used to connect to the PostgreSQL database), the following error might occur:

------------------------- ANTICONF ERROR ---------------------------
Configuration failed because libpq was not found. Try installing:
   * deb: libpq-dev (Debian, Ubuntu, etc)
   * rpm: postgresql-devel (Fedora, EPEL)
   * rpm: postgreql8-devel, psstgresql92-devel, postgresql93-devel, or postgresql94-devel (Amazon Linux)
   * csw: postgresql_dev (Solaris)
   * brew: libpq (OSX)
If libpq is already installed, check that either:
  (i)  'pkg-config' is in your PATH AND PKG_CONFIG_PATH contains a libpq.pc file; or
  (ii) 'pg_config' is in your PATH.
If neither can detect , you can set INCLUDE_DIR
and LIB_DIR manually via:
  R CMD INSTALL --configure-vars='INCLUDE_DIR=... LIB_DIR=...'
--------------------------[ ERROR MESSAGE ]----------------------------
  <stdin>:1:10: fatal error: libpq-fe.h: No such file or directory
compilation terminated.

RPostgres requires libpq to compile from source, so install the package that the message lists for your system (libpq-dev on Ubuntu).
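
The two detection routes named in the error message can be checked before retrying the installation. The following is a minimal sketch under the assumption that Python 3 is available. The function name `libpq_detection_status` is hypothetical, and the script only reports what it finds without installing anything.

```python
import shutil
import subprocess


def libpq_detection_status() -> dict:
    """Report which detection routes from the error message would succeed."""
    pkg_config = shutil.which("pkg-config")
    pg_config = shutil.which("pg_config")

    pkg_config_finds_libpq = False
    if pkg_config is not None:
        # `pkg-config --exists libpq` exits with 0 only if a libpq.pc file is found.
        result = subprocess.run([pkg_config, "--exists", "libpq"], check=False)
        pkg_config_finds_libpq = result.returncode == 0

    return {
        "pkg_config_finds_libpq": pkg_config_finds_libpq,  # route (i)
        "pg_config_on_path": pg_config is not None,  # route (ii)
    }


if __name__ == "__main__":
    print(libpq_detection_status())
```

If both values are False, libpq is most likely not installed or not discoverable, which matches the situation in the error above.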

Timezone environment variable for tidyverse (relevant for WSL2)

One of the R packages, tidyverse, might need access to the TZ environment variable during installation. On Ubuntu 20.04 on WSL2, this triggers the following error:

> install.packages('tidyverse')

ERROR: configuration failed for package xml2
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to create bus connection: Host is down
Warning in system("timedatectl", intern = TRUE) :
  running command 'timedatectl' had status 1
Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]) :
  namespace xml2 1.3.1 is already loaded, but >= 1.3.2 is required
Calls: <Anonymous> ... namespaceImportFrom -> asNamespace -> loadNamespace
Execution halted
ERROR: lazy loading failed for package tidyverse

This happens because WSL2 is not booted with systemd, so the timedatectl command, which would otherwise provide the time zone, cannot run:

~$ timedatectl
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to create bus connection: Host is down

and later

Warning message:
In system("timedatectl", intern = TRUE) :
  running command 'timedatectl' had status 1
Execution halted

This can be amended by setting and exporting the environment variable manually in the shell before attempting to install tidyverse:

export TZ='Europe/Ljubljana'
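
If the installation is scripted, the variable can also be passed to the R process directly. This is a rough sketch, not part of the project. The function names are hypothetical, it assumes Rscript is on the PATH, and the CRAN mirror URL is only an example choice.

```python
import os
import subprocess


def env_with_timezone(tz: str = "Europe/Ljubljana") -> dict:
    """Return a copy of the current environment with TZ set."""
    env = os.environ.copy()
    env["TZ"] = tz
    return env


def install_tidyverse(tz: str = "Europe/Ljubljana") -> int:
    """Run the R installation with TZ set and return Rscript's exit code."""
    # The repos argument is an assumption; any CRAN mirror can be used.
    r_expression = "install.packages('tidyverse', repos='https://cloud.r-project.org')"
    command = ["Rscript", "-e", r_expression]
    return subprocess.run(command, env=env_with_timezone(tz), check=False).returncode
```

Copying the environment rather than modifying os.environ keeps the change local to the R process.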

Possible runtime issues

Unix end of line characters

Upon running rapids, an error might occur:

/usr/bin/env: python3\r: No such file or directory

This is due to Windows-style end-of-line characters. To amend this, I added a .gitattributes file to force git to check out rapids using Unix EOL characters. If this still fails, dos2unix can be used to change them.
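
Where dos2unix is not available, the same check and conversion can be sketched in Python. The function names below are hypothetical, and the conversion simply replaces every CRLF pair with LF, so it should only be applied to text files.

```python
from pathlib import Path


def has_crlf_shebang(path: Path) -> bool:
    """True if the file's first line is a shebang ending in a carriage return."""
    with path.open("rb") as handle:
        first_line = handle.readline()
    return first_line.startswith(b"#!") and first_line.endswith(b"\r\n")


def convert_to_unix_eol(path: Path) -> bool:
    """Rewrite CRLF as LF in place and return True if the file changed."""
    original = path.read_bytes()
    converted = original.replace(b"\r\n", b"\n")
    if converted == original:
        return False
    path.write_bytes(converted)
    return True
```

A shebang line that ends in `\r` is what produces the `python3\r: No such file or directory` message, because the carriage return becomes part of the interpreter name.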

System has not been booted with systemd as init system (PID 1)

See the installation issue described above under Timezone environment variable for tidyverse (relevant for WSL2).