pull/108/head v0.3.1
JulioV 2020-12-21 16:30:46 -05:00
parent 29e3d9bf37
commit 2b1f3f230c
15 changed files with 74 additions and 23 deletions

View File

@ -1,7 +1,7 @@
name: docker
on:
release:
types: [published, edited]
types: [published, edited, released]
jobs:
main:
runs-on: ubuntu-20.04

View File

@ -1,6 +1,12 @@
# Change Log
## 0.3.0
## v0.3.1
- Update installation docs for RAPIDS' docker container
- Fix example analysis use of accelerometer data in a plot
- Update FAQ
- Update minimal example documentation
- Minor doc updates
## v0.3.0
- Update R and Python virtual environments
- Add GH actions CI support for tests and docker
- Add release and test badges to README

View File

@ -207,10 +207,21 @@
This problem is due to the way `RMariaDB` handles a mismatch between data types in R and MySQL (see [this issue](https://github.com/r-dbi/RMariaDB/issues/121)). Since it seems this problem won't be handled by `RMariaDB`, you have two options:
1. If it's only a few rows that are causing this problem, remove the null character from the offending table cell.
2. If it's not feasible to modify your data you can try swapping `RMariaDB` with `RMySQL`. Just have in mind you might have problems connecting to modern MySQL servers running in Liunx:
2. If it's not feasible to modify your data, you can try swapping `RMariaDB` with `RMySQL`. Keep in mind that you might have problems connecting to modern MySQL servers running on Linux:
- Add `RMySQL` to the renv environment by running the following command in a terminal opened in RAPIDS' root folder
```bash
R -e 'renv::install("RMySQL")'
```
- Go to `src/data/download_phone_data.R` and replace `library(RMariaDB)` with `library(RMySQL)`
- In the same file replace `dbEngine <- dbConnect(MariaDB(), default.file = "./.env", group = group)` with `dbEngine <- dbConnect(MySQL(), default.file = "./.env", group = group)`
- Go to `src/data/download_phone_data.R` or `src/data/download_fitbit_data.R` and replace `library(RMariaDB)` with `library(RMySQL)`
- In the same file replace `dbEngine <- dbConnect(MariaDB(), default.file = "./.env", group = group)` with `dbEngine <- dbConnect(MySQL(), default.file = "./.env", group = group)`
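The two replacements described above can also be scripted. The following is a minimal sketch, not part of RAPIDS: `swap_to_rmysql` and `patch_file` are hypothetical helper names, and the code only rewrites the two strings quoted in the steps above.

```python
from pathlib import Path

# The two literal substitutions described in the steps above.
REPLACEMENTS = [
    ("library(RMariaDB)", "library(RMySQL)"),
    ("dbConnect(MariaDB(),", "dbConnect(MySQL(),"),
]


def swap_to_rmysql(source: str) -> str:
    """Return the R source with the RMariaDB calls swapped for RMySQL."""
    for old, new in REPLACEMENTS:
        source = source.replace(old, new)
    return source


def patch_file(path: str) -> None:
    """Rewrite one R script in place (hypothetical helper; back up the file first)."""
    script = Path(path)
    script.write_text(swap_to_rmysql(script.read_text()))


# Demonstration on an in-memory string instead of a real file.
example = 'library(RMariaDB)\ndbEngine <- dbConnect(MariaDB(), default.file = "./.env", group = group)\n'
print(swap_to_rmysql(example))
```

To apply it to a real script you would call `patch_file` with the path of the download script you want to change, after making a backup.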
## There is no package called `RMariaDB`
???+ failure "Problem"
You get the following error when executing RAPIDS:
```bash
Error in library(RMariaDB) : there is no package called 'RMariaDB'
Execution halted
```
???+ done "Solution"
In RAPIDS v0.1.0 we replaced the `RMySQL` R package with `RMariaDB`. This error means your R virtual environment is out of date; to update it, run `snakemake -j1 renv_restore`

View File

@ -21,6 +21,7 @@ We provide examples of the input format that RAPIDS expects, note that both exam
|a748ee1a-1d0b-4ae9-9074-279a2b6ba524 |{"activities-heart":[{"dateTime":"2020-10-09","value":{"customHeartRateZones":[],"heartRateZones":[{"caloriesOut":750.3615,"max":77,"min":30,"minutes":851,"name":"Out of Range"},{"caloriesOut":734.1516,"max":107,"min":77,"minutes":550,"name":"Fat Burn"},{"caloriesOut":131.8579,"max":130,"min":107,"minutes":29,"name":"Cardio"},{"caloriesOut":0,"max":220,"min":130,"minutes":0,"name":"Peak"}],"restingHeartRate":69}}],"activities-heart-intraday":{"dataset":[{"time":"00:00:00","value":90},{"time":"00:01:00","value":89},{"time":"00:02:00","value":88},...],"datasetInterval":1,"datasetType":"minute"}}
=== "PLAIN_TEXT"
All columns are mandatory; however, all except `device_id` and `local_date_time` can be empty if you don't have that data. Keep in mind that some features will be empty if some of these columns are empty.
|device_id |local_date_time |heartrate |heartrate_zone |
|-------------------------------------- |---------------------- |--------- |--------------- |

View File

@ -21,6 +21,7 @@ We provide examples of the input format that RAPIDS expects, note that both exam
|a748ee1a-1d0b-4ae9-9074-279a2b6ba524 |{"activities-heart":[{"dateTime":"2020-10-09","value":{"customHeartRateZones":[],"heartRateZones":[{"caloriesOut":750.3615,"max":77,"min":30,"minutes":851,"name":"Out of Range"},{"caloriesOut":734.1516,"max":107,"min":77,"minutes":550,"name":"Fat Burn"},{"caloriesOut":131.8579,"max":130,"min":107,"minutes":29,"name":"Cardio"},{"caloriesOut":0,"max":220,"min":130,"minutes":0,"name":"Peak"}],"restingHeartRate":69}}],"activities-heart-intraday":{"dataset":[{"time":"00:00:00","value":90},{"time":"00:01:00","value":89},{"time":"00:02:00","value":88},...],"datasetInterval":1,"datasetType":"minute"}}
=== "PLAIN_TEXT"
All columns are mandatory; however, all except `device_id` and `local_date_time` can be empty if you don't have that data. Keep in mind that some features will be empty if some of these columns are empty.
|device_id |local_date_time |heartrate_daily_restinghr |heartrate_daily_caloriesoutofrange |heartrate_daily_caloriesfatburn |heartrate_daily_caloriescardio |heartrate_daily_caloriespeak |
|-------------------------------------- |----------------- |------- |-------------- |------------- |------------ |-------|

View File

@ -22,6 +22,8 @@ We provide examples of the input format that RAPIDS expects, note that both exam
=== "PLAIN_TEXT"
All columns are mandatory; however, all except `device_id` and `local_date_time` can be empty if you don't have that data. Keep in mind that some features will be empty if some of these columns are empty.
|device_id |local_start_date_time |local_end_date_time |efficiency |minutes_after_wakeup |minutes_asleep |minutes_awake |minutes_to_fall_asleep |minutes_in_bed |is_main_sleep |type |count_awake |duration_awake |count_awakenings |count_restless |duration_restless |
|-------------------------------------- |---------------------- |---------------------- |----------- |--------------------- |--------------- |-------------- |----------------------- |--------------- |-------------- |-------- |----------- |--------------- |----------------- |--------------- |------------------ |
|a748ee1a-1d0b-4ae9-9074-279a2b6ba524 |2020-10-07 15:55:00 |2020-10-07 18:10:00 |91 |0 |123 |12 |0 |135 |1 |classic |2 |3 |10 |8 |9 |
@ -40,6 +42,7 @@ We provide examples of the input format that RAPIDS expects, note that both exam
|a748ee1a-1d0b-4ae9-9074-279a2b6ba524 |{"sleep":[{"dateOfSleep":"2020-10-12","duration":28980000,"efficiency":93,"endTime":"2020-10-12T09:34:30.000","infoCode":0,"isMainSleep":true,"levels":{"data":[{"dateTime":"2020-10-12T01:31:00.000","level":"wake","seconds":600},{"dateTime":"2020-10-12T01:41:00.000","level":"light","seconds":60},{"dateTime":"2020-10-12T01:42:00.000","level":"deep","seconds":2340},...], "summary":{"deep":{"count":4,"minutes":63,"thirtyDayAvgMinutes":59},"light":{"count":27,"minutes":257,"thirtyDayAvgMinutes":364},"rem":{"count":5,"minutes":94,"thirtyDayAvgMinutes":58},"wake":{"count":24,"minutes":69,"thirtyDayAvgMinutes":95}}},"logId":26589710673,"minutesAfterWakeup":0,"minutesAsleep":415,"minutesAwake":68,"minutesToFallAsleep":0,"startTime":"2020-10-12T01:31:00.000","timeInBed":483,"type":"stages"}],"summary":{"stages":{"deep":63,"light":257,"rem":94,"wake":69},"totalMinutesAsleep":415,"totalSleepRecords":1,"totalTimeInBed":483}}
=== "PLAIN_TEXT"
All columns are mandatory; however, all except `device_id` and `local_date_time` can be empty if you don't have that data. Keep in mind that some features will be empty if some of these columns are empty.
|device_id |local_start_date_time |local_end_date_time |efficiency |minutes_after_wakeup |minutes_asleep |minutes_awake |minutes_to_fall_asleep |minutes_in_bed |is_main_sleep |type |
|-------------------------------------- |---------------------- |---------------------- |----------- |--------------------- |--------------- |-------------- |----------------------- |--------------- |-------------- |-------- |

View File

@ -21,6 +21,7 @@ We provide examples of the input format that RAPIDS expects, note that both exam
|a748ee1a-1d0b-4ae9-9074-279a2b6ba524 |"activities-steps":[{"dateTime":"2020-10-09","value":"998"}],"activities-steps-intraday":{"dataset":[{"time":"00:00:00","value":0},{"time":"00:01:00","value":0},{"time":"00:02:00","value":0},...],"datasetInterval":1,"datasetType":"minute"}}
=== "PLAIN_TEXT"
All columns are mandatory.
|device_id |local_date_time |steps |
|-------------------------------------- |---------------------- |--------- |

View File

@ -21,6 +21,7 @@ We provide examples of the input format that RAPIDS expects, note that both exam
|a748ee1a-1d0b-4ae9-9074-279a2b6ba524 |"activities-steps":[{"dateTime":"2020-10-09","value":"998"}],"activities-steps-intraday":{"dataset":[{"time":"00:00:00","value":0},{"time":"00:01:00","value":0},{"time":"00:02:00","value":0},...],"datasetInterval":1,"datasetType":"minute"}}
=== "PLAIN_TEXT"
All columns are mandatory.
|device_id |local_date_time |steps |
|-------------------------------------- |---------------------- |--------- |

View File

@ -0,0 +1,9 @@
"_id","timestamp","device_id","call_type","call_duration","trace"
1,1587663260695,"a748ee1a-1d0b-4ae9-9074-279a2b6ba524",2,14,"d5e84f8af01b2728021d4f43f53a163c0c90000c"
2,1587739118007,"a748ee1a-1d0b-4ae9-9074-279a2b6ba524",3,0,"47c125dc7bd163b8612cdea13724a814917b6e93"
5,1587746544891,"a748ee1a-1d0b-4ae9-9074-279a2b6ba524",2,95,"9cc793ffd6e88b1d850ce540b5d7e000ef5650d4"
6,1587911379859,"a748ee1a-1d0b-4ae9-9074-279a2b6ba524",2,63,"51fb9344e988049a3fec774c7ca622358bf80264"
7,1587992647361,"a748ee1a-1d0b-4ae9-9074-279a2b6ba524",3,0,"2a862a7730cfdfaf103a9487afe3e02935fd6e02"
8,1588020039448,"a748ee1a-1d0b-4ae9-9074-279a2b6ba524",1,11,"a2c53f6a086d98622c06107780980cf1bb4e37bd"
11,1588176189024,"a748ee1a-1d0b-4ae9-9074-279a2b6ba524",2,65,"56589df8c830c70e330b644921ed38e08d8fd1f3"
12,1588197745079,"a748ee1a-1d0b-4ae9-9074-279a2b6ba524",3,0,"cab458018a8ed3b626515e794c70b6f415318adc"

View File

@ -32,12 +32,12 @@ When you are done with this configuration, go to [executing RAPIDS](../execution
!!! warning
The label `MY_GROUP` is arbitrary but it has to match the following `config.yaml` key:
```yaml
DATABASE_GROUP: &database_group
MY_GROUP
```
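One way to confirm that the label matches before running the pipeline is sketched below. It assumes the `.env` file uses INI-style `[GROUP]` sections, as MySQL option files do; the keys and values in the sample and the helper name `has_credentials_group` are illustrative and not part of RAPIDS.

```python
import configparser

# Hypothetical .env content; the keys and values are placeholders.
SAMPLE_ENV = """
[MY_GROUP]
user=rapids_user
password=change_me
host=127.0.0.1
port=3306
database=rapids_example
"""


def has_credentials_group(env_text: str, group: str) -> bool:
    """Return True if env_text contains a [group] section."""
    parser = configparser.ConfigParser()
    parser.read_string(env_text)
    return parser.has_section(group)


print(has_credentials_group(SAMPLE_ENV, "MY_GROUP"))     # True
print(has_credentials_group(SAMPLE_ENV, "OTHER_GROUP"))  # False
```

Section names are compared case-sensitively, so `my_group` would not match `MY_GROUP`.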
!!! hint
If you are using RAPIDS' docker container and Docker-for-mac or Docker-for-Windows 18.03+, connect to your MySQL database using the host `host.docker.internal` instead of `127.0.0.1`
!!! note
You can ignore this step if you are only processing Fitbit data in CSV files.
---

View File

@ -8,14 +8,14 @@ You can install RAPIDS using Docker (the fastest), or native instructions for Ma
2. Pull our RAPIDS container
``` bash
docker pull agamk/rapids:latest`
docker pull moshiresearch/rapids:latest
```
3. Run RAPIDS\' container (after this step is done you should see a
prompt in the main RAPIDS folder with its python environment active)
``` bash
docker run -it agamk/rapids:latest
docker run -it moshiresearch/rapids:latest
```
4. Pull the latest version of RAPIDS
@ -41,8 +41,8 @@ You can install RAPIDS using Docker (the fastest), or native instructions for Ma
- Install the [Remote - Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers)
- Go to the `Remote Explorer` panel on the left hand sidebar
- On the top right dropdown menu choose `Containers`
- Double click on the `agamk/rapids` container in the`CONTAINERS` tree
- A new VS Code session should open on RAPIDS main folder insidethe container.
- Double click on the `moshiresearch/rapids` container in the `CONTAINERS` tree
- A new VS Code session should open on RAPIDS main folder inside the container.
=== "MacOS"
We tested these instructions in Catalina

View File

@ -38,7 +38,7 @@ In total, our example workflow has nine steps that are in charge of sensor data
## Configure and run the analysis workflow example
1. [Install](../../setup/installation) RAPIDS
2. Configure the [user credentials](../../setup/configuration/#database-credentials) of a local or remote MySQL server with writing permissions in your `.env` file. The config file where you need to modify the `DATABASE_GROUP` is at `example_profile/example_config.yaml`.
3. Unzip the [test database](https://osf.io/skqfv/files/) to `data/external/rapids_example.sql` and run:
3. *Skip this step if you are using RAPIDS docker container*. Unzip the [test database](https://osf.io/skqfv/files/) to `data/external/rapids_example.sql` and run:
```bash
./rapids -j1 restore_sql_file --profile example_profile
```

View File

@ -1,40 +1,57 @@
Minimal Working Example
=======================
This is a quick guide for creating and running a simple pipeline to extract missing, outgoing, and incoming call features for `daily` and `night` epochs of one participant monitored on the US East coast.
This is a quick guide for creating and running a simple pipeline to extract missing, outgoing, and incoming `call` features for `daily` (`00:00:00` to `23:59:59`) and `night` (`00:00:00` to `05:59:59`) epochs of every day of data of one participant monitored on the US East coast with an Android smartphone.
!!! hint
If you don't have `call` data that you can use to try this example you can restore this [CSV file](../img/calls.csv) as a table in a MySQL database.
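Before restoring call data, it can help to check what it contains. The sketch below parses a few rows in the same format as the sample call data with Python's `csv` module and counts calls per type. The mapping of `call_type` codes to names (1 incoming, 2 outgoing, 3 missed) is an assumption based on AWARE's conventions, so verify it against your own data; `count_call_types` is a hypothetical helper.

```python
import csv
import io
from collections import Counter

# Assumed meaning of the call_type codes (AWARE convention); verify for your data.
CALL_TYPES = {"1": "incoming", "2": "outgoing", "3": "missed"}

SAMPLE_CSV = '''"_id","timestamp","device_id","call_type","call_duration","trace"
1,1587663260695,"a748ee1a-1d0b-4ae9-9074-279a2b6ba524",2,14,"d5e84f8af01b2728021d4f43f53a163c0c90000c"
2,1587739118007,"a748ee1a-1d0b-4ae9-9074-279a2b6ba524",3,0,"47c125dc7bd163b8612cdea13724a814917b6e93"
5,1587746544891,"a748ee1a-1d0b-4ae9-9074-279a2b6ba524",2,95,"9cc793ffd6e88b1d850ce540b5d7e000ef5650d4"
'''


def count_call_types(csv_text: str) -> Counter:
    """Count rows per call type, using the assumed code-to-name mapping."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(CALL_TYPES.get(row["call_type"], "unknown") for row in reader)


counts = count_call_types(SAMPLE_CSV)
print(counts)
```

For a full file, read its text from disk and pass it to `count_call_types` in place of `SAMPLE_CSV`.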
1. Install RAPIDS and make sure your `conda` environment is active (see [Installation](../../setup/installation))
2. Make the changes listed below for the corresponding [Configuration](../../setup/configuration) step (we provide an example of what the relevant sections in your `config.yaml` will look like after you are done)
??? info "Things to change on each configuration step"
1\. Setup your database connection credentials in `.env`. We assume your credentials group is called `MY_GROUP`.
??? info "Required configuration changes"
1. **Add your [database credentials](../../setup/configuration#database-credentials).**
Set up your database connection credentials in `.env`. We assume your credentials group in the `.env` file is called `MY_GROUP`.
2\. `America/New_York` should be the default timezone
2. **Choose the [timezone of your study](../../setup/configuration#timezone-of-your-study).**
Since this example processes data collected on the US East coast, `America/New_York` should be the configured timezone. Change this according to your data.
3\. Create a participant file `p01.yaml` based on one of your participants and add `p01` to `[PIDS]` in `config.yaml`. The following would be the content of your `p01.yaml` participant file:
3. **Create your [participants files](../../setup/configuration#participant-files).**
Since we are processing data from a single participant, you only need to create a single participant file called `p01.yaml`. This participant file only has a `PHONE` section because this hypothetical participant was only monitored with a smartphone. You also need to add `p01` to `[PIDS]` in `config.yaml`. The following would be the content of your `p01.yaml` participant file:
```yaml
PHONE:
DEVICE_IDS: [aaaaaaaa-1111-bbbb-2222-cccccccccccc] # your participant's AWARE device id
DEVICE_IDS: [a748ee1a-1d0b-4ae9-9074-279a2b6ba524] # the participant's AWARE device id
PLATFORMS: [android] # or ios
LABEL: MyTestP01 # any string
START_DATE: 2020-01-01 # this can also be empty
END_DATE: 2021-01-01 # this can also be empty
```
4\. `[TIME_SEGMENTS][TYPE]` should be the default `PERIODIC`. Change `[TIME_SEGMENTS][FILE]` with the path of a file containing the following lines:
4. **Select what [time segments](../../setup/configuration#time-segments) you want to extract features on.**
`[TIME_SEGMENTS][TYPE]` should be the default `PERIODIC`. Set `[TIME_SEGMENTS][FILE]` to the path (for example `data/external/timesegments_periodic.csv`) of a file containing the following lines:
```csv
label,start_time,length,repeats_on,repeats_value
daily,00:00:00,23H 59M 59S,every_day,0
night,00:00:00,5H 59M 59S,every_day,0
```
5\. If you collected data with AWARE you won't need to modify the attributes of `[DEVICE_DATA][PHONE]`
5. **Modify your [device data source configuration](../../setup/configuration#device-data-source-configuration)**
In this example we do not need to modify this section because we are using smartphone data collected with AWARE stored on a MySQL database.
6\. Set `[PHONE_CALLS][PROVIDERS][RAPIDS][COMPUTE]` to `True`
6. **Select what [sensors and features](../../setup/configuration#sensor-and-features-to-process) you want to process.**
Set `[PHONE_CALLS][PROVIDERS][RAPIDS][COMPUTE]` to `True` in the `config.yaml` file.
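As a check on the time segments file from step 4, the sketch below parses the `length` column and computes when each segment ends. The result should match the `daily` and `night` ranges given at the top of this guide. The regular expression and helper names are illustrative; RAPIDS' own parser may differ.

```python
import csv
import io
import re
from datetime import datetime, timedelta

TIME_SEGMENTS = """label,start_time,length,repeats_on,repeats_value
daily,00:00:00,23H 59M 59S,every_day,0
night,00:00:00,5H 59M 59S,every_day,0
"""

# Optional hours, minutes and seconds parts, e.g. "23H 59M 59S".
LENGTH_PATTERN = re.compile(r"(?:(\d+)H)?\s*(?:(\d+)M)?\s*(?:(\d+)S)?")


def parse_length(length: str) -> timedelta:
    """Convert a string such as '23H 59M 59S' into a timedelta."""
    match = LENGTH_PATTERN.fullmatch(length.strip())
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognised length: {length!r}")
    hours, minutes, seconds = (int(part or 0) for part in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def segment_end_times(csv_text: str) -> dict:
    """Map each segment label to its end time of day (HH:MM:SS)."""
    ends = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        start = datetime.strptime(row["start_time"], "%H:%M:%S")
        ends[row["label"]] = (start + parse_length(row["length"])).strftime("%H:%M:%S")
    return ends


print(segment_end_times(TIME_SEGMENTS))
# {'daily': '23:59:59', 'night': '05:59:59'}
```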
??? example "Example of the `config.yaml` sections after the changes outlined above"
```
Highlighted lines are related to the configuration steps above.
``` yaml hl_lines="1 4 7 12 13 38"
PIDS: [p01]
TIMEZONE: &timezone

View File

@ -328,7 +328,7 @@ HEATMAP_SENSORS_PER_MINUTE_PER_TIME_SEGMENT:
HEATMAP_SENSOR_ROW_COUNT_PER_TIME_SEGMENT:
PLOT: True
SENSORS: [PHONE_ACCELEROMETER, PHONE_ACTIVITY_RECOGNITION, PHONE_APPLICATIONS_FOREGROUND, PHONE_BATTERY, PHONE_BLUETOOTH, PHONE_CALLS, PHONE_CONVERSATION, PHONE_LIGHT, PHONE_LOCATIONS, PHONE_MESSAGES, PHONE_SCREEN, PHONE_WIFI_CONNECTED, PHONE_WIFI_VISIBLE]
SENSORS: [PHONE_ACTIVITY_RECOGNITION, PHONE_APPLICATIONS_FOREGROUND, PHONE_BATTERY, PHONE_BLUETOOTH, PHONE_CALLS, PHONE_CONVERSATION, PHONE_LIGHT, PHONE_LOCATIONS, PHONE_MESSAGES, PHONE_SCREEN, PHONE_WIFI_CONNECTED, PHONE_WIFI_VISIBLE]
HEATMAP_PHONE_DATA_YIELD_PER_PARTICIPANT_PER_TIME_SEGMENT:
PLOT: True

renv/.gitignore vendored
View File

@ -1,3 +1,4 @@
lock/
library/
python/
staging/