**Error in .local(drv, \...) :** **Failed to connect to database: Error:
+Problem
**Error in .local(drv, \...) :** **Failed to connect to database: Error:
Can\'t initialize character set unknown (path: compiled\_in)** :
Calls: dbConnect -> dbConnect -> .local -> .Call
Execution halted
[Tue Mar 10 19:40:15 2020]
-Error in rule download_dataset:
+Error in rule download_dataset:
jobid: 531
output: data/raw/p60/locations_raw.csv
RuleException:
-CalledProcessError in line 20 of /home/ubuntu/rapids/rules/preprocessing.snakefile:
+CalledProcessError in line 20 of /home/ubuntu/rapids/rules/preprocessing.snakefile:
Command 'set -euo pipefail; Rscript --vanilla /home/ubuntu/rapids/.snakemake/scripts/tmp_2jnvqs7.download_dataset.R' returned non-zero exit status 1.
-File "/home/ubuntu/rapids/rules/preprocessing.snakefile", line 20, in __rule_download_dataset
-File "/home/ubuntu/anaconda3/envs/moshi-env/lib/python3.7/concurrent/futures/thread.py", line 57, in run
+File "/home/ubuntu/rapids/rules/preprocessing.snakefile", line 20, in __rule_download_dataset
+File "/home/ubuntu/anaconda3/envs/moshi-env/lib/python3.7/concurrent/futures/thread.py", line 57, in run
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
@@ -1987,7 +2031,7 @@ Exiting because a job execution failed. Look above for er
Error Table XXX doesn't exist while running the download_phone_data or download_fitbit_data rule.
-Problem
Error in .local(conn, statement, ...) :
+Problem
Error in .local(conn, statement, ...) :
could not run statement: Table 'db_name.table_name' doesn't exist
Calls: colnames ... .local -> dbSendQuery -> dbSendQuery -> .local -> .Call
Execution halted
@@ -2110,7 +2154,7 @@ ERROR: configuration failed for package Problem
You get the following error:
CondaMultiError: CondaVerificationError: The package for tk located at /home/ubuntu/miniconda2/pkgs/tk-8.6.9-hed695b0_1003
appears to be corrupted. The path 'include/mysqlStubs.h'
- specified in the package manifest cannot be found.
+ specified in the package manifest cannot be found.
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge/linux-64::llvm-openmp-10.0.0-hc9558a2_0, anaconda/linux-64::intel-openmp-2019.4-243
path: 'lib/libiomp5.so'
@@ -2120,8 +2164,8 @@ ClobberError: This transaction has incompatible packages due to a shared path.
You get the following error when downloading sensor data:
-
Error in result_fetch(res@ptr, n= n) :
- embedded nul in string:
+
Error in result_fetch(res@ptr, n= n) :
+ embedded nul in string:
Solution
This problem is due to the way RMariaDB handles a mismatch between data types in R and MySQL (see this issue). Since it seems this problem won’t be handled by RMariaDB, you have two options:
@@ -2141,7 +2185,7 @@ ClobberError: This transaction has incompatible packages due to a shared path.
A data stream is a set of sensor data collected using a specific type of device with a specific format and stored in a specific container. RAPIDS is agnostic to data streams’ formats and container; see the Data Streams Introduction for a list of supported streams.
A container is queried with an R or Python script that connects to the database, API or file where your stream’s raw data is stored.
A data stream is a set of sensor data collected using a specific type of device with a specific format and stored in a specific container.
For example, the aware_mysql data stream handles smartphone data (device) collected with the AWARE Framework (format) stored in a MySQL database (container). Similarly, smartphone data collected with Beiwe will have a different format and could be stored in a container like a PostgreSQL database or a CSV file.
This data stream handles Fitbit sensor data downloaded using the Fitbit Web API and stored in a CSV file. Please note that RAPIDS cannot query the API directly; you need to use other available tools or implement your own. Once you have your sensor data in a CSV file, RAPIDS can process it.
This data stream handles Fitbit sensor data downloaded using the Fitbit Web API and stored in a MySQL database. Please note that RAPIDS cannot query the API directly; you need to use other available tools or implement your own. Once you have your sensor data in a MySQL database, RAPIDS can process it.
This data stream handles Fitbit sensor data downloaded using the Fitbit Web API, parsed, and stored in a CSV file. Please note that RAPIDS cannot query the API directly; you need to use other available tools or implement your own. Once you have your parsed sensor data in a CSV file, RAPIDS can process it.
This data stream handles Fitbit sensor data downloaded using the Fitbit Web API, parsed, and stored in a MySQL database. Please note that RAPIDS cannot query the API directly; you need to use other available tools or implement your own. Once you have your parsed sensor data in a MySQL database, RAPIDS can process it.
We use mkdocs with the material theme to write these docs. Whenever you make any changes, just push them back to the repo and the documentation will be deployed automatically.
Along with the continued development and the addition of new sensors and features to the RAPIDS pipeline, tests for the currently available sensors and features are being implemented. Since this is a work in progress, this page will be updated with the list of sensors and features for which testing is available. For each of the sensors listed, a description of the data used for testing (test cases) is outlined. Currently, for all intents and testing purposes, tests/data/raw/test01/ contains all the test data files for testing Android data formats and tests/data/raw/test02/ contains all the test data files for testing iOS data formats. It follows that the expected (verified) outputs are contained in tests/data/processed/test01/ and tests/data/processed/test02/ for Android and iOS, respectively. tests/data/raw/test03/ and tests/data/raw/test04/ contain data files for testing empty raw data files for Android and iOS, respectively.
The following is a list of the sensors for which testing is currently available.
A behavioral feature is a metric computed from raw sensor data that quantifies the behavior of a participant. For example, the time spent at home can be computed from location data. These features are also known as digital biomarkers.
RAPIDS’ config.yaml has a section for each supported device/sensor (e.g., PHONE_ACCELEROMETER, FITBIT_STEPS, EMPATICA_HEARTRATE). These sections follow a similar structure, and they can have one or more feature PROVIDERS, that compute one or more behavioral features. You will modify the parameters of these PROVIDERS to obtain features from different mobile sensors. We’ll use PHONE_ACCELEROMETER as an example to explain this further.
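As a rough illustration of the sensor/PROVIDERS layout described above, the sketch below mirrors one sensor section as a plain Python dictionary and lists the providers that are switched on. The provider names and the `COMPUTE` and `FEATURES` keys are assumptions made for this example, not a copy of the real config.yaml; check your own file for the actual parameters.

```python
# Hypothetical in-memory mirror of one config.yaml sensor section.
# Provider names, COMPUTE and FEATURES are illustrative assumptions.
config = {
    "PHONE_ACCELEROMETER": {
        "PROVIDERS": {
            "PROVIDER_A": {"COMPUTE": True, "FEATURES": ["maxmagnitude", "minmagnitude"]},
            "PROVIDER_B": {"COMPUTE": False, "FEATURES": ["avgmagnitude"]},
        }
    }
}


def enabled_providers(config, sensor):
    """Return the names of the providers of `sensor` whose COMPUTE flag is truthy."""
    providers = config[sensor]["PROVIDERS"]
    return [name for name, params in providers.items() if params.get("COMPUTE")]


print(enabled_providers(config, "PHONE_ACCELEROMETER"))
```

In this sketch, obtaining features from a different provider amounts to flipping its flag and editing its parameters inside the sensor's section.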
We use Fitbit heart rate intraday data to extract data yield features. Fitbit data yield features can be used to remove rows (time segments) that do not contain enough Fitbit data. You should decide what your “enough” threshold is depending on the time a participant was supposed to be wearing their Fitbit, the length of your study, and the rates of missing data that your analysis could handle.
@@ -1821,9 +1865,9 @@
We recommend using ratiovalidyieldedminutes on time segments that are shorter than two or three hours and ratiovalidyieldedhours for longer segments. This is because relying on yielded minutes only can be misleading when a big chunk of those missing minutes are clustered together.
For example, let’s assume we are working with a 24-hour time segment that is missing 12 hours of data. Two extreme cases can occur:
-
the 12 missing hours are from the beginning of the segment or
-
30 minutes could be missing from every hour (24 * 30 minutes = 12 hours).
-
+
the 12 missing hours are from the beginning of the segment or
+
30 minutes could be missing from every hour (24 * 30 minutes = 12 hours).
+
ratiovalidyieldedminutes would be 0.5 for both cases, a (the first) and b (the second), hinting that the missing circumstances are similar. However, ratiovalidyieldedhours would be 0.5 for a and 1.0 for b if [MINUTE_RATIO_THRESHOLD_FOR_VALID_YIELDED_HOURS] is between 0.0 and 0.49 (hinting that the missing circumstances might be more favorable for b). In other words, sensed data for b is more evenly spread compared to a.
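The arithmetic behind the two extreme cases can be checked with a small sketch. It is not RAPIDS' implementation: the function name, the per-minute 0/1 encoding, and the use of `>=` when comparing an hour's ratio against the threshold are all assumptions made for illustration.

```python
def yield_ratios(valid_minutes, minute_ratio_threshold):
    """Compute (ratio of valid minutes, ratio of valid hours) for one time segment.

    valid_minutes: list of 0/1 flags, one per minute (1 = data was sensed).
    minute_ratio_threshold: share of valid minutes an hour needs to count as
    valid (stands in for MINUTE_RATIO_THRESHOLD_FOR_VALID_YIELDED_HOURS).
    """
    total = len(valid_minutes)
    ratio_minutes = sum(valid_minutes) / total

    # Split the segment into consecutive 60-minute windows.
    hours = [valid_minutes[i:i + 60] for i in range(0, total, 60)]
    # Assumption: an hour is valid when its ratio is >= the threshold.
    valid_hours = sum(1 for h in hours if sum(h) / len(h) >= minute_ratio_threshold)
    ratio_hours = valid_hours / len(hours)
    return ratio_minutes, ratio_hours


# Case a: the first 12 hours of a 24-hour segment are missing.
case_a = [0] * (12 * 60) + [1] * (12 * 60)
# Case b: 30 minutes are missing from every hour.
case_b = ([0] * 30 + [1] * 30) * 24

print(yield_ratios(case_a, 0.4))  # (0.5, 0.5)
print(yield_ratios(case_b, 0.4))  # (0.5, 1.0)
```

Both cases share the same minute-level ratio, while the hour-level ratio separates the clustered gap from the evenly spread one.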
Sensor parameters description for [FITBIT_SLEEP_INTRADAY]:
@@ -2129,9 +2173,9 @@
How do we assign sleep episodes to specific dates?
START_TIME and LENGTH control the dates that sleep episodes belong to. For a pair of [START_TIME] and [LENGTH], sleep episodes (blue boxes) can only be placed at the following places:
+
+ Relationship between sleep episodes and the given times ([START_TIME], [LENGTH])
+
If the end time of a sleep episode is before [START_TIME], it will belong to the day before its start date (e.g. sleep episode #1).
Sensor parameters description for [PHONE_BLUETOOTH]:
@@ -2039,6 +2083,7 @@ least frequent across dataset: '4DC7A22D-9F1F-4DEF-8576-0
+
@@ -2118,10 +2163,10 @@ least frequent across dataset: '4DC7A22D-9F1F-4DEF-8576-0
This is a combinatorial sensor, which means that we use the data from multiple sensors to extract data yield features. Data yield features can be used to remove rows (time segments) that do not contain enough data. You should decide what your “enough” threshold is depending on the type of sensors you collected (frequency vs. event based, e.g. accelerometer vs. calls), the length of your study, and the rates of missing data that your analysis could handle.
@@ -1845,9 +1889,9 @@
We recommend using ratiovalidyieldedminutes on time segments that are shorter than two or three hours and ratiovalidyieldedhours for longer segments. This is because relying on yielded minutes only can be misleading when a big chunk of those missing minutes are clustered together.
For example, let’s assume we are working with a 24-hour time segment that is missing 12 hours of data. Two extreme cases can occur:
-
the 12 missing hours are from the beginning of the segment or
-
30 minutes could be missing from every hour (24 * 30 minutes = 12 hours).
-
+
the 12 missing hours are from the beginning of the segment or
+
30 minutes could be missing from every hour (24 * 30 minutes = 12 hours).
+
ratiovalidyieldedminutes would be 0.5 for both cases, a (the first) and b (the second), hinting that the missing circumstances are similar. However, ratiovalidyieldedhours would be 0.5 for a and 1.0 for b if [MINUTE_RATIO_THRESHOLD_FOR_VALID_YIELDED_HOURS] is between 0.0 and 0.49 (hinting that the missing circumstances might be more favorable for b). In other words, sensed data for b is more evenly spread compared to a.
Sensor parameters description for [PHONE_LOCATIONS]:
@@ -2193,6 +2237,7 @@ Home is calculated using all location data of a participant between 12 am and 6
+
@@ -2272,10 +2317,10 @@ Home is calculated using all location data of a participant between 12 am and 6