The following documents the RAPIDS metrics settings in the configuration file.

.. _sensor-list:

- ``SENSORS`` - This variable stores the names of the sensor datasets that are pulled from the AWARE_ database. These are the names of the database tables in which the data is stored. See the SENSORS_ variable in the ``config`` file.

.. _fitbit-table:

- ``FITBIT_TABLE`` - The name of the Fitbit database.

.. _fitbit-sensors:

- ``FITBIT_SENSORS`` - The list of sensors to be pulled from the Fitbit database.

.. _pid:

- ``PID`` - The list of participant IDs included in the analysis. Remember that you must create a file named ``pXXX`` for each participant in the ``data/external`` directory containing their ``device_id`` (see installation :ref:`step 8 <install-step-8>`).

.. _day-segments:

- ``DAY_SEGMENTS`` - The list of common day segments (time frequency/checkpoints) over which data will be analyzed. See DAY_SEGMENTS_ in the ``config`` file.

.. _timezone:

- ``TIMEZONE`` - The timezone of the server. Use the timezone names from this `List of Timezones`_. Double-check your timezone code; for example, ``EST`` is not US Eastern Time.

.. _database_group:

- ``DATABASE_GROUP`` - The name of the research project database.

.. _download-dataset:

- ``DOWNLOAD_DATASET`` - The name of the dataset for the research project.

.. _readable-datetime:

- ``READABLE_DATETIME`` - Readable datetime configuration. Defines the format of the readable date and time.

.. _phone-valid-sensed-days:

- ``PHONE_VALID_SENSED_DAYS`` - Specifies ``BIN_SIZE``, ``MIN_VALID_HOURS``, and ``MIN_BINS_PER_HOUR``. ``BIN_SIZE`` is the length, in minutes, of the time bins into which the data is aggregated. ``MIN_VALID_HOURS`` is the minimum number of hours in which data must be gathered within a 24-hour period (a day). Finally, ``MIN_BINS_PER_HOUR`` specifies the minimum number of bins that must be captured per hour, out of the total number of bins possible in an hour, i.e. out of 60min/``BIN_SIZE`` bins. See PHONE_VALID_SENSED_DAYS_ in the ``config`` file.

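
The interplay of the three ``PHONE_VALID_SENSED_DAYS`` values can be illustrated with a minimal Python sketch. It is not the RAPIDS implementation: the function name, the input shape (a list of minutes since midnight at which data was sensed), and the example values are all assumptions made for illustration. It also assumes that an hour counts as valid when at least ``MIN_BINS_PER_HOUR`` of its bins contain data, and that a day counts as valid when at least ``MIN_VALID_HOURS`` hours are valid.

.. code-block:: python

    from collections import defaultdict


    def is_valid_sensed_day(minutes_with_data, bin_size, min_valid_hours, min_bins_per_hour):
        """Return True when a day meets both thresholds (hypothetical helper).

        minutes_with_data: minutes since midnight (0-1439) at which at least
        one row of data was sensed. This input shape is an assumption.
        """
        bins_seen = defaultdict(set)  # hour -> indexes of the bins that contain data
        for minute in minutes_with_data:
            bins_seen[minute // 60].add((minute % 60) // bin_size)
        # An hour is valid when enough of its 60 / bin_size bins contain data.
        valid_hours = sum(1 for bins in bins_seen.values() if len(bins) >= min_bins_per_hour)
        return valid_hours >= min_valid_hours


    # Illustrative values only: 5-minute bins give 60 / 5 = 12 possible bins per hour.
    # Data is present in 9 of the 12 bins for each of the first 20 hours of the day.
    sensed = [hour * 60 + minute for hour in range(20) for minute in range(0, 45, 5)]
    lenient = is_valid_sensed_day(sensed, bin_size=5, min_valid_hours=16, min_bins_per_hour=8)
    strict = is_valid_sensed_day(sensed, bin_size=5, min_valid_hours=21, min_bins_per_hour=8)
    print(lenient, strict)

With these example values the day passes the lenient setting (20 valid hours against a minimum of 16) and fails the strict one (a minimum of 21).
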

.. _individual-sensor-settings:

List of Individual Sensors and Their Settings
---------------------------------------------

.. _sms-sensor-doc:

SMS
"""""

See `SMS Config Code`_

**Available Epochs:**

- daily
- morning
- afternoon
- evening
- night

**Available Platforms:**

- Android

**Snakefile Entry:**

- Download raw SMS dataset: ``expand("data/raw/{pid}/{sensor}_raw.csv", pid=config["PIDS"], sensor=config["SENSORS"]),``
- Download raw SMS dataset with readable datetime: ``expand("data/raw/{pid}/{sensor}_with_datetime.csv", pid=config["PIDS"], sensor=config["SENSORS"]),``
- ``rules/preprocessing.snakefile/download_dataset`` - See the download_dataset_ rule.
- ``src/data/download_dataset.R`` - See the download_dataset.R_ script.
- ``rules/preprocessing.snakefile/readable_datetime`` - See the readable_datetime_ rule.
- ``src/data/readable_datetime.R`` - See the readable_datetime.R_ script.
- ``rules/features.snakefile/call_metrics`` - See the call_metrics_ rule.
- ``src/features/call_metrics.R`` - See the call_metrics.R_ script.
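
To see which files an ``expand(...)`` Snakefile entry like the ones above requests, the following plain-Python sketch mimics the default behaviour of Snakemake's ``expand``, which fills the pattern with every combination of the wildcard values. ``expand_paths`` is a simplified stand-in written for this example, not Snakemake's own function, and the participant and sensor names are hypothetical.

.. code-block:: python

    from itertools import product


    def expand_paths(pattern, **wildcards):
        """Simplified stand-in for Snakemake's expand(): fill the pattern with
        every combination of the given wildcard values."""
        names = list(wildcards)
        combos = product(*(wildcards[name] for name in names))
        return [pattern.format(**dict(zip(names, combo))) for combo in combos]


    # Hypothetical config values, for illustration only.
    config = {"PIDS": ["p01", "p02"], "SENSORS": ["messages"]}

    raw_files = expand_paths(
        "data/raw/{pid}/{sensor}_raw.csv",
        pid=config["PIDS"],
        sensor=config["SENSORS"],
    )
    for path in raw_files:
        print(path)

Each participant and sensor pair yields one target file, so adding a participant ID or a sensor name to the configuration adds the corresponding files to the workflow.
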

.. _calls-parameters:

**Sensor Rule Parameters:**

============    ===================
Name            Description
============    ===================
call_type       The particular ``call_type`` that will be analyzed. The options for this parameter are ``incoming``, ``outgoing``, or ``missed``.
day_segment     The particular ``day_segment`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, ``evening``, and ``night``.
metrics         The different measures that can be retrieved from the calls dataset. Note that the same metrics are available for both ``incoming`` and ``outgoing`` calls, while ``missed`` calls have their own set of metrics. See the :ref:`Available Incoming and Outgoing Call Metrics <available-in-and-out-call-metrics>` table and the :ref:`Available Missed Call Metrics <available-missed-call-metrics>` table below.
============    ===================

.. _available-in-and-out-call-metrics:

**Available Incoming and Outgoing Call Metrics**

The following table shows a list of the available metrics for ``incoming`` and ``outgoing`` calls.

=========================    =========    =============
Name                         Units        Description
=========================    =========    =============
count                        calls        The number of times that a particular ``call_type`` occurred for a particular ``day_segment``.
distinctcontacts             contacts     The number of distinct contacts that were communicated with for a particular ``call_type`` and ``day_segment``.
meanduration                 minutes      The mean duration of all calls for a particular ``call_type`` and ``day_segment``.
sumduration                  minutes      The sum of the duration of all calls for a particular ``call_type`` and ``day_segment``.
minduration                  minutes      The duration of the shortest call for a particular ``call_type`` and ``day_segment``.
maxduration                  minutes      The duration of the longest call for a particular ``call_type`` and ``day_segment``.
stdduration                  minutes      The standard deviation of the duration of all calls for a particular ``call_type`` and ``day_segment``.
modeduration                 minutes      The mode of the duration of all calls for a particular ``call_type`` and ``day_segment``.
hubermduration                            The generalized Huber M-estimator of location (with MAD scale) of the durations of all calls for a particular ``call_type`` and ``day_segment``.
varqnduration                             The location-free scale estimator Qn of the durations of all calls for a particular ``call_type`` and ``day_segment``.
entropyduration                           The estimate of the Shannon entropy H of the durations of all calls for a particular ``call_type`` and ``day_segment``.
timefirstcall                minutes      The time in minutes from 12:00am (midnight) at which the first call of ``call_type`` occurred.
timelastcall                 minutes      The time in minutes from 12:00am (midnight) at which the last call of ``call_type`` occurred.
countmostfrequentcontact     calls        The number of calls of a particular ``call_type`` and ``day_segment`` for the most contacted contact.
=========================    =========    =============

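
As a rough illustration of how several of the metrics in the table above relate to the underlying call records, here is a minimal Python sketch. It is not the code in ``call_metrics.R``: the function name and the record layout (contact, duration in minutes, minutes since midnight) are assumptions, and the records are expected to be already filtered to one ``call_type`` and one ``day_segment``. The robust statistics (``hubermduration``, ``varqnduration``, ``entropyduration``) are left out.

.. code-block:: python

    from collections import Counter
    from statistics import mean


    def call_metrics(calls):
        """Compute a subset of the call metrics (hypothetical helper).

        calls: list of (contact, duration_minutes, minutes_since_midnight)
        tuples for a single call_type and day_segment.
        """
        if not calls:
            return {"count": 0}
        contacts = [contact for contact, _, _ in calls]
        durations = [duration for _, duration, _ in calls]
        times = [time for _, _, time in calls]
        return {
            "count": len(calls),
            "distinctcontacts": len(set(contacts)),
            "meanduration": mean(durations),
            "sumduration": sum(durations),
            "minduration": min(durations),
            "maxduration": max(durations),
            "timefirstcall": min(times),
            "timelastcall": max(times),
            # Number of calls exchanged with the most contacted contact.
            "countmostfrequentcontact": Counter(contacts).most_common(1)[0][1],
        }


    # Made-up records: two calls with "alice" and one with "bob".
    example_calls = [("alice", 2.0, 545), ("bob", 10.0, 610), ("alice", 3.0, 1200)]
    metrics = call_metrics(example_calls)
    print(metrics)
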

.. _available-missed-call-metrics:

**Available Missed Call Metrics**

The following table shows a list of the available metrics for ``missed`` calls.

=========================    =========    =============
Name                         Units        Description
=========================    =========    =============
count                        calls        The number of times a ``missed`` call occurred for a particular ``day_segment``.
distinctcontacts             contacts     The number of distinct contacts whose calls were ``missed``.
timefirstcall                minutes      The time in minutes from 12:00am (midnight) at which the first ``missed`` call occurred.
timelastcall                 minutes      The time in minutes from 12:00am (midnight) at which the last ``missed`` call occurred.
countmostfrequentcontact     calls        The number of ``missed`` calls from the contact with the most ``missed`` calls.
=========================    =========    =============

Assumptions/Observations:

#. ``TYPES`` and ``METRICS`` keys need to match. For example::
- ``rules/preprocessing.snakefile/download_dataset`` - See the download_dataset_ rule.
- ``src/data/download_dataset.R`` - See the download_dataset.R_ script.
- ``rules/preprocessing.snakefile/readable_datetime`` - See the readable_datetime_ rule.
- ``src/data/readable_datetime.R`` - See the readable_datetime.R_ script.
- ``rules/features.snakefile/bluetooth_metrics`` - See the bluetooth_metric_ rule.
- ``src/features/bluetooth_metrics.R`` - See the bluetooth_metrics.R_ script.

.. _bluetooth-parameters:

**Bluetooth Rule Parameters:**

============    ===================
Name            Description
============    ===================
day_segment     The particular ``day_segment`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, ``evening``, and ``night``.
metrics         The different measures that can be retrieved from the Bluetooth dataset. See the :ref:`Available Bluetooth Metrics <bluetooth-available-metrics>` table below.
============    ===================

.. _bluetooth-available-metrics:

**Available Bluetooth Metrics**

The following table shows a list of the available metrics for Bluetooth.

Barnett’s location features are based on the concept of flights and pauses. GPS coordinates are converted into a sequence of flights (straight-line movements) and pauses (time spent stationary). Data is imputed before metrics are computed (https://arxiv.org/abs/1606.06328).

- Time at home. Time spent at home in minutes. Home is the most visited significant location between 8 pm and 8 am, including any pauses within a 200-meter radius.
- Max home distance. Maximum distance from home in meters.
- Pause probability. The fraction of a day spent in a pause (as opposed to a flight).
- Circadian routine. A continuous metric that can take any value between 0 and 1, where 0 represents a daily routine completely different from any other sensed day and 1 a routine the same as every other sensed day.
- Wkn circadian routine. Same as Circadian routine but computed separately for weekends and weekdays.
- Distance travelled. Total distance travelled over a day.
- Radius of Gyration (RoG). A measure in meters of the area covered by a person over a day. A centroid is calculated for all the places (pauses) visited during a day, and a weighted distance between all places and the centroid is computed. The weights are proportional to the time spent in each place.
- Maximum diameter. Largest distance in meters between any two pauses.
- Avg flight duration. Mean duration of all flights.
- Avg flight length. Mean length of all flights.
- Std flight duration. The standard deviation of the duration of all flights.
- Std flight length. The standard deviation of the length of all flights.
- Significant locations. The number of significant locations visited during the day. Significant locations are computed using k-means clustering over the pauses found in the whole monitoring period. The number of clusters is found by iterating from 1 to 200 and stopping when the centroids of two significant locations are within 400 meters of one another.
- Significant location entropy. Entropy measurement based on the proportion of time spent at each significant location visited during a day.
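
Two of the location features above, the radius of gyration and the significant location entropy, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Barnett's or RAPIDS' implementation: pauses are given as planar coordinates in meters (already projected from latitude and longitude), the centroid is weighted by time, the "weighted distance" is taken to be the root of the time-weighted mean squared distance, and the entropy uses the natural logarithm. The function names are hypothetical.

.. code-block:: python

    import math


    def radius_of_gyration(pauses):
        """pauses: list of (x_m, y_m, minutes) tuples in planar meters (assumed)."""
        total = sum(minutes for _, _, minutes in pauses)
        if total <= 0:
            return float("nan")
        weights = [minutes / total for _, _, minutes in pauses]
        # Time-weighted centroid of all pauses.
        cx = sum(w * x for w, (x, _, _) in zip(weights, pauses))
        cy = sum(w * y for w, (_, y, _) in zip(weights, pauses))
        # Root of the time-weighted mean squared distance to the centroid.
        spread = sum(w * ((x - cx) ** 2 + (y - cy) ** 2) for w, (x, y, _) in zip(weights, pauses))
        return math.sqrt(spread)


    def location_entropy(minutes_per_location):
        """Shannon entropy (natural log) of the share of time spent at each location."""
        total = sum(minutes_per_location)
        shares = [minutes / total for minutes in minutes_per_location if minutes > 0]
        return -sum(share * math.log(share) for share in shares)


    # Made-up day: one hour at each of two places that are 100 meters apart.
    rog = radius_of_gyration([(0.0, 0.0, 60), (100.0, 0.0, 60)])
    entropy = location_entropy([60, 60])
    print(rog, entropy)

Under these assumptions, spending equal time at two places gives a radius of gyration equal to half the distance between them, and the entropy grows as time is spread more evenly across more significant locations.
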
Available epochs: daily, morning, afternoon, evening, and night
Notes. An unlock episode is the time between an unlock event and the following lock event. iOS records these episodes reliably (albeit with duplicated lock events within milliseconds of each other). On Android, however, there can be multiple consecutive unlock/lock events, so we keep the closest pair. This happens because ``ACTION_SCREEN_OFF`` and ``ACTION_SCREEN_ON`` are "sent when the device becomes non-interactive which may have nothing to do with the screen turning off" (see the Android documentation for these broadcast actions).
- Count on: count of screen on events (only available for Android)
- Count unlock: count of screen unlock events
- Diff count on off: for debugging purposes. On and off events should come in pairs, so the difference should be close to zero.
- Diff count unlock lock: for debugging purposes. Unlock and lock events should come in pairs, so the difference should be close to zero.
- Sum duration unlock: sum duration of unlock episodes
- Max duration unlock: maximum duration of unlock episodes
- Min duration unlock: minimum duration of unlock episodes
- Average duration unlock: average duration of unlock episodes
- Std duration unlock: standard deviation of the duration of unlock episodes
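
The "closest pair" rule described in the notes above can be sketched as follows. This is a hedged illustration rather than the RAPIDS code: the function name and the event format (timestamp in seconds plus an event kind) are assumptions, and the rule is interpreted as pairing each lock event with the most recent unlock event that precedes it while ignoring duplicated lock events.

.. code-block:: python

    def unlock_episodes(events):
        """Pair unlock and lock events into episodes (hypothetical helper).

        events: list of (timestamp_seconds, kind) tuples sorted by time,
        where kind is "unlock" or "lock".
        """
        episodes = []
        last_unlock = None
        for timestamp, kind in events:
            if kind == "unlock":
                # A later unlock replaces an earlier one, so the pair kept
                # is the one closest to the following lock event.
                last_unlock = timestamp
            elif kind == "lock" and last_unlock is not None:
                episodes.append((last_unlock, timestamp))
                last_unlock = None  # duplicated lock events are ignored
        return episodes


    # Made-up events: a repeated unlock and a duplicated lock.
    events = [
        (0.0, "unlock"), (5.0, "unlock"), (20.0, "lock"), (20.001, "lock"),
        (100.0, "unlock"), (160.0, "lock"),
    ]
    episodes = unlock_episodes(events)
    durations = [end - start for start, end in episodes]
    summary = {
        "countunlock": len(episodes),
        "sumdurationunlock": sum(durations),
        "maxdurationunlock": max(durations),
        "mindurationunlock": min(durations),
        "avgdurationunlock": sum(durations) / len(durations),
    }
    print(summary)

The keys in ``summary`` are illustrative names for the count, sum, maximum, minimum, and average metrics listed above.
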
Available epochs: daily, morning, afternoon, evening, and night
Notes. Heart rate data is divided into 4 zones: out_of_range, fat_burn, cardio, and peak. Please refer to the `Fitbit documentation <https://help.fitbit.com/articles/en_US/Help_article/1565>`_ for details on how these zones are defined.
- Max hr: maximum heart rate
- Min hr: minimum heart rate
- Avg hr: average heart rate
- Median hr: median heart rate
- Mode hr: mode heart rate
- Std hr: standard deviation of heart rate
- Diff max mode hr: maximum heart rate minus mode heart rate
- Diff min mode hr: mode heart rate minus minimum heart rate
- Entropy hr: entropy of heart rate
- Length out of range: duration, in minutes, of heart rate in the out_of_range zone
- Length fat burn: duration, in minutes, of heart rate in the fat_burn zone
- Length cardio: duration, in minutes, of heart rate in the cardio zone
- Length peak: duration, in minutes, of heart rate in the peak zone
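
A minimal Python sketch of several of the heart rate metrics above follows. It is an illustration only: the function name, the metric keys, and the input format are assumptions. In particular it assumes one heart rate sample per minute, each already labelled with its zone, so that the length of a zone is simply the number of samples in it.

.. code-block:: python

    from collections import Counter
    from statistics import mean, median, mode


    def heartrate_metrics(samples):
        """samples: list of (bpm, zone) pairs, assumed to be one per minute."""
        bpm = [value for value, _ in samples]
        minutes_in_zone = Counter(zone for _, zone in samples)
        mode_hr = mode(bpm)
        return {
            "maxhr": max(bpm),
            "minhr": min(bpm),
            "avghr": mean(bpm),
            "medianhr": median(bpm),
            "modehr": mode_hr,
            "diffmaxmodehr": max(bpm) - mode_hr,
            "diffminmodehr": mode_hr - min(bpm),
            "lengthoutofrange": minutes_in_zone["out_of_range"],
            "lengthfatburn": minutes_in_zone["fat_burn"],
            "lengthcardio": minutes_in_zone["cardio"],
            "lengthpeak": minutes_in_zone["peak"],
        }


    # Made-up samples, one per minute.
    samples = [
        (60, "out_of_range"), (60, "out_of_range"),
        (95, "fat_burn"), (130, "cardio"), (160, "peak"),
    ]
    hr = heartrate_metrics(samples)
    print(hr)
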
Available epochs: daily, morning, afternoon, evening, and night
Notes. If the step count per minute is smaller than ``THRESHOLD_ACTIVE_BOUT`` (default value is 10), that minute is defined as sedentary. Otherwise, it is defined as active. An active/sedentary bout is a period during which the user is continuously in active/sedentary status.
- Sum all steps: total step count
- Max all steps: maximum step count
- Min all steps: minimum step count
- Avg all steps: average step count
- Std all steps: standard deviation of step count
- Count sedentary bout: count of sedentary bouts
- Max duration sedentary bout: maximum duration of sedentary bouts
- Min duration sedentary bout: minimum duration of sedentary bouts
- Avg duration sedentary bout: average duration of sedentary bouts
- Std duration sedentary bout: standard deviation of the duration of sedentary bouts
- Count active bout: count of active bouts
- Max duration active bout: maximum duration of active bouts
- Min duration active bout: minimum duration of active bouts
- Avg duration active bout: average duration of active bouts
- Std duration active bout: standard deviation of the duration of active bouts
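
The bout definition in the notes above can be sketched in Python by labelling each minute and grouping consecutive minutes that share a label. This is a sketch under assumptions, not the RAPIDS implementation: the function name and the input (a list of per-minute step counts with no gaps) are hypothetical.

.. code-block:: python

    from itertools import groupby


    def find_bouts(steps_per_minute, threshold_active_bout=10):
        """Return a list of (status, duration_in_minutes) bouts.

        A minute is sedentary when its step count is below the threshold
        and active otherwise; consecutive minutes with the same status
        form one bout.
        """
        labels = (
            "active" if steps >= threshold_active_bout else "sedentary"
            for steps in steps_per_minute
        )
        return [(status, sum(1 for _ in run)) for status, run in groupby(labels)]


    # Made-up step counts for nine consecutive minutes.
    steps = [0, 3, 12, 15, 20, 2, 0, 0, 11]
    bouts = find_bouts(steps)
    sedentary = [minutes for status, minutes in bouts if status == "sedentary"]
    active = [minutes for status, minutes in bouts if status == "active"]
    print(bouts)
    print(len(sedentary), max(sedentary), min(sedentary))
    print(len(active), max(active), min(active))

The count, maximum, minimum, average, and standard deviation metrics listed above are then plain summaries of the ``sedentary`` and ``active`` duration lists.
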