.. _rapids_features:
RAPIDS Features
===============
Global Parameters
"""""""""""""""""
.. _sensor-list:
- ``SENSORS`` - List of sensors to include in the pipeline. Each name must match an existing table in your AWARE_ database. See the SENSORS_ variable in the ``config`` file.
.. _fitbit-table:
- ``FITBIT_TABLE`` - The table in your database that contains your Fitbit data in JSON format, in a field named ``fitbit_data``.
.. _fitbit-sensors:
- ``FITBIT_SENSORS`` - The list of sensors to be parsed from the Fitbit table: ``heartrate``, ``steps``, ``sleep``.
.. _pid:
- ``PIDS`` - The list of participant ids to be included in the analysis. These should match the names of the files created in the ``data/external`` directory (:ref:`see more details<db-configuration>`).
.. _day-segments:
- ``DAY_SEGMENTS`` - The list of day epochs that features can be segmented into: ``daily``, ``morning`` (6am-12pm), ``afternoon`` (12pm-6pm), ``evening`` (6pm-12am) and ``night`` (12am-6am). This list can be modified globally or on a per-sensor basis. See DAY_SEGMENTS_ in the ``config`` file.
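
The epoch boundaries listed for ``DAY_SEGMENTS`` can be sketched as a small lookup. This is only an illustration: the function name ``day_segment`` is hypothetical, and it assumes each boundary hour belongs to the segment that starts at it (for example, 6:00am counts as ``morning``).

```python
def day_segment(hour):
    """Map an hour of the day (0-23) to one of the four non-daily epochs."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    if hour < 6:
        return "night"      # 12am-6am
    if hour < 12:
        return "morning"    # 6am-12pm
    if hour < 18:
        return "afternoon"  # 12pm-6pm
    return "evening"        # 6pm-12am
```

Every row also belongs to the ``daily`` epoch, which spans the whole day.
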
.. _timezone:
- ``TIMEZONE`` - The time zone where data was collected. Use the time zone names from this `List of Timezones`_. Double-check that your chosen name is correct; for example, US Eastern Time is called ``America/New_York``, not ``EST``.
.. _database_group:
- ``DATABASE_GROUP`` - The name of your database credentials group. It should match the one in ``.env`` (:ref:`see the database configuration<db-configuration>`).
.. _download-dataset:
- ``DOWNLOAD_DATASET``

  - ``GROUP``. Credentials group to connect to the database containing ``SENSORS``. By default it points to ``DATABASE_GROUP``.
.. _readable-datetime:
- ``READABLE_DATETIME`` - Configuration to convert UNIX timestamps into readable date-time strings.

  - ``FIXED_TIMEZONE``. See ``TIMEZONE`` above. This assumes that all data of all participants was collected within one time zone.
  - Support for multiple time zones per participant, based on the ``timezone`` table collected by Aware, is coming soon.
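
As a rough illustration of what a fixed-time-zone conversion involves, the sketch below turns a UNIX timestamp into a readable local date-time string using only the Python standard library. The function name ``to_readable_datetime`` is hypothetical, and the assumption that timestamps are stored in milliseconds should be checked against your own tables; this is not the code RAPIDS runs.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def to_readable_datetime(timestamp_ms, tz_name):
    """Convert a UNIX timestamp (assumed to be in milliseconds) to local time."""
    utc_time = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
    local_time = utc_time.astimezone(ZoneInfo(tz_name))
    return local_time.strftime("%Y-%m-%d %H:%M:%S")


# A time zone name such as "America/New_York" handles daylight saving time,
# which an abbreviation like "EST" would not.
print(to_readable_datetime(1583784477000, "America/New_York"))
```
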
.. _phone-valid-sensed-days:
- ``PHONE_VALID_SENSED_DAYS``.

  Contains three attributes: ``BIN_SIZE``, ``MIN_VALID_HOURS``, ``MIN_BINS_PER_HOUR``.

  On any given day, Aware could have sensed data only for a few minutes or for 24 hours. Daily estimates of features should be considered more reliable the more hours Aware was running and logging data (for example, 10 calls logged on a day when only one hour of data was recorded is a less reliable measurement than 10 calls on a day when 23 hours of data were recorded).

  Therefore, we define a valid hour as one that contains at least a certain number of valid bins. In turn, a valid bin is one that contains at least one row of data from any sensor logged within that period. We divide an hour into N bins of size ``BIN_SIZE`` (in minutes) and mark an hour as valid if it contains at least ``MIN_BINS_PER_HOUR`` valid bins (out of the total number of bins that fit in an hour, i.e. 60min/``BIN_SIZE`` bins). Days with fewer than ``MIN_VALID_HOURS`` valid sensed hours will be excluded from the output of this file. See PHONE_VALID_SENSED_DAYS_ in ``config.yaml``.

  In RAPIDS, we use ``phone_sensed_bins`` (a list of all valid and invalid bins of all monitored days) to improve the estimation of features that are ratios over time periods, like ``episodepersensedminutes`` of :ref:`Screen<screen-sensor-doc>`, or for resampling data like fused location coordinates.
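
The valid bin, valid hour and valid day definitions above can be sketched as follows. This is a minimal illustration, not the RAPIDS implementation: the function name ``valid_sensed_days`` and the input format (one ``(day, minute_of_day)`` pair per sensed row from any sensor) are assumptions made for the example.

```python
from collections import defaultdict


def valid_sensed_days(rows, bin_size, min_bins_per_hour, min_valid_hours):
    """Return the days that have at least `min_valid_hours` valid hours.

    rows: iterable of (day, minute_of_day) pairs, one per sensed row.
    """
    # (day, hour) -> indexes of the bins in that hour holding at least one row
    valid_bins = defaultdict(set)
    for day, minute_of_day in rows:
        hour, minute = divmod(minute_of_day, 60)
        valid_bins[(day, hour)].add(minute // bin_size)

    # An hour is valid when it has enough valid bins
    valid_hours = defaultdict(int)
    for (day, _hour), bins in valid_bins.items():
        if len(bins) >= min_bins_per_hour:
            valid_hours[day] += 1

    # A day is kept when it has enough valid hours
    return sorted(day for day, hours in valid_hours.items() if hours >= min_valid_hours)
```

With ``bin_size=5`` an hour holds 12 bins, so ``min_bins_per_hour=6`` would require data in at least half of them.
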
.. _individual-sensor-settings:
.. _sms-sensor-doc:
SMS
"""""
See `SMS Config Code`_
**Available Epochs (day_segment) :** daily, morning, afternoon, evening, night
**Available Platforms:** Android
**Snakefile entry to compute these features:**

| ``expand("data/processed/{pid}/sms_{sms_type}_{day_segment}.csv",``
| ``pid=config["PIDS"],``
| ``sms_type = config["SMS"]["TYPES"],``
| ``day_segment = config["SMS"]["DAY_SEGMENTS"]),``
**Rule Chain:**
- Rule ``rules/preprocessing.snakefile/download_dataset``
- Rule ``rules/preprocessing.snakefile/readable_datetime``
- Rule ``rules/features.snakefile/sms_features``
.. _sms-parameters:
**SMS Rule Parameters (sms_features):**

============  ===================
Name          Description
============  ===================
sms_type      The particular ``sms_type`` that will be analyzed. The options for this parameter are ``received`` or ``sent``.
day_segment   The particular ``day_segment`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, ``evening``, ``night``
features      Features to be computed, see table below
============  ===================

.. _sms-available-features:

**Available SMS Features**

=========================  =========  =============
Name                       Units      Description
=========================  =========  =============
count                      SMS        Number of SMS of type ``sms_type`` that occurred during a particular ``day_segment``.
distinctcontacts           contacts   Number of distinct contacts that are associated with a particular ``sms_type`` during a particular ``day_segment``.
timefirstsms               minutes    Number of minutes between 12:00am (midnight) and the first ``SMS`` of a particular ``sms_type``.
timelastsms                minutes    Number of minutes between 12:00am (midnight) and the last ``SMS`` of a particular ``sms_type``.
countmostfrequentcontact   SMS        Number of ``SMS`` messages from the contact with the most messages of ``sms_type`` during a ``day_segment`` throughout the whole dataset of each participant.
=========================  =========  =============

**Assumptions/Observations:**

``TYPES`` and ``FEATURES`` keys in ``config.yaml`` need to match. For example, below the ``TYPES`` entry ``sent`` matches the ``FEATURES`` key ``sent``::

    SMS:
        TYPES: [sent]
        FEATURES:
            sent: [count, distinctcontacts, timefirstsms, timelastsms, countmostfrequentcontact]

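
A quick way to catch a mismatch between ``TYPES`` and ``FEATURES`` before running the pipeline is to compare both sets of keys. The helper below is a hypothetical sketch that works on a sensor section already loaded as a Python dictionary; it is not part of RAPIDS.

```python
def check_types_match_features(sensor_config):
    """Return (types without features, feature keys without a type)."""
    types = set(sensor_config["TYPES"])
    feature_keys = set(sensor_config["FEATURES"])
    return sorted(types - feature_keys), sorted(feature_keys - types)


sms_config = {
    "TYPES": ["sent", "received"],
    "FEATURES": {"sent": ["count", "distinctcontacts"]},
}
missing_features, unused_features = check_types_match_features(sms_config)
print(missing_features)  # types that have no FEATURES entry
```

Both lists are empty when every type has a matching ``FEATURES`` key and vice versa.
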
.. _call-sensor-doc:
Calls
""""""
See `Call Config Code`_
**Available Epochs (day_segment) :** daily, morning, afternoon, evening, night
**Available Platforms:** Android and iOS
**Snakefile entry to compute these features:**
| ``expand("data/processed/{pid}/call_{call_type}_{segment}.csv",``
| ``pid=config["PIDS"],``
| ``call_type=config["CALLS"]["TYPES"],``
| ``segment = config["CALLS"]["DAY_SEGMENTS"]),``
**Rule Chain:**
- Rule ``rules/preprocessing.snakefile/download_dataset``
- Rule ``rules/preprocessing.snakefile/readable_datetime``
- Rule ``rules/features.snakefile/call_features``
.. _calls-parameters:
**Call Rule Parameters (call_features):**

============  ===================
Name          Description
============  ===================
call_type     The particular ``call_type`` that will be analyzed. The options for this parameter are ``incoming``, ``outgoing`` or ``missed``.
day_segment   The particular ``day_segment`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, ``evening``, ``night``
features      Features to be computed. Note that the same features are available for both ``incoming`` and ``outgoing`` calls, while ``missed`` calls have their own set of features. See :ref:`Available Incoming and Outgoing Call Features <available-in-and-out-call-features>` Table and :ref:`Available Missed Call Features <available-missed-call-features>` Table below.
============  ===================

.. _available-in-and-out-call-features:
**Available Incoming and Outgoing Call Features**

=========================  =========  =============
Name                       Units      Description
=========================  =========  =============
count                      calls      Number of calls of a particular ``call_type`` that occurred during a particular ``day_segment``.
distinctcontacts           contacts   Number of distinct contacts that are associated with a particular ``call_type`` for a particular ``day_segment``.
meanduration               seconds    The mean duration of all calls of a particular ``call_type`` during a particular ``day_segment``.
sumduration                seconds    The sum of the duration of all calls of a particular ``call_type`` during a particular ``day_segment``.
minduration                seconds    The duration of the shortest call of a particular ``call_type`` during a particular ``day_segment``.
maxduration                seconds    The duration of the longest call of a particular ``call_type`` during a particular ``day_segment``.
stdduration                seconds    The standard deviation of the duration of all the calls of a particular ``call_type`` during a particular ``day_segment``.
modeduration               seconds    The mode of the duration of all the calls of a particular ``call_type`` during a particular ``day_segment``.
entropyduration            nats       The estimate of the Shannon entropy for the duration of all the calls of a particular ``call_type`` during a particular ``day_segment``.
timefirstcall              minutes    The time in minutes between 12:00am (midnight) and the first call of ``call_type``.
timelastcall               minutes    The time in minutes between 12:00am (midnight) and the last call of ``call_type``.
countmostfrequentcontact   calls      The number of calls of a particular ``call_type`` during a particular ``day_segment`` of the most frequent contact throughout the monitored period.
=========================  =========  =============

.. _available-missed-call-features:
**Available Missed Call Features**

=========================  =========  =============
Name                       Units      Description
=========================  =========  =============
count                      calls      Number of ``missed`` calls that occurred during a particular ``day_segment``.
distinctcontacts           contacts   Number of distinct contacts that are associated with ``missed`` calls for a particular ``day_segment``.
timefirstcall              hours      The time in hours from 12:00am (midnight) that the first ``missed`` call occurred.
timelastcall               hours      The time in hours from 12:00am (midnight) that the last ``missed`` call occurred.
countmostfrequentcontact   calls      The number of ``missed`` calls during a particular ``day_segment`` of the most frequent contact throughout the monitored period.
=========================  =========  =============

**Assumptions/Observations:**

``TYPES`` and ``FEATURES`` keys in ``config.yaml`` need to match. For example, below the ``TYPES`` entry ``missed`` matches the ``FEATURES`` key ``missed``::

    CALLS:
        TYPES: [missed]
        FEATURES:
            missed: [count, distinctcontacts, timefirstcall, timelastcall, countmostfrequentcontact]

The Aware Android client stores call types 1=incoming, 2=outgoing, 3=missed, while the Aware iOS client stores call status 1=incoming, 2=connected, 3=dialing, 4=disconnected. We extract iOS call types based on call status sequences: (1,2,4)=incoming=1, (3,2,4)=outgoing=2, (1,4) or (3,4)=missed=3. Sometimes (due to a possible bug in Aware) the statuses of a sequence get logged with the exact same timestamp, so 3-item sequences can appear as 2,3,4 or 3,2,4. Although iOS stores the duration of ringing/dialing stages for missed calls, we set it to 0 to match Android.
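
The iOS status-to-type mapping can be sketched as below. This is an illustrative reading of the rules above, not the RAPIDS code: the function names are hypothetical, and the classifier deliberately ignores the order of statuses inside an episode (an assumption) so that same-timestamp sequences such as 2,3,4 are still recognized.

```python
INCOMING, OUTGOING, MISSED = 1, 2, 3  # Android-style call types


def split_episodes(statuses):
    """Split a stream of iOS call statuses into episodes ending in 4 (disconnected)."""
    episodes, current = [], []
    for status in statuses:
        current.append(status)
        if status == 4:
            episodes.append(tuple(current))
            current = []
    return episodes  # trailing statuses without a closing 4 are dropped


def classify_episode(episode):
    """Map one episode of iOS statuses to an Android-style call type."""
    statuses = set(episode)
    if 2 in statuses:            # the call was connected
        if 1 in statuses:
            return INCOMING      # (1,2,4)
        if 3 in statuses:
            return OUTGOING      # (3,2,4), also logged as (2,3,4)
        return None              # unknown pattern
    if statuses & {1, 3}:
        return MISSED            # (1,4) or (3,4)
    return None


stream = [1, 2, 4, 3, 2, 4, 1, 4, 3, 4, 2, 3, 4]
call_types = [classify_episode(e) for e in split_episodes(stream)]
```
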
.. _bluetooth-sensor-doc:
Bluetooth
""""""""""
See `Bluetooth Config Code`_
**Available Epochs (day_segment) :** daily, morning, afternoon, evening, night
**Available Platforms:** Android and iOS
**Snakefile entry to compute these features:**
| ``expand("data/processed/{pid}/bluetooth_{segment}.csv",``
| ``pid=config["PIDS"],``
| ``segment = config["BLUETOOTH"]["DAY_SEGMENTS"]),``
**Snakemake rule chain:**
- Rule ``rules/preprocessing.snakefile/download_dataset``
- Rule ``rules/preprocessing.snakefile/readable_datetime``
- Rule ``rules/features.snakefile/bluetooth_features``
.. _bluetooth-parameters:
**Bluetooth Rule Parameters (bluetooth_features):**

============  ===================
Name          Description
============  ===================
day_segment   The particular ``day_segment`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, ``evening``, ``night``
features      Features to be computed, see table below
============  ===================

.. _bluetooth-available-features:
**Available Bluetooth Features**

===========================  =========  =============
Name                         Units      Description
===========================  =========  =============
countscans                   devices    Number of scanned devices during a ``day_segment``. A device can be detected multiple times over time, and these appearances are counted separately.
uniquedevices                devices    Number of unique devices during a ``day_segment`` as identified by their hardware address.
countscansmostuniquedevice   scans      Number of scans of the most scanned device during a ``day_segment`` across the whole monitoring period.
===========================  =========  =============

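
The three Bluetooth features can be illustrated with a short sketch. It assumes each scan row is reduced to the hardware address of the detected device and that the "most scanned device" is chosen over the whole monitoring period; the function name and input format are hypothetical, not the RAPIDS implementation.

```python
from collections import Counter


def bluetooth_features(segment_scans, all_scans):
    """segment_scans: addresses seen in one day segment.
    all_scans: addresses seen over the whole monitoring period."""
    most_scanned = Counter(all_scans).most_common(1)[0][0] if all_scans else None
    return {
        "countscans": len(segment_scans),            # repeated detections count separately
        "uniquedevices": len(set(segment_scans)),
        "countscansmostuniquedevice": segment_scans.count(most_scanned),
    }


all_scans = ["aa", "aa", "aa", "bb", "bb", "cc"]
features = bluetooth_features(["bb", "bb", "aa"], all_scans)
```
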
**Assumptions/Observations:** N/A
.. _accelerometer-sensor-doc:
Accelerometer
""""""""""""""
See `Accelerometer Config Code`_
**Available Epochs (day_segment) :** daily, morning, afternoon, evening, night
**Available Platforms:** Android and iOS
**Snakefile entry to compute these features:**
| ``expand("data/processed/{pid}/accelerometer_{day_segment}.csv",``
| ``pid=config["PIDS"],``
| ``day_segment = config["ACCELEROMETER"]["DAY_SEGMENTS"]),``
**Rule chain:**
- Rule ``rules/preprocessing.snakefile/download_dataset``
- Rule ``rules/preprocessing.snakefile/readable_datetime``
- Rule ``rules/features.snakefile/accelerometer_features``
.. _Accelerometer-parameters:
**Accelerometer Rule Parameters (accelerometer_features):**

============  ===================
Name          Description
============  ===================
day_segment   The particular ``day_segment`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, ``evening``, ``night``
features      Features to be computed, see table below
============  ===================

.. _accelerometer-available-features:
**Available Accelerometer Features**

====================================  ==============  =============
Name                                  Units           Description
====================================  ==============  =============
maxmagnitude                          m/s\ :sup:`2`   The maximum magnitude of acceleration (:math:`\|acceleration\| = \sqrt{x^2 + y^2 + z^2}`).
minmagnitude                          m/s\ :sup:`2`   The minimum magnitude of acceleration.
avgmagnitude                          m/s\ :sup:`2`   The average magnitude of acceleration.
medianmagnitude                       m/s\ :sup:`2`   The median magnitude of acceleration.
stdmagnitude                          m/s\ :sup:`2`   The standard deviation of the magnitude of acceleration.
ratioexertionalactivityepisodes                       The ratio of exertional activity time periods to total time periods.
sumexertionalactivityepisodes         minutes         Total duration of all exertional activity episodes during ``day_segment``.
longestexertionalactivityepisode      minutes         Duration of the longest exertional activity episode during ``day_segment``.
longestnonexertionalactivityepisode   minutes         Duration of the longest non-exertional activity episode during ``day_segment``.
countexertionalactivityepisodes       episodes        Number of exertional activity episodes during ``day_segment``.
countnonexertionalactivityepisodes    episodes        Number of non-exertional activity episodes during ``day_segment``.
====================================  ==============  =============

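
The magnitude-based features follow directly from the formula in the table. The sketch below is a minimal illustration using the Python standard library; the function name is hypothetical, and the choice of the sample standard deviation (``statistics.stdev``) is an assumption rather than a statement about RAPIDS.

```python
import math
import statistics


def magnitude_features(samples):
    """samples: list of (x, y, z) acceleration readings in m/s^2 (at least two)."""
    magnitudes = [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]
    return {
        "maxmagnitude": max(magnitudes),
        "minmagnitude": min(magnitudes),
        "avgmagnitude": statistics.mean(magnitudes),
        "medianmagnitude": statistics.median(magnitudes),
        "stdmagnitude": statistics.stdev(magnitudes),  # sample standard deviation
    }


accel = magnitude_features([(3, 4, 0), (0, 0, 5), (6, 8, 0)])
```
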
**Assumptions/Observations:**

Exertional activity episodes are based on this paper: Panda N, Solsky I, Huang EJ, et al. Using Smartphones to Capture Novel Recovery Metrics After Cancer Surgery. JAMA Surg. 2020;155(2):123–129. doi:10.1001/jamasurg.2019.4702
.. _applications-foreground-sensor-doc:
Applications Foreground
""""""""""""""""""""""""
See `Applications Foreground Config Code`_
**Available Epochs (day_segment) :** daily, morning, afternoon, evening, night
**Available Platforms:** Android
**Snakefile entry to compute these features:**
| ``expand("data/processed/{pid}/applications_foreground_{day_segment}.csv",``
| ``pid=config["PIDS"],``
| ``day_segment = config["APPLICATIONS_FOREGROUND"]["DAY_SEGMENTS"]),``
**Snakemake rule chain:**
- Rule ``rules/preprocessing.snakefile/download_dataset``
- Rule ``rules/preprocessing.snakefile/readable_datetime``
- Rule ``rules/preprocessing.snakefile/application_genres``
- Rule ``rules/features.snakefile/applications_foreground_features``
.. _applications-foreground-parameters:
**Applications Foreground Rule Parameters (applications_foreground_features):**

====================  ===================
Name                  Description
====================  ===================
day_segment           The particular ``day_segment`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, ``evening``, ``night``
single_categories     App categories to be included in the feature extraction computation. See ``APPLICATION_GENRES`` in this file to add new categories or use the catalogue we provide and read :ref:`Assumptions and Observations <applications-foreground-observations>` for more information.
multiple_categories   You can group multiple categories into meta categories, for example ``social: ["socialnetworks", "socialmediatools"]``.
single_apps           Apps to be included in the feature extraction computation. Use their package name, for example, ``com.google.android.youtube`` or the reserved word ``top1global`` (the most used app by a participant over the whole monitoring study).
excluded_categories   App categories to be excluded from the feature extraction computation. See ``APPLICATION_GENRES`` in this file to add new categories or use the catalogue we provide and read :ref:`Assumptions and Observations <applications-foreground-observations>` for more information.
excluded_apps         Apps to be excluded from the feature extraction computation. Use their package name, for example: ``com.google.android.youtube``
features              Features to be computed, see table below
====================  ===================

.. _applications-foreground-available-features:
**Available Applications Foreground Features**

==================  =========  =============
Name                Units      Description
==================  =========  =============
count               apps       Number of times a single app or apps within a category were used (i.e. they were brought to the foreground either by tapping their icon or switching to them from another app).
timeoffirstuse      minutes    The time in minutes between 12:00am (midnight) and the first use of a single app or apps within a category during a ``day_segment``.
timeoflastuse       minutes    The time in minutes between 12:00am (midnight) and the last use of a single app or apps within a category during a ``day_segment``.
frequencyentropy    nats       The entropy of the used apps within a category during a ``day_segment`` (each app is seen as a unique event, the more apps were used, the higher the entropy). This is especially relevant when computed over all apps. Entropy cannot be obtained for a single app.
==================  =========  =============

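
To make ``frequencyentropy`` more concrete, the sketch below computes a Shannon entropy in nats from the relative frequency of each app within a group of foreground events. This is one common estimator and is shown only as an illustration; whether RAPIDS uses exactly this estimator is an assumption, and the function name is hypothetical.

```python
import math
from collections import Counter


def frequency_entropy(app_events):
    """app_events: list of package names, one per foreground event."""
    counts = Counter(app_events)
    total = sum(counts.values())
    # Shannon entropy with natural logarithms, so the result is in nats
    return -sum((n / total) * math.log(n / total) for n in counts.values())


even_use = frequency_entropy(["com.app.a", "com.app.b"])
skewed_use = frequency_entropy(["com.app.a"] * 9 + ["com.app.b"])
```

Under this estimator a group with a single app always yields zero, which is consistent with the note that entropy is not informative for a single app.
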
.. _applications-foreground-observations:
**Assumptions/Observations:**
Features can be computed by app, by apps grouped under a single category (genre) and by multiple categories grouped together (meta categories). For example, we can get features for Facebook, for Social Network Apps (including Facebook and others) or for a meta category called Social formed by Social Network and Social Media Tools categories.

We provide three ways of classifying an app within a category (genre): a) by automatically scraping its official category from the Google Play Store, b) by using the catalogue created by Stachl et al. which we provide in RAPIDS (``data/external/``), or c) by manually creating a personalized catalogue.

To choose strategy a, b or c, modify the keys and values of ``APPLICATION_GENRES``. Set ``CATALOGUE_SOURCE`` to ``FILE`` if you want to use a CSV file as the catalogue (strategies b and c) or to ``GOOGLE`` if you want to scrape the genres from the Play Store (strategy a). By default, ``CATALOGUE_FILE`` points to the catalogue created by Stachl et al. (strategy b); you can change this path to your own catalogue that follows the same format (strategy c). In addition, set ``SCRAPE_MISSING_GENRES`` to true if you are using a ``FILE`` catalogue and want to scrape any missing genres from the Play Store, and set ``UPDATE_CATALOGUE_FILE`` to true if you want to save those scraped genres back into the file.

The genre catalogue we provide was shared as part of the Supplemental Materials of Stachl, C., Au, Q., Schoedel, R., Buschek, D., Völkel, S., Schuwerk, T., … Bühner, M. (2019, June 12). Behavioral Patterns in Smartphone Usage Predict Big Five Personality Traits. https://doi.org/10.31234/osf.io/ks4vd
.. _battery-sensor-doc:
Battery
"""""""""
See `Battery Config Code`_
**Available Epochs (day_segment) :** daily, morning, afternoon, evening, night
**Available Platforms:** Android and iOS
**Snakefile entry to compute these features:**
| ``expand("data/processed/{pid}/battery_{day_segment}.csv",``
| ``pid=config["PIDS"],``
| ``day_segment = config["BATTERY"]["DAY_SEGMENTS"]),``
**Snakemake rule chain:**
- Rule ``rules/preprocessing.snakefile/download_dataset``
- Rule ``rules/preprocessing.snakefile/readable_datetime``
- Rule ``rules/features.snakefile/battery_deltas``
- Rule ``rules/features.snakefile/battery_features``
.. _battery-parameters:
**Battery Rule Parameters (battery_features):**

============  ===================
Name          Description
============  ===================
day_segment   The particular ``day_segment`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, ``evening``, ``night``
features      Features to be computed, see table below
============  ===================

.. _battery-available-features:
**Available Battery Features**

=====================  ===============  =============
Name                   Units            Description
=====================  ===============  =============
countdischarge         episodes         Number of discharging episodes.
sumdurationdischarge   hours            The total duration of all discharging episodes.
countcharge            episodes         Number of battery charging episodes.
sumdurationcharge      hours            The total duration of all charging episodes.
avgconsumptionrate     episodes/hours   The average of all episodes' consumption rates. An episode's consumption rate is defined as the ratio between its battery delta and duration.
maxconsumptionrate     episodes/hours   The highest of all episodes' consumption rates. An episode's consumption rate is defined as the ratio between its battery delta and duration.
=====================  ===============  =============

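
The consumption rate definition (battery delta divided by duration) can be sketched as follows. The input format is an assumption made for the example: each discharge episode is a ``(battery_delta, duration_hours)`` pair, with the delta expressed as a positive drop in battery level. The function name is hypothetical.

```python
def consumption_rate_features(episodes):
    """episodes: list of (battery_delta, duration_hours) discharge episodes."""
    # Skip zero-length episodes to avoid dividing by zero
    rates = [delta / hours for delta, hours in episodes if hours > 0]
    if not rates:
        return {"avgconsumptionrate": None, "maxconsumptionrate": None}
    return {
        "avgconsumptionrate": sum(rates) / len(rates),
        "maxconsumptionrate": max(rates),
    }


battery = consumption_rate_features([(10, 2.0), (30, 3.0)])
```
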
**Assumptions/Observations:**

For Aware iOS client V1 we swap battery status 3 to 5 and 1 to 3; client V2 does not have this problem.

.. _activity-recognition-sensor-doc:
Activity Recognition
""""""""""""""""""""""""""""
**Available Epochs:** daily, morning, afternoon, evening, night
**Available Platforms:** Android and iOS
**Snakefile entry to compute these features:**
| ``expand("data/processed/{pid}/activity_recognition_{segment}.csv",pid=config["PIDS"],``
| ``segment = config["ACTIVITY_RECOGNITION"]["DAY_SEGMENTS"]),``
**Snakemake rule chain:**
- Rule ``rules/preprocessing.snakefile/download_dataset``
- Rule ``rules/preprocessing.snakefile/readable_datetime``
- Rule ``rules/preprocessing.snakefile/unify_ios_android``
- Rule ``rules/features.snakefile/google_activity_recognition_deltas``
- Rule ``rules/features.snakefile/ios_activity_recognition_deltas``
- Rule ``rules/features.snakefile/activity_features``
.. _activity-recognition-parameters:
**Rule Parameters (activity_features):**

============  ===================
Name          Description
============  ===================
day_segment   The particular ``day_segment`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, ``evening``, ``night``
features      Features to be computed, see table below
============  ===================

.. _activity-recognition-available-features:
**Available Activity Recognition Features**

======================  ============  =============
Name                    Units         Description
======================  ============  =============
count                   rows          Number of detected activity events (rows).
mostcommonactivity      factor        The most common activity.
countuniqueactivities   activities    Number of unique activities.
activitychangecount     transitions   Number of transitions between two different activities, for example from still to running.
sumstationary           minutes       The total duration of episodes of still and tilting (phone) activities.
summobile               minutes       The total duration of episodes of on foot, running, and on bicycle activities.
sumvehicle              minutes       The total duration of episodes of in vehicle activity.
======================  ============  =============

**Assumptions/Observations:**
iOS Activity Recognition data labels are unified with Google Activity Recognition labels: "automotive" to "in_vehicle", "cycling" to "on_bicycle", "walking" and "running" to "on_foot", "stationary" to "still". In addition, iOS activity pairs formed by "stationary" and "automotive" labels (driving but stopped at a traffic light) are transformed to "automotive" only.
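
The label unification described above can be sketched as a lookup table plus one special case. The input format (the collection of iOS labels attached to a single row) and the function name are assumptions for illustration; this is not the code of the ``unify_ios_android`` rule.

```python
IOS_TO_GOOGLE = {
    "automotive": "in_vehicle",
    "cycling": "on_bicycle",
    "walking": "on_foot",
    "running": "on_foot",
    "stationary": "still",
}


def unify_ios_labels(labels):
    """Map the iOS labels of one row to Google Activity Recognition labels."""
    labels = set(labels)
    # Driving but stopped (e.g. at a traffic light) is treated as automotive only
    if labels == {"stationary", "automotive"}:
        labels = {"automotive"}
    # Labels without a known mapping are passed through unchanged
    return sorted({IOS_TO_GOOGLE.get(label, label) for label in labels})
```
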

In AWARE, Activity Recognition data for Google (Android) and iOS are stored in two different database tables. RAPIDS (via Snakemake) automatically infers which platform each participant belongs to based on their participant file (``data/external/``), which in turn takes this information from the ``aware_device`` table (see the ``optional_ar_input`` function in ``rules/features.snakefile``).
.. _light-doc:
Light
"""""""
See `Light Config Code`_
**Available Epochs (day_segment) :** daily, morning, afternoon, evening, night
**Available Platforms:** Android
**Snakefile entry to compute these features:**
| ``expand("data/processed/{pid}/light_{day_segment}.csv",``
| ``pid=config["PIDS"],``
| ``day_segment = config["LIGHT"]["DAY_SEGMENTS"]),``
**Rule Chain:**
- **Rule:** ``rules/preprocessing.snakefile/download_dataset`` - See the download_dataset_ rule.
- **Rule:** ``rules/preprocessing.snakefile/readable_datetime`` - See the readable_datetime_ rule.
- **Rule:** ``rules/features.snakefile/light_features`` - See the light_features_ rule.
.. _light-parameters:
**Light Rule Parameters (light_features):**

============  ===================
Name          Description
============  ===================
day_segment   The particular ``day_segment`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, ``evening``, ``night``
features      Features to be computed, see table below
============  ===================

.. _light-available-features:
**Available Light Features**

===========  =========  =============
Name         Units      Description
===========  =========  =============
count        rows       Number of light sensor rows recorded.
maxlux       lux        The maximum ambient luminance.
minlux       lux        The minimum ambient luminance.
avglux       lux        The average ambient luminance.
medianlux    lux        The median ambient luminance.
stdlux       lux        The standard deviation of ambient luminance.
===========  =========  =============

**Assumptions/Observations:** N/A
.. _location-sensor-doc:
Location (Barnetts) Features
""""""""""""""""""""""""""""""
Barnett's location features are based on the concept of flights and pauses. GPS coordinates are converted into a sequence of flights (straight-line movements) and pauses (time spent stationary). Data is imputed before features are computed. See Ian Barnett, Jukka-Pekka Onnela, Inferring mobility measures from GPS traces with missing data, Biostatistics, Volume 21, Issue 2, April 2020, Pages e98–e112, https://doi.org/10.1093/biostatistics/kxy059. The code for these features was made open source by Ian Barnett (https://scholar.harvard.edu/ibarnett/software/gpsmobility).
2020-03-06 19:38:05 +01:00
See `Location (Barnetts) Config Code`_
2020-05-29 00:49:03 +02:00
**Available Epochs (day_segment) :** daily
2020-03-06 19:38:05 +01:00
2020-05-29 00:49:03 +02:00
**Available Platforms:** Android and iOS
2020-03-06 19:38:05 +01:00
2020-05-29 00:49:03 +02:00
**Snakefile entry to compute these features:**
2020-03-06 19:38:05 +01:00
2020-05-29 01:01:46 +02:00
| ``expand("data/processed/{pid}/location_barnett_{segment}.csv",``
2020-05-29 00:49:03 +02:00
| ``pid=config["PIDS"],``
| ``segment = config["BARNETT_LOCATION"]["DAY_SEGMENTS"]),``
2020-03-06 19:38:05 +01:00
2020-05-29 00:49:03 +02:00
**Snakemake rule chain:**
2020-03-06 19:38:05 +01:00
2020-05-29 00:49:03 +02:00
- Rule ``rules/preprocessing.snakefile/download_dataset``
- Rule ``rules/preprocessing.snakefile/readable_datetime``
- Rule ``rules/preprocessing.snakefile/phone_sensed_bins``
- Rule ``rules/preprocessing.snakefile/resample_fused_location`` (only relevant if setting ``location_to_use`` to ````RESAMPLE_FUSED``.
- Rule ``rules/features.snakefile/location_barnett_features``
2020-03-06 19:38:05 +01:00
.. _location-parameters:

**Location Rule Parameters (location_barnett_features):**

=================    ===================
Name                 Description
=================    ===================
location_to_use      *Read the Observations section below*. This specifies what type of location data will be used in the analysis. Possible options are ``ALL``, ``ALL_EXCEPT_FUSED`` or ``RESAMPLE_FUSED``
accuracy_limit       This is in meters. The sensor drops location coordinates with an accuracy higher than this. This number means there's a 68% probability the true location is within the specified radius.
timezone             The timezone used to calculate location.
minutes_data_used    This is NOT a feature. This is just a quality control check, and if set to TRUE, a new column is added to the output file with the number of minutes containing location data that were used to compute all features. The more data minutes exist for a period, the more reliable its features should be. For fused location, a single minute can contain more than one coordinate pair if the participant is moving fast enough.
features             Features to be computed, see table below
=================    ===================

.. _location-available-features:

**Available Location Features**

Description taken from `Beiwe Summary Statistics`_.

================    =========    =============
Name                Units        Description
================    =========    =============
hometime            minutes      Time spent at home in minutes. Home is the most visited significant location between 8 pm and 8 am including any pauses within a 200-meter radius.
disttravelled       meters       Total distance travelled over a day (flights).
rog                 meters       The Radius of Gyration (rog) is a measure in meters of the area covered by a person over a day. A centroid is calculated for all the places (pauses) visited during a day and a weighted distance between all the places and that centroid is computed. The weights are proportional to the time spent in each place.
maxdiam             meters       The maximum diameter is the largest distance between any two pauses.
maxhomedist         meters       The maximum distance from home in meters.
siglocsvisited      locations    The number of significant locations visited during the day. Significant locations are computed using k-means clustering over pauses found in the whole monitoring period. The number of clusters is found iterating k from 1 to 200, stopping when the centroids of two significant locations are within 400 meters of one another.
avgflightlen        meters       Mean length of all flights.
stdflightlen        meters       Standard deviation of the length of all flights.
avgflightdur        seconds      Mean duration of all flights.
stdflightdur        seconds      The standard deviation of the duration of all flights.
probpause                        The fraction of a day spent in a pause (as opposed to a flight).
siglocentropy       nats         Shannon's entropy measurement based on the proportion of time spent at each significant location visited during a day.
circdnrtn                        A continuous metric quantifying a person's circadian routine that can take any value between 0 and 1, where 0 represents a daily routine completely different from any other sensed days and 1 a routine the same as every other sensed day.
wkenddayrtn                      Same as circdnrtn but computed separately for weekends and weekdays.
================    =========    =============

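
To make the ``rog`` and ``siglocentropy`` descriptions more concrete, here is a minimal sketch. It assumes pauses have already been projected to planar coordinates in meters, and it reads the "weighted distance" as a time-weighted root-mean-square distance to the centroid. Both are assumptions of this example, not a description of Barnett's implementation, and the function names are hypothetical.

.. code-block:: python

    import math


    def radius_of_gyration(pauses):
        """pauses: list of (x_m, y_m, duration_s) tuples in planar meters."""
        total = sum(duration for _, _, duration in pauses)
        # Time-weighted centroid of all pauses.
        cx = sum(x * duration for x, _, duration in pauses) / total
        cy = sum(y * duration for _, y, duration in pauses) / total
        # Assumption: weights are the share of time spent in each pause.
        weighted_sq = sum(
            (duration / total) * ((x - cx) ** 2 + (y - cy) ** 2)
            for x, y, duration in pauses
        )
        return math.sqrt(weighted_sq)


    def significant_location_entropy(seconds_per_location):
        """Shannon entropy (in nats) of the time share of each location."""
        total = sum(seconds_per_location)
        shares = [s / total for s in seconds_per_location if s > 0]
        return -sum(p * math.log(p) for p in shares)


    rog = radius_of_gyration([(0.0, 0.0, 3600), (1000.0, 0.0, 3600)])
    entropy = significant_location_entropy([3600, 3600])

With equal time spent at two pauses 1000 meters apart, the centroid sits halfway between them, so ``rog`` is 500 meters and the entropy is ``ln(2)`` nats.
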
**Assumptions/Observations:**

*Types of location data to use*

Aware Android and iOS clients can collect location coordinates through the phone's GPS or Google's fused location API. If your Aware client was configured to use GPS only, set ``location_to_use`` to ``ALL``. If it was configured to use both GPS and fused location, you can use ``ALL``, or set ``location_to_use`` to ``ALL_EXCEPT_FUSED`` to ignore fused coordinates. If it was configured to use fused location only, set ``location_to_use`` to ``RESAMPLE_FUSED``. ``RESAMPLE_FUSED`` takes the original fused location coordinates and replicates each pair forward in time as long as the phone was sensing data, as indicated by ``phone_sensed_bins`` (see :ref:`Phone valid sensed days <phone-valid-sensed-days>`). This is done because Google's API only logs a new location coordinate pair when it is sufficiently different from the previous one.

There are two parameters associated with resampling fused location in the ``RESAMPLE_FUSED_LOCATION`` section of the ``config.yaml`` file. ``CONSECUTIVE_THRESHOLD`` (in minutes, default 30) controls the maximum gap between any two coordinate pairs over which the last known pair is replicated. For example, if participant A's phone did not collect data between 10:30am and 10:50am and between 11:05am and 11:40am, the last known coordinate pair will be replicated during the first period but not the second. In other words, we assume that we can no longer guarantee the participant stayed at the last known location if the phone did not sense data for more than 30 minutes. ``TIME_SINCE_VALID_LOCATION`` (in minutes, default 720 or 12 hours) caps how long the last known fused location is carried over: it won't be replicated for longer than this threshold even if the phone was sensing data continuously. For example, if participant A went home at 9pm and their phone was sensing data without gaps until 11am the next morning, the last known location will only be replicated until 9am. If you have suggestions to modify or improve this imputation, let us know.

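
The two thresholds can be illustrated with a simplified sketch. It is not the code in ``resample_fused_location.R``: timestamps are reduced to integer minutes, and the function name, argument names and data layout are assumptions made for this example.

.. code-block:: python

    def resample_fused(fixes, sensed_minutes,
                       consecutive_threshold=30,
                       time_since_valid_location=720):
        """Replicate each fused fix forward in time, minute by minute.

        fixes: list of (minute, lat, lon) sorted by minute (hypothetical layout).
        sensed_minutes: minutes in which the phone was sensing data.
        Returns a dict mapping minute -> (lat, lon).
        """
        sensed = sorted(sensed_minutes)
        resampled = {}
        for index, (start, lat, lon) in enumerate(fixes):
            next_start = fixes[index + 1][0] if index + 1 < len(fixes) else float("inf")
            resampled[start] = (lat, lon)
            last_sensed = start
            for minute in sensed:
                if minute <= start:
                    continue
                # Stop at the next real fix or once the fix is too old.
                if minute >= next_start or minute - start > time_since_valid_location:
                    break
                # Stop when the phone was not sensing for too long.
                if minute - last_sensed > consecutive_threshold:
                    break
                # Fill the (short) gap and the sensed minute itself.
                for filled in range(last_sensed + 1, minute + 1):
                    resampled[filled] = (lat, lon)
                last_sensed = minute
        return resampled


    # A 20-minute sensing gap is bridged, a 45-minute gap is not.
    sensed = list(range(0, 11)) + list(range(30, 36)) + list(range(80, 86))
    short_gap = resample_fused([(0, 40.44, -79.99)], sensed)

    # Continuous sensing: the fix is carried over for at most 720 minutes.
    long_stay = resample_fused([(0, 40.44, -79.99)], range(0, 801))
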
*Significant Locations Identified*

(i.e. The clustering method used)

Significant locations are determined using K-means clustering on the locations that a patient visits over the course of the period of data collection. K is increased by one (K=K+1) and the clustering is repeated until two significant locations are within 100 meters of one another; the results from the previous step (K-1) are then used as the total number of significant locations. Taken from `Beiwe Summary Statistics`_.

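
The stopping rule can be sketched as follows. This is an illustration under stated assumptions, not Barnett's code: it uses a small hand-written Lloyd's k-means with a deterministic farthest-point initialisation, works on planar coordinates in meters, and takes the separation threshold as a parameter. All names are hypothetical.

.. code-block:: python

    import math


    def kmeans(points, k, iterations=50):
        """Tiny Lloyd's k-means over (x_m, y_m) tuples."""
        # Deterministic farthest-point initialisation (assumption of this sketch).
        centroids = [points[0]]
        while len(centroids) < k:
            centroids.append(
                max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
            )
        for _ in range(iterations):
            clusters = [[] for _ in range(k)]
            for point in points:
                nearest = min(range(k), key=lambda i: math.dist(point, centroids[i]))
                clusters[nearest].append(point)
            # Empty clusters keep their previous centroid.
            centroids = [
                tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
                if cluster else centroids[i]
                for i, cluster in enumerate(clusters)
            ]
        return centroids


    def count_significant_locations(points, min_separation_m, max_k=200):
        """Increase k until two centroids are closer than min_separation_m."""
        best = 1
        for k in range(2, min(max_k, len(points)) + 1):
            centroids = kmeans(points, k)
            closest = min(
                math.dist(a, b)
                for i, a in enumerate(centroids)
                for b in centroids[i + 1:]
            )
            if closest < min_separation_m:
                break  # keep the result of the previous step (k - 1)
            best = k
        return best


    pauses = [(0, 0), (10, 0), (0, 10), (5000, 0), (5010, 0), (5000, 10)]
    n_locations = count_significant_locations(pauses, min_separation_m=400)
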
*Definition of Stationarity*

(i.e., The length of time and distance a person has to be around the same place to be labelled as a pause)

This is based on a Pause-Flight model. The parameters used are a minimum pause duration of 300 seconds and a minimum pause distance of 60 meters. See the `Pause-Flight Model`_.

*The Circadian Calculation*

For a detailed description of how this is calculated, see Canzian, L., & Musolesi, M. (2015, September). Trajectories of depression: unobtrusive monitoring of depressive states by means of smartphone mobility traces analysis. In Proceedings of the 2015 ACM international joint conference on pervasive and ubiquitous computing (pp. 1293-1304). Their procedure was followed using 30-min increments as a bin size. Taken from `Beiwe Summary Statistics`_.

.. _screen-sensor-doc:

Screen
""""""""

See `Screen Config Code`_

**Available Epochs (day_segment) :** daily, morning, afternoon, evening, night

**Available Platforms:** Android and iOS

**Snakefile entry to compute these features:**

| ``expand("data/processed/{pid}/screen_{day_segment}.csv",``
| ``pid=config["PIDS"],``
| ``day_segment = config["SCREEN"]["DAY_SEGMENTS"]),``

**Snakemake rule chain:**

- Rule ``rules/preprocessing.snakefile/download_dataset``
- Rule ``rules/preprocessing.snakefile/readable_datetime``
- Rule ``rules/features.snakefile/screen_deltas``
- Rule ``rules/features.snakefile/screen_features``

.. _screen-parameters:

**Screen Rule Parameters (screen_features):**

=========================    ===================
Name                         Description
=========================    ===================
day_segment                  The particular ``day_segment`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, ``evening``, ``night``
reference_hour_first_use     The reference point from which ``firstuseafter`` is to be computed, default is midnight
features_deltas              Features to be computed, see table below
episode_types                Currently we only support unlock episodes (from when the phone is unlocked until the screen is off)
=========================    ===================

.. _screen-episodes-available-features:

**Available Screen Episodes Features**

=========================    =================    =============
Name                         Units                Description
=========================    =================    =============
sumduration                  seconds              Total duration of all unlock episodes.
maxduration                  seconds              Longest duration of any unlock episode.
minduration                  seconds              Shortest duration of any unlock episode.
avgduration                  seconds              Average duration of all unlock episodes.
stdduration                  seconds              Standard deviation of the duration of all unlock episodes.
countepisode                 episodes             Number of all unlock episodes.
episodepersensedminutes      episodes/minute      The ratio of the total number of episodes in an epoch to the total time (minutes) the phone was sensing data.
firstuseafter                seconds              Seconds from ``reference_hour_first_use`` until the first unlock episode.
=========================    =================    =============

**Assumptions/Observations:**

An ``unlock`` episode is considered as the time between an ``unlock`` event and a ``lock`` event. iOS recorded these episodes reliably (albeit with some duplicated ``lock`` events within milliseconds of each other). However, in Android there are some events unrelated to the screen state because of multiple consecutive ``unlock``/``lock`` events, so we keep the closest pair. In our experiments these cases are less than 10% of the screen events collected. This happens because ``ACTION_SCREEN_OFF`` and ``ON`` are "sent when the device becomes non-interactive which may have nothing to do with the screen turning off". Additionally, in Android it is possible to measure the time spent on the ``lock`` screen before an ``unlock`` event as well as the total screen time (i.e. ``ON`` to ``OFF``), but we only keep ``unlock`` episodes (``unlock`` to ``OFF``) to be consistent with iOS.

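
The "closest pair" idea can be sketched as below. This is an illustration only, not the code in ``screen_deltas.R``: events are simplified to ``(timestamp_in_seconds, status)`` tuples, and the status labels ``"unlock"`` and ``"off"`` as well as the function names are assumptions of this example.

.. code-block:: python

    import statistics


    def unlock_episodes(events):
        """Pair each screen-off event with the closest preceding unlock.

        events: list of (timestamp_s, status) sorted by time (hypothetical layout).
        """
        episodes = []
        pending_unlock = None
        for timestamp, status in events:
            if status == "unlock":
                # A later unlock replaces an earlier unmatched one.
                pending_unlock = timestamp
            elif status == "off" and pending_unlock is not None:
                episodes.append((pending_unlock, timestamp))
                pending_unlock = None  # further "off" events are ignored
        return episodes


    def episode_features(episodes):
        durations = [end - start for start, end in episodes]
        if not durations:
            return {"countepisode": 0}
        return {
            "countepisode": len(durations),
            "sumduration": sum(durations),
            "maxduration": max(durations),
            "minduration": min(durations),
            "avgduration": statistics.mean(durations),
        }


    events = [(0, "unlock"), (5, "unlock"), (65, "off"), (70, "off"),
              (100, "unlock"), (130, "off")]
    episodes = unlock_episodes(events)
    features = episode_features(episodes)
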
.. ------------------------------- Begin Fitbit Section ----------------------------------- ..

.. _fitbit-sleep-sensor-doc:

Fitbit: Sleep
"""""""""""""""""""

See `Fitbit: Sleep Config Code`_

**Available Epochs (day_segment) :** daily

**Available Platforms:** Fitbit

**Snakefile entry to compute these features:**

| ``expand("data/processed/{pid}/fitbit_sleep_{day_segment}.csv",``
| ``pid = config["PIDS"],``
| ``day_segment = config["SLEEP"]["DAY_SEGMENTS"]),``

**Snakemake rule chain:**

- Rule ``rules/preprocessing.snakefile/download_dataset``
- Rule ``rules/preprocessing.snakefile/fitbit_with_datetime``
- Rule ``rules/features.snakefile/fitbit_sleep_features``

.. _fitbit-sleep-parameters:

**Fitbit: Sleep Rule Parameters (fitbit_sleep_features):**

==================================    ===================
Name                                  Description
==================================    ===================
day_segment                           The particular ``day_segment`` that will be analyzed. For this sensor only ``daily`` is used.
sleep_types                           The types of sleep provided by Fitbit: ``main``, ``nap``, ``all``.
daily_features_from_summary_data      The sleep features that can be computed based on Fitbit's summary data. See :ref:`Available Fitbit: Sleep Features <fitbit-sleep-available-features>` Table below
==================================    ===================

.. _fitbit-sleep-available-features:

**Available Fitbit: Sleep Features**

========================    ===========    =============
Name                        Units          Description
========================    ===========    =============
sumdurationtofallasleep     minutes        Time it took the user to fall asleep for ``sleep_type`` during ``day_segment``.
sumdurationawake            minutes        Time the user was awake but still in bed for ``sleep_type`` during ``day_segment``.
sumdurationasleep           minutes        Sleep duration for ``sleep_type`` during ``day_segment``.
sumdurationafterwakeup      minutes        Time the user stayed in bed after waking up for ``sleep_type`` during ``day_segment``.
sumdurationinbed            minutes        Total time the user stayed in bed (sumdurationtofallasleep + sumdurationawake + sumdurationasleep + sumdurationafterwakeup) for ``sleep_type`` during ``day_segment``.
avgefficiency               scores         Sleep efficiency average for ``sleep_type`` during ``day_segment``.
countepisode                episodes       Number of sleep episodes for ``sleep_type`` during ``day_segment``.
========================    ===========    =============

**Assumptions/Observations:**

The `fitbit_with_datetime` rule will extract Summary data (`fitbit_sleep_summary_with_datetime.csv`) and Intraday data (`fitbit_sleep_intraday_with_datetime.csv`). There are two versions of Fitbit's sleep API (`version 1`_ and `version 1.2`_), and each provides raw sleep data in a different format.

The differences between both API versions are:

- Sleep level. In `v1`, it is an integer with three possible values {1, 2, 3} while in `v1.2` it is a string. We convert integer levels of `v1` to strings: "asleep", "restless" or "awake" respectively.
- Count summaries. For Summary data, `v1` contains "count_awake", "duration_awake", "count_awakenings", "count_restless", and "duration_restless" fields in the summary of each sleep record while `v1.2` does not.
- Types of sleep records. `v1.2` has two types of sleep records: "classic" and "stages". The "classic" type contains three sleep levels: "awake", "restless" and "asleep". The "stages" type contains four sleep levels: "wake", "deep", "light" and "rem". Sleep records from `v1` will have the same sleep levels as `v1.2` classic types; therefore we set their type to "classic".
- Unified level of sleep. For intraday data, we unify sleep levels of each sleep record with a column named "unified_level". Based on `this Fitbit forum post`_, we merge levels into two categories:

  - For the "classic" type, unified_level is one of {0, 1} where 0 means awake and groups "awake" + "restless", while 1 means asleep and groups "asleep".
  - For the "stages" type, unified_level is one of {0, 1} where 0 means awake and groups "wake", while 1 means asleep and groups "deep" + "light" + "rem".

- Short Data. In `v1.2`, records of type "stages" contain "shortData" in addition to "data". We merge the "data" part and the "shortData" part to extract intraday data.

  - The "data" grouping displays the sleep stages and any wake periods > 3 minutes (180 seconds).
  - The "shortData" grouping displays the short wake periods representing physiological awakenings that are <= 3 minutes (180 seconds).

- The following columns of Summary data are not computed by RAPIDS but taken directly from columns with a similar name provided by the API: `efficiency`, `minutes_after_wakeup`, `minutes_asleep`, `minutes_awake`, `minutes_to_fall_asleep`, `minutes_in_bed`, `is_main_sleep` and `type`.
- The following columns of Intraday data are not computed by RAPIDS but taken directly from columns with a similar name provided by the API: `original_level`, `is_main_sleep` and `type`. We compute `unified_level` as explained above.

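
The level conversions described in the list above amount to two lookups. The sketch below is only an illustration of that mapping; the constant and function names are hypothetical and it is not the code in ``fitbit_readable_datetime.py``.

.. code-block:: python

    # v1 integer levels and their string equivalents, in the order given above.
    V1_LEVELS = {1: "asleep", 2: "restless", 3: "awake"}

    # 0 means awake, 1 means asleep.
    UNIFIED_LEVELS = {
        "classic": {"awake": 0, "restless": 0, "asleep": 1},
        "stages": {"wake": 0, "deep": 1, "light": 1, "rem": 1},
    }


    def unify_level(original_level, record_type="classic"):
        """Return the unified level (0 or 1) of one intraday sleep row."""
        if isinstance(original_level, int):
            # v1 records carry integer levels and are treated as "classic".
            original_level = V1_LEVELS[original_level]
            record_type = "classic"
        return UNIFIED_LEVELS[record_type][original_level]


    unified = [
        unify_level(2),                  # v1 "restless" -> awake
        unify_level("asleep"),           # v1.2 classic -> asleep
        unify_level("wake", "stages"),   # v1.2 stages -> awake
        unify_level("rem", "stages"),    # v1.2 stages -> asleep
    ]
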
Detailed sleep data is stored in Intraday data every 30 seconds (for "stages" type) or 60 seconds (for "classic" type) while a summary is stored in Summary data. For example:

- Intraday data

  ========= ============== ============= ============= ====== =================== ========== =========== ========= ================= ========== ========== ============ =================
  device_id original_level unified_level is_main_sleep type   local_date_time     local_date local_month local_day local_day_of_week local_time local_hour local_minute local_day_segment
  ========= ============== ============= ============= ====== =================== ========== =========== ========= ================= ========== ========== ============ =================
  did       wake           0             1             stages 2020-05-20 22:13:30 2020-05-20 5           20        2                 22:13:30   22         13           evening
  did       wake           0             1             stages 2020-05-20 22:14:00 2020-05-20 5           20        2                 22:14:00   22         14           evening
  did       light          1             1             stages 2020-05-20 22:14:30 2020-05-20 5           20        2                 22:14:30   22         14           evening
  did       light          1             1             stages 2020-05-20 22:15:00 2020-05-20 5           20        2                 22:15:00   22         15           evening
  did       light          1             1             stages 2020-05-20 22:15:30 2020-05-20 5           20        2                 22:15:30   22         15           evening
  ========= ============== ============= ============= ====== =================== ========== =========== ========= ================= ========== ========== ============ =================

- Summary data

  ========= ========== ==================== ============== ============= ====================== ============== ============= ====== ===================== =================== ================ ============== ======================= =====================
  device_id efficiency minutes_after_wakeup minutes_asleep minutes_awake minutes_to_fall_asleep minutes_in_bed is_main_sleep type   local_start_date_time local_end_date_time local_start_date local_end_date local_start_day_segment local_end_day_segment
  ========= ========== ==================== ============== ============= ====================== ============== ============= ====== ===================== =================== ================ ============== ======================= =====================
  did       90         0                    381            54            0                      435            1             stages 2020-05-20 22:12:00   2020-05-21 05:27:00 2020-05-20       2020-05-21     evening                 night
  did       88         0                    498            86            0                      584            1             stages 2020-05-22 22:03:00   2020-05-23 07:47:03 2020-05-22       2020-05-23     evening                 morning
  ========= ========== ==================== ============== ============= ====================== ============== ============= ====== ===================== =================== ================ ============== ======================= =====================

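
As a quick check of how the summary columns relate, the sketch below adds up the four components listed for ``sumdurationinbed`` using the two example Summary rows. The function name is hypothetical; the numbers are copied from the Summary data example.

.. code-block:: python

    # Values (in minutes) copied from the two example Summary rows.
    summary_rows = [
        {"minutes_to_fall_asleep": 0, "minutes_awake": 54,
         "minutes_asleep": 381, "minutes_after_wakeup": 0, "minutes_in_bed": 435},
        {"minutes_to_fall_asleep": 0, "minutes_awake": 86,
         "minutes_asleep": 498, "minutes_after_wakeup": 0, "minutes_in_bed": 584},
    ]


    def minutes_in_bed(row):
        """Sum the four components that make up the time in bed."""
        return (row["minutes_to_fall_asleep"] + row["minutes_awake"]
                + row["minutes_asleep"] + row["minutes_after_wakeup"])


    computed = [minutes_in_bed(row) for row in summary_rows]
    reported = [row["minutes_in_bed"] for row in summary_rows]

For both rows the sum of the components matches the ``minutes_in_bed`` value reported by the API.
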
.. _fitbit-heart-rate-sensor-doc:

Fitbit: Heart Rate
"""""""""""""""""""

See `Fitbit: Heart Rate Config Code`_

**Available Epochs (day_segment) :** daily, morning, afternoon, evening, night

**Available Platforms:** Fitbit

**Snakefile entry to compute these features:**

| ``expand("data/processed/{pid}/fitbit_heartrate_{day_segment}.csv",``
| ``pid=config["PIDS"],``
| ``day_segment = config["HEARTRATE"]["DAY_SEGMENTS"]),``

**Snakemake rule chain:**

- Rule ``rules/preprocessing.snakefile/download_dataset``
- Rule ``rules/preprocessing.snakefile/fitbit_with_datetime``
- Rule ``rules/features.snakefile/fitbit_heartrate_features``

.. _fitbit-heart-rate-parameters:

**Fitbit: Heart Rate Rule Parameters (fitbit_heartrate_features):**

============    ===================
Name            Description
============    ===================
day_segment     The particular ``day_segment`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, ``evening``, ``night``
features        The heartrate features that can be computed. See :ref:`Available Fitbit: Heart Rate Features <fitbit-heart-rate-available-features>` Table below
============    ===================

.. _fitbit-heart-rate-available-features:

**Available Fitbit: Heart Rate Features**

==================    ===========    =============
Name                  Units          Description
==================    ===========    =============
restingheartrate      beats/min      Heart rate in beats per minute when the participant is still and well rested, for the ``daily`` epoch.
calories              cals           Calories burned during ``heartrate_zone`` for ``daily`` epoch.
maxhr                 beats/min      The maximum heart rate during ``day_segment`` epoch.
minhr                 beats/min      The minimum heart rate during ``day_segment`` epoch.
avghr                 beats/min      The average heart rate during ``day_segment`` epoch.
medianhr              beats/min      The median of heart rate during ``day_segment`` epoch.
modehr                beats/min      The mode of heart rate during ``day_segment`` epoch.
stdhr                 beats/min      The standard deviation of heart rate during ``day_segment`` epoch.
diffmaxmodehr         beats/min      The difference between the maximum and mode heart rate during ``day_segment`` epoch.
diffminmodehr         beats/min      The difference between the mode and minimum heart rate during ``day_segment`` epoch.
entropyhr             nats           Shannon's entropy measurement based on heart rate during ``day_segment`` epoch.
lengthZONE            minutes        Number of minutes the user's heartrate fell within each ``heartrate_zone`` during ``day_segment`` epoch.
==================    ===========    =============

**Assumptions/Observations:**

There are four heart rate zones: ``out_of_range``, ``fat_burn``, ``cardio``, and ``peak``. Please refer to `Fitbit documentation`_ for more information about the way they are computed.

Calories' accuracy depends on the user's Fitbit profile (weight, height, etc.).

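
For illustration, the mode-based and entropy features in the table above could be computed along these lines. This is a sketch under assumptions, not the pipeline's code: the function name is hypothetical, and the entropy is taken over the empirical distribution of the observed heart rate values, which may differ from how RAPIDS computes ``entropyhr``.

.. code-block:: python

    import math
    import statistics
    from collections import Counter


    def heartrate_features(bpm):
        """bpm: heart rate readings (beats per minute) of one day segment."""
        # On Python 3.8+, statistics.mode returns the first mode if there are ties.
        mode = statistics.mode(bpm)
        n = len(bpm)
        # Assumption: Shannon entropy of the share of each distinct value.
        entropy = -sum((c / n) * math.log(c / n) for c in Counter(bpm).values())
        return {
            "maxhr": max(bpm),
            "minhr": min(bpm),
            "avghr": statistics.mean(bpm),
            "medianhr": statistics.median(bpm),
            "modehr": mode,
            "diffmaxmodehr": max(bpm) - mode,
            "diffminmodehr": mode - min(bpm),
            "entropyhr": entropy,
        }


    hr = heartrate_features([60, 62, 62, 70])
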
.. _fitbit-steps-sensor-doc:

Fitbit: Steps
"""""""""""""""

See `Fitbit: Steps Config Code`_

**Available Epochs (day_segment) :** daily, morning, afternoon, evening, night

**Available Platforms:** Fitbit

**Snakefile entry to compute these features:**

| ``expand("data/processed/{pid}/fitbit_step_{day_segment}.csv",``
| ``pid=config["PIDS"],``
| ``day_segment = config["STEP"]["DAY_SEGMENTS"]),``

**Snakemake rule chain:**

- Rule ``rules/preprocessing.snakefile/download_dataset``
- Rule ``rules/preprocessing.snakefile/fitbit_with_datetime``
- Rule ``rules/features.snakefile/fitbit_step_features``

.. _fitbit-steps-parameters:

**Fitbit: Steps Rule Parameters (fitbit_step_features):**

=======================    ===================
Name                       Description
=======================    ===================
day_segment                The particular ``day_segment`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, ``evening``, ``night``
features                   The features that can be computed. See :ref:`Available Fitbit: Steps Features <fitbit-steps-available-features>` Table below
threshold_active_bout      Every minute with Fitbit step data will be labelled as ``sedentary`` if its step count is below this threshold, otherwise, ``active``.
=======================    ===================

.. _fitbit-steps-available-features:

**Available Fitbit: Steps Features**

=========================    =========    =============
Name                         Units        Description
=========================    =========    =============
sumallsteps                  steps        The total step count during ``day_segment`` epoch.
maxallsteps                  steps        The maximum step count during ``day_segment`` epoch.
minallsteps                  steps        The minimum step count during ``day_segment`` epoch.
avgallsteps                  steps        The average step count during ``day_segment`` epoch.
stdallsteps                  steps        The standard deviation of step count during ``day_segment`` epoch.
countsedentarybout           bouts        Number of sedentary bouts during ``day_segment`` epoch.
maxdurationsedentarybout     minutes      The maximum duration of any sedentary bout during ``day_segment`` epoch.
mindurationsedentarybout     minutes      The minimum duration of any sedentary bout during ``day_segment`` epoch.
avgdurationsedentarybout     minutes      The average duration of sedentary bouts during ``day_segment`` epoch.
stddurationsedentarybout     minutes      The standard deviation of the duration of sedentary bouts during ``day_segment`` epoch.
countactivebout              bouts        Number of active bouts during ``day_segment`` epoch.
maxdurationactivebout        minutes      The maximum duration of any active bout during ``day_segment`` epoch.
mindurationactivebout        minutes      The minimum duration of any active bout during ``day_segment`` epoch.
avgdurationactivebout        minutes      The average duration of active bouts during ``day_segment`` epoch.
stddurationactivebout        minutes      The standard deviation of the duration of active bouts during ``day_segment`` epoch.
=========================    =========    =============

**Assumptions/Observations:**

Active and sedentary bouts. If the step count per minute is smaller than ``THRESHOLD_ACTIVE_BOUT`` (default value is 10), that minute is labelled as sedentary; otherwise it is labelled as active. Active and sedentary bouts are periods of consecutive minutes labelled as ``active`` or ``sedentary``.

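
The bout definition can be sketched with ``itertools.groupby``, which groups consecutive equal labels. This is an illustrative example rather than the pipeline's implementation; the function names and the per-minute list input are assumptions.

.. code-block:: python

    import statistics
    from itertools import groupby


    def label_minutes(steps_per_minute, threshold_active_bout=10):
        """Label every minute as "sedentary" or "active"."""
        return ["sedentary" if steps < threshold_active_bout else "active"
                for steps in steps_per_minute]


    def bout_durations(labels):
        """Length in minutes of every run of consecutive equal labels."""
        durations = {"sedentary": [], "active": []}
        for label, run in groupby(labels):
            durations[label].append(sum(1 for _ in run))
        return durations


    def bout_features(durations, kind):
        values = durations[kind]
        if not values:
            return {f"count{kind}bout": 0}
        return {
            f"count{kind}bout": len(values),
            f"maxduration{kind}bout": max(values),
            f"minduration{kind}bout": min(values),
            f"avgduration{kind}bout": statistics.mean(values),
        }


    steps = [0, 3, 12, 15, 20, 0, 0, 11]
    durations = bout_durations(label_minutes(steps))
    active = bout_features(durations, "active")
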
.. -------------------------Links ------------------------------------ ..
.. _SENSORS: https://github.com/carissalow/rapids/blob/f22d1834ee24ab3bcbf051bc3cc663903d822084/config.yaml#L2
.. _`SMS Config Code`: https://github.com/carissalow/rapids/blob/f22d1834ee24ab3bcbf051bc3cc663903d822084/config.yaml#L38
.. _AWARE: https://awareframework.com/what-is-aware/
.. _`List of Timezones`: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
.. _sms_features: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L1
.. _sms_features.R: https://github.com/carissalow/rapids/blob/master/src/features/sms_featues.R
.. _download_dataset: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/preprocessing.snakefile#L9
.. _download_dataset.R: https://github.com/carissalow/rapids/blob/master/src/data/download_dataset.R
.. _readable_datetime: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/preprocessing.snakefile#L21
.. _readable_datetime.R: https://github.com/carissalow/rapids/blob/master/src/data/readable_datetime.R
.. _DAY_SEGMENTS: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L13
.. _PHONE_VALID_SENSED_DAYS: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L60
.. _`Call Config Code`: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L46
.. _call_features: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L13
.. _call_features.R: https://github.com/carissalow/rapids/blob/master/src/features/call_features.R
.. _`Bluetooth Config Code`: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L76
.. _bluetooth_feature: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L63
.. _bluetooth_features.R: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/src/features/bluetooth_features.R
.. _`Accelerometer Config Code`: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L98
.. _accelerometer_features: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L124
.. _accelerometer_features.py: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/src/features/accelerometer_featues.py
.. _`Applications Foreground Config Code`: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L102
.. _`Application Genres Config`: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L54
.. _application_genres: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/preprocessing.snakefile#L81
.. _application_genres.R: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/src/data/application_genres.R
.. _applications_foreground_features: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L135
.. _applications_foreground_features.py: https://github.com/carissalow/rapids/blob/master/src/features/accelerometer_features.py
.. _`Battery Config Code`: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L84
.. _battery_deltas: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L25
.. _battery_deltas.R: https://github.com/carissalow/rapids/blob/master/src/features/battery_deltas.R
.. _battery_features: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L86
.. _battery_features.py: https://github.com/carissalow/rapids/blob/master/src/features/battery_features.py
.. _`Google Activity Recognition Config Code`: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L80
.. _google_activity_recognition_deltas: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L41
.. _google_activity_recognition_deltas.R: https://github.com/carissalow/rapids/blob/master/src/features/google_activity_recognition_deltas.R
.. _activity_features: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L74
.. _google_activity_recognition.py: https://github.com/carissalow/rapids/blob/master/src/features/google_activity_recognition.py
.. _`Light Config Code`: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L94
.. _light_features: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L113
.. _light_features.py: https://github.com/carissalow/rapids/blob/master/src/features/light_features.py
.. _`Location (Barnetts) Config Code`: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L70
.. _phone_sensed_bins: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/preprocessing.snakefile#L46
.. _phone_sensed_bins.R: https://github.com/carissalow/rapids/blob/master/src/data/phone_sensed_bins.R
.. _resample_fused_location: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/preprocessing.snakefile#L67
.. _resample_fused_location.R: https://github.com/carissalow/rapids/blob/master/src/data/resample_fused_location.R
.. _location_barnett_features: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L49
.. _location_barnett_features.R: https://github.com/carissalow/rapids/blob/master/src/features/location_barnett_features.R
.. _`Screen Config Code`: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L88
.. _screen_deltas: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L33
.. _screen_deltas.R: https://github.com/carissalow/rapids/blob/master/src/features/screen_deltas.R
.. _screen_features: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L97
.. _screen_features.py: https://github.com/carissalow/rapids/blob/master/src/features/screen_features.py
.. _fitbit_with_datetime: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/preprocessing.snakefile#L94
.. _fitbit_readable_datetime.py: https://github.com/carissalow/rapids/blob/master/src/data/fitbit_readable_datetime.py
.. _`Fitbit: Sleep Config Code`: https://github.com/carissalow/rapids/blob/e952e27350c7ae02703bd444e8f92979e37d9ba6/config.yaml#L129
.. _fitbit_sleep_features: https://github.com/carissalow/rapids/blob/e952e27350c7ae02703bd444e8f92979e37d9ba6/rules/features.snakefile#L209
.. _fitbit_sleep_features.py: https://github.com/carissalow/rapids/blob/master/src/features/fitbit_sleep_features.py
.. _`version 1`: https://dev.fitbit.com/build/reference/web-api/sleep-v1/
.. _`version 1.2`: https://dev.fitbit.com/build/reference/web-api/sleep/
.. _`this Fitbit forum post`: https://community.fitbit.com/t5/Alta/What-does-Restless-mean-in-sleep-tracking/td-p/2989011
.. _shortData: https://dev.fitbit.com/build/reference/web-api/sleep/#interpreting-the-sleep-stage-and-short-data
.. _`Fitbit: Heart Rate Config Code`: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L113
.. _fitbit_heartrate_features: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L151
.. _fitbit_heartrate_features.py: https://github.com/carissalow/rapids/blob/master/src/features/fitbit_heartrate_features.py
.. _`Fitbit: Steps Config Code`: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L117
.. _fitbit_step_features: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L162
.. _fitbit_step_features.py: https://github.com/carissalow/rapids/blob/master/src/features/fitbit_step_features.py
.. _`Fitbit documentation`: https://help.fitbit.com/articles/en_US/Help_article/1565
.. _`Custom Catalogue File`: https://github.com/carissalow/rapids/blob/master/data/external/stachl_application_genre_catalogue.csv
.. _top1global: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L108
.. _`Beiwe Summary Statistics`: http://wiki.beiwe.org/wiki/Summary_Statistics
.. _`Pause-Flight Model`: https://academic.oup.com/biostatistics/advance-article/doi/10.1093/biostatistics/kxy059/5145908