From 1a4cae80e70e6baf125ff77df061614ca677cb04 Mon Sep 17 00:00:00 2001 From: kaguillera Date: Fri, 6 Mar 2020 13:38:05 -0500 Subject: [PATCH] Completed the sensor documentation --- docs/features/extracted.rst | 1120 ++++++++++++++++++++++++++++------- docs/usage/introduction.rst | 19 +- 2 files changed, 903 insertions(+), 236 deletions(-) diff --git a/docs/features/extracted.rst b/docs/features/extracted.rst index d61f318a..9780c500 100644 --- a/docs/features/extracted.rst +++ b/docs/features/extracted.rst @@ -3,52 +3,52 @@ RAPIDS Metrics =============== -This following is documentation of on the RAPIDS metrics settings in the configuation file. +This following is documentation of on the RAPIDS metrics settings in the configuration file. .. _sensor-list: - - ``SENSORS`` - This varable stores a list of the names of the sensor data that are being pulled from the AWARE_ database. These names are the actual names of the tables that the data is found in the database. See SENSORS_ variable in ``config`` file. +- ``SENSORS`` - This variable stores a list of the names of the sensor data that are being pulled from the AWARE_ database. These names are the actual names of the tables that the data is found in the database. See SENSORS_ variable in ``config`` file. .. _fitbit-table: - - ``FITBIT_TABLE`` - The name of the fitbit database +- ``FITBIT_TABLE`` - The name of the fitbit database .. _fitbit-sensors: - - ``FITBIT_SENSORS`` - The list of sensors that to be pulled from the fitbit database +- ``FITBIT_SENSORS`` - The list of sensors to be pulled from the fitbit database .. _pid: - - ``PID`` - The list of participant ids included in the analysis. Remember that you must create a file named ``pXXX`` for each participant in the ``data/external`` directory containing there device_id. (Remember installation :ref:`step 8 `) +- ``PID`` - The list of participant ids included in the analysis. 
Remember that you must create a file named ``pXXX`` for each participant in the ``data/external`` directory containing their device_id. (Remember installation step 8 on the :ref:`install-page`) .. _day-segments: - - ``DAY_SEGMENTS`` - The list of common day segments (time frequency/checkpoints) that data would be analyzed. See DAY_SEGMENTS_ in ``config`` file. +- ``DAY_SEGMENTS`` - The list of common day segments (time frequency/checkpoints) that will be analyzed. See DAY_SEGMENTS_ in ``config`` file. .. _timezone: - - ``TIMEZONE`` - The timezone of the server. Use the timezone names from this `List of Timezones`_. Double check your code, for example EST is not US Eastern Time. +- ``TIMEZONE`` - The timezone of the server. Use the timezone names from this `List of Timezones`_. Double check your code, for example EST is not US Eastern Time. .. _database_group: - - ``DATABASE_GROUP`` - The name of the research project database. +- ``DATABASE_GROUP`` - The name of the research project database. .. _download-dataset: - - ``DOWNLOAD_DATASET`` - The name of the dataset for the research project. +- ``DOWNLOAD_DATASET`` - The name of the dataset for the research project. .. _readable-datetime: - - ``READABLE_DATETIME`` - Readable datetime configuration. Defines the format that the readable date and time should be. +- ``READABLE_DATETIME`` - Readable datetime configuration. Defines the format that the readable date and time should be. .. _phone-valid-sensed-days: - - ``PHONE_VALID_SENSED_DAYS`` - Specifies the ``BIN_SIZE``, ``MIN_VALID_HOURS``, ``MIN_BINS_PER_HOUR``. ``BIN_SIZE`` is the time that the data is aggregated. ``MIN_VALID_HOURS`` is the minimum numbers of hours data will be gathered within a 24 hour period (a day). Finally ``MIN_BINS_PER_HOUR`` specifies minimum number of bins that are captured per hour. This is out of the total possible number of bins that can be captured in an hour i.e. out of 60min/``BIN_SIZE`` bins. 
See PHONE_VALID_SENSED_DAYS_ in ``config`` file. +- ``PHONE_VALID_SENSED_DAYS`` - Specifies the ``BIN_SIZE``, ``MIN_VALID_HOURS``, ``MIN_BINS_PER_HOUR``. ``BIN_SIZE`` is the time that the data is aggregated. ``MIN_VALID_HOURS`` is the minimum numbers of hours data will be gathered within a 24-hour period (a day). Finally, ``MIN_BINS_PER_HOUR`` specifies minimum number of bins that are captured per hour. This is out of the total possible number of bins that can be captured in an hour i.e. out of 60min/``BIN_SIZE`` bins. See PHONE_VALID_SENSED_DAYS_ in ``config`` file. .. _individual-sensor-settings: -List of Indvidual Sensors and There Settings +List of Individual Sensors and Their Settings --------------------------------------------- .. _sms-sensor-doc: @@ -60,23 +60,23 @@ See `SMS Config Code`_ **Available Epochs:** - - daily - - morning - - afternoon - - evening - - night +- daily +- morning +- afternoon +- evening +- night **Available Platforms:** - - Android +- Android **Snakefile Entry:** - - Download raw SMS dataset: ``expand("data/raw/{pid}/{sensor}_raw.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` +.. - Download raw SMS dataset: ``expand("data/raw/{pid}/{sensor}_raw.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` - - Download raw SMS dataset with readable: ``expand("data/raw/{pid}/{sensor}_with_datetime.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` +.. - Apply readable datetime to SMS dataset: ``expand("data/raw/{pid}/{sensor}_with_datetime.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` - - Extract SMS metrics +- Extract SMS metrics: | ``expand("data/processed/{pid}/sms_{sms_type}_{day_segment}.csv".`` | ``pid=config["PIDS"],`` @@ -85,17 +85,18 @@ See `SMS Config Code`_ **Rule Chain:** - - ``rules/preprocessing.snakefile/download_dataset`` - See the download_dataset_ rule. +- **Rule:** ``rules/preprocessing.snakefile/download_dataset`` - See the download_dataset_ rule. 
- - ``src/data/download_dataset.R`` - See the download_dataset.R_ script. + - **Script:** ``src/data/download_dataset.R`` - See the download_dataset.R_ script. - - ``rules/preprocessing.snakefile/readable_datetime`` - See the readable_datetime_ rule. +- **Rule:** ``rules/preprocessing.snakefile/readable_datetime`` - See the readable_datetime_ rule. - - ``src/data/readable_datetime.R`` - See the readable_datetime.R_ script. + - **Script:** ``src/data/readable_datetime.R`` - See the readable_datetime.R_ script. - - ``rules/features.snakefile/sms_metrics`` - See the sms_metric_ rule. +- **Rule:** ``rules/features.snakefile/sms_metrics`` - See the sms_metric_ rule. + + - **Script:** ``src/features/sms_metrics.R`` - See the sms_metrics.R_ script. - - ``src/features/sms_metrics.R`` - See the sms_metrics.R_ script. .. _sms-parameters: @@ -104,11 +105,9 @@ See `SMS Config Code`_ ============ =================== Name Description ============ =================== -sms_type The particular ``sms_type`` that will be analyzed. The options for this parameter is ``received`` or ``sent``. -day_segment The particular ``day_segments`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, - ``evening``, ``night`` -metrics The different measures that can be retrieved from the dataset. These metrics are available for both ``sent`` and ``received`` - SMS messages. See :ref:`Available SMS Metrices ` Table below +sms_type The particular ``sms_type`` that will be analyzed. The options for this parameter are ``received`` or ``sent``. +day_segment The particular ``day_segments`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, ``evening``, ``night`` +metrics The different measures that can be retrieved from the dataset. These metrics are available for both ``sent`` and ``received`` SMS messages. See :ref:`Available SMS Metrics ` Table below ============ =================== .. 
_sms-available-metrics: @@ -120,54 +119,52 @@ The following table shows a list of the available metrics for both ``sent`` and ========================= ========= ============= Name Units Description ========================= ========= ============= -count SMS A count of the number of times that particular ``sms_type`` occured for a particular ``day_segment``. -distinctcontacts contacts A count of distinct contacts that were comunicated for a particular ``sms_type`` for a particular - ``day_segment``. -timefirstsms minutes The time in minutes from 12:00am (Midnight) that the first of a particular ``sms_type`` occured. -timelastsms minutes The time in minutes from 12:00am (Midnight) that the last of a particular ``sms_type`` occured. -countmostfrequentcontact SMS The count of the number of sms meassages of a particular``sms_type`` for the most contacted contact for - a particular ``day_segment``. +count SMS A count of the number of times that particular ``sms_type`` occurred for a particular ``day_segment``. +distinctcontacts contacts A count of distinct contacts that were communicated for a particular ``sms_type`` for a particular ``day_segment``. +timefirstsms minutes The time in minutes from 12:00am (Midnight) that the first of a particular ``sms_type`` occurred. +timelastsms minutes The time in minutes from 12:00am (Midnight) that the last of a particular ``sms_type`` occurred. +countmostfrequentcontact SMS The count of the number of sms messages of a particular``sms_type`` for the most contacted contact for a particular ``day_segment``. ========================= ========= ============= -Assumptions/Observations: +**Assumptions/Observations:** #. ``TYPES`` and ``METRICS`` keys need to match. From example:: SMS: - TYPES : [sent] + TYPES: [sent] METRICS: sent: [count, distinctcontacts, timefirstsms, timelastsms, countmostfrequentcontact] -In the above config setting code the ``TYPE`` ``sent`` matches the ``METRICS`` key ``sent``. 
+In the above config setting code the ``TYPE`` ``sent`` matches the ``METRICS`` key ``sent``. .. _call-sensor-doc: Calls -""""""""""""" +"""""" See `Call Config Code`_ **Available Epochs:** - - daily - - morning - - afternoon - - evening - - night +- daily +- morning +- afternoon +- evening +- night **Available Platforms:** - - Android - - iOS - +- Android +- iOS **Snakefile Entry:** - - Download raw Calls dataset: ``expand("data/raw/{pid}/{sensor}_raw.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` +.. - Download raw Calls dataset: ``expand("data/raw/{pid}/{sensor}_raw.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` - - Download raw Calls dataset with readable: ``expand("data/raw/{pid}/{sensor}_with_datetime.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` - - Extract Calls Metrics +.. - Apply readable datetime to Calls dataset: ``expand("data/raw/{pid}/{sensor}_with_datetime.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` + +- Extract Calls Metrics | ``expand("data/processed/{pid}/call_{call_type}_{segment}.csv",`` | ``pid=config["PIDS"],`` @@ -176,88 +173,82 @@ See `Call Config Code`_ **Rule Chain:** - - ``rules/preprocessing.snakefile/download_dataset`` - See the download_dataset_ rule. +- **Rule:** ``rules/preprocessing.snakefile/download_dataset`` - See the download_dataset_ rule. - - ``src/data/download_dataset.R`` - See the download_dataset.R_ script. - - - ``rules/preprocessing.snakefile/readable_datetime`` - See the readable_datetime_ rule. + - **Script:** ``src/data/download_dataset.R`` - See the download_dataset.R_ script. - - ``src/data/readable_datetime.R`` - See the readable_datetime.R_ script. +- **Rule:** ``rules/preprocessing.snakefile/readable_datetime`` - See the readable_datetime_ rule. - - ``rules/features.snakefile/call_metrics`` - See the call_metrics_ rule. + - **Script:** ``src/data/readable_datetime.R`` - See the readable_datetime.R_ script. - - ``src/features/call_metrics.R`` - See the call_metrics.R_ script. 
+- **Rule:** ``rules/features.snakefile/call_metrics`` - See the call_metrics_ rule. + + - **Script:** ``src/features/call_metrics.R`` - See the call_metrics.R_ script. .. _calls-parameters: -**Sensor Rule Parameters:** +**Call Rule Parameters:** ============ =================== Name Description ============ =================== call_type The particular ``call_type`` that will be analyzed. The options for this parameter are ``incoming``, ``outgoing`` or ``missed``. -day_segment The particular ``day_segment`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, - ``evening``, ``night`` -metrics The different measures that can be retrieved from the calls dataset. Note that the same metrics are available for both - ``incoming`` and ``outgoing`` calls, while ``missed`` calls has its own set of metrics. See :ref:`Available Incoming and Outgoing Call Metrices ` Table and :ref:`Available Missed Call Metrices ` Table below. +day_segment The particular ``day_segments`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, ``evening``, ``night`` +metrics The different measures that can be retrieved from the calls dataset. Note that the same metrics are available for both ``incoming`` and ``outgoing`` calls, while ``missed`` calls has its own set of metrics. See :ref:`Available Incoming and Outgoing Call Metrics ` Table and :ref:`Available Missed Call Metrics ` Table below. ============ =================== .. _available-in-and-out-call-metrics: -**Available Incoming and Outgoing Call Metrices** +**Available Incoming and Outgoing Call Metrics** The following table shows a list of the available metrics for ``incoming`` and ``outgoing`` calls. ========================= ========= ============= Name Units Description ========================= ========= ============= -count calls A count of the number of times that a particular ``call_type`` occured for a particular ``day_segment``. 
-distinctcontacts contacts A count of distinct contacts that were comunicated with for a particular ``call_type`` for a particular - ``day_segment`` +count calls A count of the number of times that a particular ``call_type`` occurred for a particular ``day_segment``. +distinctcontacts contacts A count of distinct contacts that were communicated with for a particular ``call_type`` for a particular ``day_segment`` meanduration minutes The mean duration of all calls for a particular ``call_type`` and ``day_segment``. sumduration minutes The sum of the duration of all calls for a particular ``call_type`` and ``day_segment``. minduration minutes The duration of the shortest call for a particular ``call_type`` and ``day_segment``. maxduration minutes The duration of the longest call for a particular ``call_type`` and ``day_segment``. stdduration minutes The standard deviation of all the calls for a particular ``call_type`` and ``day_segment``. modeduration minutes The mode duration of all the calls for a particular ``call_type`` and ``day_segment``. -hubermduration The generalized Huber M-estimator of location of the MAD for the durations of all the calls for a - particular ``call_type`` and ``day_segment``. -varqnduration The location-Free Scale Estimator Qn of the durations of all the calls for a particular ``call_type`` - and ``day_segment``. -entropyduration The estimates the Shannon entropy H of the durations of all the calls for a particular ``call_type`` - and ``day_segment``. -timefirstcall minutes The time in minutes from 12:00am (Midnight) that the first of ``call_type`` occured. -timelastcall minutes The time in minutes from 12:00am (Midnight) that the last of ``call_type`` occured. +hubermduration The generalized Huber M-estimator of location of the MAD for the durations of all the calls for a particular ``call_type`` and ``day_segment``. 
+varqnduration The Location-Free Scale Estimator Qn of the durations of all the calls for a particular ``call_type`` and ``day_segment``. +entropyduration The estimate of the Shannon entropy H of the durations of all the calls for a particular ``call_type`` and ``day_segment``. +timefirstcall minutes The time in minutes from 12:00am (Midnight) that the first of ``call_type`` occurred. +timelastcall minutes The time in minutes from 12:00am (Midnight) that the last of ``call_type`` occurred. countmostfrequentcontact calls The count of the number of calls of a particular ``call_type`` and ``day_segment`` for the most contacted contact. ========================= ========= ============= .. _available-missed-call-metrics: -**Available Missed Call Metrices** +**Available Missed Call Metrics** The following table shows a list of the available metrics for ``missed`` calls. ========================= ========= ============= Name Units Description ========================= ========= ============= -count calls A count of the number of times a ``missed`` call occured for a particular ``day_segment``. +count calls A count of the number of times a ``missed`` call occurred for a particular ``day_segment``. distinctcontacts contacts A count of distinct contacts whose calls were ``missed``. -timefirstcall minutes The time in minutes from 12:00am (Midnight) that the first ``missed`` call occured. -timelastcall minutes The time in minutes from 12:00am (Midnight) that the last ``missed`` call occured. +timefirstcall minutes The time in minutes from 12:00am (Midnight) that the first ``missed`` call occurred. +timelastcall minutes The time in minutes from 12:00am (Midnight) that the last ``missed`` call occurred. countmostfrequentcontact SMS The count of the number of ``missed`` calls for the contact with the most ``missed`` calls. ========================= ========= ============= -Assumptions/Observations: +**Assumptions/Observations:** #. ``TYPES`` and ``METRICS`` keys need to match. 
From example:: SMS: - TYPES : [missed] + TYPES: [missed] METRICS: missed: [count, distinctcontacts, timefirstsms, timelastsms, countmostfrequentcontact] -In the above config setting code the ``TYPE`` ``missed`` matches the ``METRICS`` key ``missed``. +In the above config setting code the ``TYPE`` ``missed`` matches the ``METRICS`` key ``missed``. .. _bluetooth-sensor-doc: @@ -269,24 +260,24 @@ See `Bluetooth Config Code`_ **Available Epochs:** - - daily - - morning - - afternoon - - evening - - night +- daily +- morning +- afternoon +- evening +- night **Available Platforms:** - - Android - - iOS - +- Android +- iOS **Snakefile Entry:** - - Download raw Bluetooth dataset: ``expand("data/raw/{pid}/{sensor}_raw.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` +.. - Download raw Bluetooth dataset: ``expand("data/raw/{pid}/{sensor}_raw.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` - - Download raw Bluetooth dataset with readable: ``expand("data/raw/{pid}/{sensor}_with_datetime.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` - - Extract Bluetooth Metrics +.. - Apply readable datetime to Bluetooth dataset: ``expand("data/raw/{pid}/{sensor}_with_datetime.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` + +- Extract Bluetooth Metrics | ``expand("data/processed/{pid}/bluetooth_{segment}.csv",`` | ``pid=config["PIDS"],`` @@ -294,17 +285,17 @@ See `Bluetooth Config Code`_ **Rule Chain:** - - ``rules/preprocessing.snakefile/download_dataset`` - See the download_dataset_ rule. +- **Rule:** ``rules/preprocessing.snakefile/download_dataset`` - See the download_dataset_ rule. - - ``src/data/download_dataset.R`` See the download_dataset.R_ script. - - - ``rules/preprocessing.snakefile/readable_datetime`` - See the readable_datetime_ rule. + - **Script:** ``src/data/download_dataset.R`` See the download_dataset.R_ script. - - ``src/data/readable_datetime.R`` See the readable_datetime.R_ script. 
+- **Rule:** ``rules/preprocessing.snakefile/readable_datetime`` - See the readable_datetime_ rule. - - ``rules/features.snakefile/bluetooth_metrics`` - See the bluetooth_metric_ rule. + - **Script:** ``src/data/readable_datetime.R`` See the readable_datetime.R_ script. - - ``src/features/bluetooth_metrics.R`` - See the bluetooth_metrics.R_ script. +- **Rule:** ``rules/features.snakefile/bluetooth_metrics`` - See the bluetooth_metric_ rule. + + - **Script:** ``src/features/bluetooth_metrics.R`` - See the bluetooth_metrics.R_ script. .. _bluetooth-parameters: @@ -314,9 +305,8 @@ See `Bluetooth Config Code`_ ============ =================== Name Description ============ =================== -day_segment The particular ``day_segment`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, - ``evening``, ``night`` -metrics The different measures that can be retrieved from the Bluetooth dataset. See :ref:`Available Bluetooth Metrices ` Table below +day_segment The particular ``day_segments`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, ``evening``, ``night`` +metrics The different measures that can be retrieved from the Bluetooth dataset. See :ref:`Available Bluetooth Metrics ` Table below ============ =================== .. _bluetooth-available-metrics: @@ -333,176 +323,807 @@ uniquedevices devices Unique devices (number of unique dev countscansmostuniquedevice scans Count of scans of the most unique device across each participant’s dataset =========================== ========= ============= -Assumptions/Observations: N/A +**Assumptions/Observations:** N/A -.. _accelerometer: +.. 
_accelerometer-sensor-doc: Accelerometer --------------- +"""""""""""""" -Available epochs: daily, morning, afternoon, evening, and night +See `Accelerometer Config Code`_ -- Max magnitude: maximum magnitude of acceleration (:math:`\|acceleration\| = \sqrt{x^2 + y^2 + z^2}`) -- Min magnitude: minimum magnitude of acceleration -- Avg magnitude: average magnitude of acceleration -- Median magnitude: median magnitude of acceleration -- Std magnitude: standard deviation of acceleration -- Ratio exertional activity episodes: ratio of exertional activity time periods to total time periods -- Sum exertional activity episodes: total minutes of performing exertional activity during the epoch -- Longest exertional activity episode: longest episode of performing exertional activity -- Longest non-exertional activity episode: longest episode of performing non-exertional activity -- Count exertional activity episodes: count of the episods of performing exertional activity -- Count non-exertional activity episodes: count of the episodes of performing non-exertional activity +**Available epochs:** -.. _applications_foreground: +- daily +- morning +- afternoon +- evening +- night -Applications_foreground -------------------------- +**Available platforms:** -Available epochs: daily, morning, afternoon, evening, and night +- Android +- iOS -- Count: number of times using all_apps/single_app/single_category/multiple_category -- Time of first use: time of first use all_apps/single_app/single_category/multiple_category in minutes -- Time of last use: time of last use all_apps/single_app/single_category/multiple_category in minutes -- Frenquency entropy: the entropy of the apps frequency for all_apps/single_app/single_category/multiple_category. There is no entropy for single_app. +**Snakefile entry:** -.. _battery: +.. - Download raw Accelerometer dataset: ``expand("data/raw/{pid}/{sensor}_raw.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` + +.. 
- Apply readable datetime to Accelerometer dataset: ``expand("data/raw/{pid}/{sensor}_with_datetime.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` + +- Extract Accelerometer Metrics: + + | ``expand("data/processed/{pid}/accelerometer_{day_segment}.csv",`` + | ``pid=config["PIDS"],`` + | ``day_segment = config["ACCELEROMETER"]["DAY_SEGMENTS"]),`` + +**Rule chain:** + +- **Rule:** ``rules/preprocessing.snakefile/download_dataset`` - See the download_dataset_ rule. + + - **Script:** ``src/data/download_dataset.R`` - See the download_dataset.R_ script. + +- **Rule:** ``rules/preprocessing.snakefile/readable_datetime`` - See the readable_datetime_ rule. + + - **Script:** ``src/data/readable_datetime.R`` - See the readable_datetime.R_ script. + +- **Rule:** ``rules/features.snakefile/accelerometer_metrics`` - See the accelerometer_metrics_ rule. + + - **Script:** ``src/features/accelerometer_metrics.py`` - See the accelerometer_metrics.py_ script. + + +.. _Accelerometer-parameters: + +**Accelerometer Rule Parameters:** + +============ =================== +Name Description +============ =================== +day_segment The particular ``day_segments`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, ``evening``, ``night`` +metrics The different measures that can be retrieved from the dataset. See :ref:`Available Accelerometer Metrics ` Table below +============ =================== + +.. _accelerometer-available-metrics: + +**Available Accelerometer Metrics** + +The following table shows a list of the available metrics for the accelerometer sensor data for a particular ``day_segment``. + +==================================== ============== ============= +Name Units Description +==================================== ============== ============= +maxmagnitude m/s\ :sup:`2` The maximum magnitude of acceleration (:math:`\|acceleration\| = \sqrt{x^2 + y^2 + z^2}`). +minmagnitude m/s\ :sup:`2` The minimum magnitude of acceleration.
+avgmagnitude m/s\ :sup:`2` The average magnitude of acceleration. +medianmagnitude m/s\ :sup:`2` The median magnitude of acceleration. +stdmagnitude m/s\ :sup:`2` The standard deviation of the magnitude of acceleration. +ratioexertionalactivityepisodes The ratio of exertional activity time periods to total time periods. +sumexertionalactivityepisodes minutes The total minutes of performing exertional activity during the epoch +longestexertionalactivityepisode minutes The longest episode of performing exertional activity +longestnonexertionalactivityepisode minutes The longest episode of performing non-exertional activity +countexertionalactivityepisodes episodes The count of the episodes of performing exertional activity +countnonexertionalactivityepisodes episodes The count of the episodes of performing non-exertional activity +==================================== ============== ============= + +**Assumptions/Observations:** N/A + + + +.. _applications-foreground-sensor-doc: + +Applications Foreground +"""""""""""""""""""""""" + +See `Applications Foreground Config Code`_ + +**Available Epochs:** + +- daily +- morning +- afternoon +- evening +- night + +**Available Platforms:** + +- Android +- iOS + +**Snakefile entry:** + +.. - Download raw Applications Foreground dataset: ``expand("data/raw/{pid}/{sensor}_raw.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` + +.. - Apply readable datetime to Applications Foreground dataset: ``expand("data/raw/{pid}/{sensor}_with_datetime.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` + +..
- Genre categorization of Applications Foreground dataset: ``expand("data/interim/{pid}/applications_foreground_with_datetime_with_genre.csv", pid=config["PIDS"]),`` + +- Extract Applications Foreground Metrics: + + | ``expand("data/processed/{pid}/applications_foreground_{day_segment}.csv",`` + | ``pid=config["PIDS"],`` + | ``day_segment = config["APPLICATIONS_FOREGROUND"]["DAY_SEGMENTS"]),`` + +**Rule Chain:** + +- **Rule:** ``rules/preprocessing.snakefile/download_dataset`` - See the download_dataset_ rule. + + - **Script:** ``src/data/download_dataset.R`` - See the download_dataset.R_ script. + +- **Rule:** ``rules/preprocessing.snakefile/readable_datetime`` - See the readable_datetime_ rule. + + - **Script:** ``src/data/readable_datetime.R`` - See the readable_datetime.R_ script. + +- **Rule:** ``rules/preprocessing.snakefile/application_genres`` - See the application_genres_ rule. + + - **Script:** ``../src/data/application_genres.R`` - See the application_genres.R_ script. + +- **Rule:** ``rules/features.snakefile/applications_foreground_metrics`` - See the applications_foreground_metrics_ rule. + + - **Script:** ``src/features/applications_foreground_metrics.py`` - See the applications_foreground_metrics.py_ script. + +.. _applications-foreground-parameters: + +**Applications Foreground Rule Parameters:** + +==================== =================== +Name Description +==================== =================== +day_segment The particular ``day_segments`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, ``evening``, ``night`` +single_categories A single category of apps that will be included for the data collection. The available categories can be defined in the ``APPLICATION_GENRES`` in the ``config`` file. See :ref:`Assumptions and Observations `. +multiple_categories Categories of apps that will be included for the data collection.
The available categories can be defined in the ``APPLICATION_GENRES`` in the ``config`` file. See :ref:`Assumptions and Observations `. +single_apps Any Android app can be included in the list of apps used to collect data by adding the package name to this list. (e.g. YouTube) +excluded_categories Categories of apps that will be excluded for the data collection. The available categories can be defined in the ``APPLICATION_GENRES`` in the ``config`` file. See :ref:`Assumptions and Observations `. +excluded_apps Any Android app can be excluded from the list of apps used to collect data by adding the package name to this list. +metrics The different measures that can be retrieved from the dataset. See :ref:`Available Applications Foreground Metrics ` Table below +==================== =================== + +.. _applications-foreground-available-metrics: + +**Available Applications Foreground Metrics** + +The following table shows a list of the available metrics for the Applications Foreground dataset. + +================== ========= ============= +Name Units Description +================== ========= ============= +count apps A count of the number of times using ``all_apps``, ``single_app``, ``single_category`` apps or ``multiple_category`` apps. +timeoffirstuse minutes The time in minutes from 12:00am (Midnight) to the first use of any app (i.e. ``all_apps``), ``single_app``, ``single_category`` apps or ``multiple_category`` apps. +timeoflastuse minutes The time in minutes from 12:00am (Midnight) to the last use of any app (i.e. ``all_apps``), ``single_app``, ``single_category`` apps or ``multiple_category`` apps. +frequencyentropy shannons The entropy of the apps frequency for ``all_apps``, ``single_category`` apps or ``multiple_category`` apps. There is no entropy for ``single_app`` apps. +================== ========= ============= + +..
_applications-foreground-observations: + +**Assumptions/Observations:** + +The ``APPLICATION_GENRES`` configuration setting (see `Application Genres Config`_) defines the catalogue of app categories that is available to the pipeline. ``CATALOGUE_SOURCE`` defines the source of the catalogue, which can be ``FILE``, i.e. a custom file like the one provided with this project (see `Custom Catalogue File`_), or ``GOOGLE``, which uses the category classifications provided by Google. ``CATALOGUE_FILE`` defines the path to the custom file that contains the app catalogue. If ``CATALOGUE_SOURCE`` is ``FILE``, ``UPDATE_CATALOGUE_FILE`` specifies (``TRUE`` or ``FALSE``) whether or not to update ``CATALOGUE_FILE``; if ``CATALOGUE_SOURCE`` is ``GOOGLE``, all scraped genres are saved to ``CATALOGUE_FILE``. ``SCRAPE_MISSING_GENRES`` (``TRUE`` or ``FALSE``) specifies whether or not to scrape missing genres and only takes effect if ``CATALOGUE_SOURCE`` is ``FILE``; if ``CATALOGUE_SOURCE`` is ``GOOGLE``, all genres are scraped anyway. Note that the ``top1global`` option finds and uses the app that the participant used most during the study. + + + +.. _battery-sensor-doc: Battery --------- +""""""""" -Available epochs: daily, morning, afternoon, evening, and night +See `Battery Config Code`_ -- Count discharge: number of battery discharging episodes -- Sum duration discharge: total duration of all discharging episodes (time the phone was discharging) -- Average consumption rate: average of the ratios between discharging episodes’ battery delta and duration -- Max consumption rate: max of the ratios between discharging episodes’ battery delta and duration -- Count charge: number of battery charging episodes -- Sum duration charge: total duration of all charging episodes (time the phone was charging) +**Available Epochs:** -..
_google-activity-recognition: +- daily +- morning +- afternoon +- evening +- night + +**Available Platforms:** + +- Android +- iOS + +**Snakefile entry:** + +.. - Download raw Battery dataset: ``expand("data/raw/{pid}/{sensor}_raw.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` + +.. - Apply readable datetime to Battery dataset: ``expand("data/raw/{pid}/{sensor}_with_datetime.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` + +.. - Extract the deltas in Battery charge: ``expand("data/processed/{pid}/battery_deltas.csv", pid=config["PIDS"]),`` + +- Extract Battery Metrics: + + | ``expand("data/processed/{pid}/battery_{day_segment}.csv",`` + | ``pid=config["PIDS"],`` + | ``day_segment = config["BATTERY"]["DAY_SEGMENTS"]),`` + +**Rule Chain:** + +- **Rule:** ``rules/preprocessing.snakefile/download_dataset`` - See the download_dataset_ rule. + + - **Script:** ``src/data/download_dataset.R`` - See the download_dataset.R_ script. + +- **Rule:** ``rules/preprocessing.snakefile/readable_datetime`` - See the readable_datetime_ rule. + + - **Script:** ``src/data/readable_datetime.R`` - See the readable_datetime.R_ script. + +- **Rule:** ``rules/features.snakefile/battery_deltas`` - See the battery_deltas_ rule. + + - **Script:** ``src/features/battery_deltas.R`` - See the battery_deltas.R_ script. + +- **Rule:** ``rules/features.snakefile/battery_metrics`` - See the battery_metrics_ rule. + + - **Script:** ``src/features/battery_metrics.py`` - See the battery_metrics.py_ script. + +.. _battery-parameters: + +**Battery Rule Parameters:** + +============ =================== +Name Description +============ =================== +day_segment The particular ``day_segments`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, ``evening``, ``night`` +metrics The different measures that can be retrieved from the Battery dataset. See :ref:`Available Battery Metrics ` Table below +============ =================== + +.. 
_battery-available-metrics: + +**Available Battery Metrics** + +The following table shows a list of the available metrics for Battery data. + +===================== =============== ============= +Name Units Description +===================== =============== ============= +countdischarge episodes A count of the number of battery discharging episodes +sumdurationdischarge hours The total duration of all discharging episodes (time the phone was discharging) +countcharge episodes A count of the number of battery charging episodes +sumdurationcharge hours The total duration of all charging episodes (time the phone was charging) +avgconsumptionrate episodes/hours The average of the ratios between discharging episodes’ battery delta and duration +maxconsumptionrate episodes/hours The maximum of the ratios between discharging episodes’ battery delta and duration +===================== =============== ============= + +**Assumptions/Observations:** + + +.. _google-activity-recognition-sensor-doc: Google Activity Recognition ---------------------------- +"""""""""""""""""""""""""""" -Available epochs: daily, morning, afternoon, evening, and night +See `Google Activity Recognition Config Code`_ -- Count (number of rows) -- Most common activity -- Number of unique activities -- Activity change count (any transition between two different activities, sitting to running for example) -- Sum stationary: total duration of episodes of still and tilting (phone) activities -- Sum mobile: total duration of episodes of on foot, running, and on bicycle activities -- Sum vehicle: total duration of episodes of on vehicle activity +**Available Epochs:** -.. _light: +- daily +- morning +- afternoon +- evening +- night + +**Available Platforms:** + +- Android + +**Snakefile entry:** + +.. - Download raw Google Activity Recognition dataset: ``expand("data/raw/{pid}/{sensor}_raw.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` + +.. 
- Apply readable datetime to Google Activity Recognition dataset: ``expand("data/raw/{pid}/{sensor}_with_datetime.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` + +.. - Extract the deltas in Google Activity Recognition dataset: ``expand("data/processed/{pid}/plugin_google_activity_recognition_deltas.csv", pid=config["PIDS"]),`` + +- Extract Sensor Metrics: + + | ``expand("data/processed/{pid}/google_activity_recognition_{segment}.csv",pid=config["PIDS"],`` + | ``segment = config["GOOGLE_ACTIVITY_RECOGNITION"]["DAY_SEGMENTS"]),`` + +**Rule Chain:** + +- **Rule:** ``rules/preprocessing.snakefile/download_dataset`` - See the download_dataset_ rule. + + - **Script:** ``src/data/download_dataset.R`` - See the download_dataset.R_ script. + +- **Rule:** ``rules/preprocessing.snakefile/readable_datetime`` - See the readable_datetime_ rule. + + - **Script:** ``src/data/readable_datetime.R`` - See the readable_datetime.R_ script. + +- **Rule:** ``rules/features.snakefile/google_activity_recognition_deltas`` - See the google_activity_recognition_deltas_ rule. + + - **Script:** ``src/features/google_activity_recognition_deltas.R`` - See the google_activity_recognition_deltas.R_ script. + +- **Rule:** ``rules/features.snakefile/activity_metrics`` - See the activity_metrics_ rule. + + - **Script:** ``src/features/google_activity_recognition.py`` - See the google_activity_recognition.py_ script. + +.. _google-activity-recognition-parameters: + +**Google Activity Recognition Rule Parameters:** + +============ =================== +Name Description +============ =================== +day_segment The particular ``day_segments`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, ``evening``, ``night`` +metrics The different measures that can be retrieved from the Google Activity Recognition dataset. See :ref:`Available Google Activity Recognition Metrics ` Table below +============ =================== + +.. 
_google-activity-recognition-available-metrics: + +**Available Google Activity Recognition Metrics** + +The following table shows a list of the available metrics for the Google Activity Recognition dataset. + +====================== ============ ============= +Name Units Description +====================== ============ ============= +count rows A count of the number of rows of registered activities. +mostcommonactivity The most common activity. +countuniqueactivities activities A count of the number of unique activities. +activitychangecount transitions A count of any transition between two different activities, sitting to running for example. +sumstationary minutes The total duration of episodes of still and tilting (phone) activities. +summobile minutes The total duration of episodes of on foot, running, and on bicycle activities +sumvehicle minutes The total duration of episodes of on vehicle activity +====================== ============ ============= + +**Assumptions/Observations:** N/A + +.. _light-doc: Light ------ +""""""" -Available epochs: daily, morning, afternoon, evening, and night +See `Light Config Code`_ -- Count (number of rows) -- Max lux: maximum ambient luminance in lux units -- Min lux: minimum ambient luminance in lux units -- Avg lux: average ambient luminance in lux units -- median lux: median ambient luminance in lux units -- Std lux: standard deviation of ambient luminance in lux units +**Available Epochs:** -.. _location-features: + - daily + - morning + - afternoon + - evening + - night + +**Available Platforms:** + + - Android + +**Snakefile entry:** + +.. - Download raw Sensor dataset: ``expand("data/raw/{pid}/{sensor}_raw.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` + +.. 
- Apply readable datetime to Sensor dataset: ``expand("data/raw/{pid}/{sensor}_with_datetime.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` + +- Extract Light Metrics: + + | ``expand("data/processed/{pid}/light_{day_segment}.csv",`` + | ``pid=config["PIDS"],`` + | ``day_segment = config["LIGHT"]["DAY_SEGMENTS"]),`` + +**Rule Chain:** + +- **Rule:** ``rules/preprocessing.snakefile/download_dataset`` - See the download_dataset_ rule. + + - **Script:** ``src/data/download_dataset.R`` - See the download_dataset.R_ script. + +- **Rule:** ``rules/preprocessing.snakefile/readable_datetime`` - See the readable_datetime_ rule. + + - **Script:** ``src/data/readable_datetime.R`` - See the readable_datetime.R_ script. + +- **Rule:** ``rules/features.snakefile/light_metrics`` - See the light_metrics_ rule. + + - **Script:** ``src/features/light_metrics.py`` - See the light_metrics.py_ script. + +.. _light-parameters: + +**Light Rule Parameters:** + +============ =================== +Name Description +============ =================== +day_segment The particular ``day_segments`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, ``evening``, ``night`` +metrics The different measures that can be retrieved from the Light dataset. See :ref:`Available Light Metrics ` Table below +============ =================== + +.. _light-available-metrics: + +**Available Light Metrics** + +The following table shows a list of the available metrics for the Light dataset. + +=========== ========= ============= +Name Units Description +=========== ========= ============= +count rows A count of the number of rows that the light sensor recorded. 
+maxlux lux The maximum ambient luminance in lux units +minlux lux The minimum ambient luminance in lux units +avglux lux The average ambient luminance in lux units +medianlux lux The median ambient luminance in lux units +stdlux lux The standard deviation of ambient luminance in lux units +=========== ========= ============= + +**Assumptions/Observations:** N/A + + +.. _location-sensor-doc: Location (Barnett’s) Features ------------------------------ +"""""""""""""""""""""""""""""" +Barnett’s location features are based on the concept of flights and pauses. GPS coordinates are converted into a +sequence of flights (straight line movements) and pauses (time spent stationary). Data is imputed before metrics +are computed (https://arxiv.org/abs/1606.06328) -Available epochs: daily +See `Location (Barnett’s) Config Code`_ -Barnett’s location features are based on the concept of flights and pauses. GPS coordinates are converted into a sequence of flights (straight line movements) and pauses (time spent stationary). Data is imputed before metrics are computed (https://arxiv.org/abs/1606.06328) +**Available Epochs:** -- Time at home. Time spent at home in minutes. Home is the most visited significant location between 8 pm and 8 am including any pauses within a 200-meter radius. -- Max home distance. Maximum distance from home in meters. -- Pause probability. The fraction of a day spent in a pause (as opposed to a flight) -- Circadian routine. A continuous metric that can take any value between 0 and 1, where 0 represents a daily routine completely different from any other sensed days and 1 a routine the same as every other sensed day. -- Wkn circadian routine. Same as Circadian routine but computed separately for weekends and weekdays. -- Distance travelled. Total distance travelled over a day. -- Radius of Gyration (RoG). It is a measure in meters of the area covered by a person over a day. 
A centroid is calculated for all the places (pauses) visited during a day and a weighted distance between all places and the centroid is computed. The weights are proportional to the time spent in each place. -- Maximum diameter. Largest distance in meters between any two pauses. -- Avg flight duration. Mean duration of all flights. -- Avg flight length. Mean length of all flights -- Std flight duration. The standard deviation of the duration of all flights. -- Std flight length. The standard deviation of the length of all flights. -- Significant locations. The number of significant locations visited during the day. Significant locations are computed using k-means clustering over pauses found in the whole monitoring period. The number of clusters is found iterating from 1 to 200 stopping until the centroids of two significant locations are within 400 meters of one another. -- Significant location entropy. Entropy measurement based on the proportion of time spent at each significant location visited during a day. + - daily -.. _screen: +**Available Platforms:** + + - Android + - iOS + +**Snakefile entry:** + +.. - Download raw Sensor dataset: ``expand("data/raw/{pid}/{sensor}_raw.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` + +.. - Apply readable dateime to Sensor dataset: ``expand("data/raw/{pid}/{sensor}_with_datetime.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` + +- Extract Sensor Metrics: ``expand("data/processed/{pid}/location_barnett.csv", pid=config["PIDS"]),`` + +**Rule Chain:** + +- **Rule:** ``rules/preprocessing.snakefile/download_dataset`` - See the download_dataset_ rule. + + - **Script:** ``src/data/download_dataset.R`` - See the download_dataset.R_ script. + +- **Rule:** ``rules/preprocessing.snakefile/readable_datetime`` - See the readable_datetime_ rule. + + - **Script:** ``src/data/readable_datetime.R`` - See the readable_datetime.R_ script. 
+ +- **Rule:** ``rules/preprocessing.snakefile/phone_sensed_bins`` - See the phone_sensed_bins_ rule. + + - **Script:** ``src/data/phone_sensed_bins.R`` - See the phone_sensed_bins.R_ script. + +- **Rule:** ``rules/preprocessing.snakefile/resample_fused_location`` - See the resample_fused_location_ rule. + + - **Script:** ``src/data/resample_fused_location.R`` - See the resample_fused_location.R_ script. + +- **Rule:** ``rules/features.snakefile/location_barnett_metrics`` - See the location_barnett_metrics_ rule. + + - **Script:** ``src/features/location_barnett_metrics.R`` - See the location_barnett_metrics.R_ script. + + +.. _location-parameters: + +**Location Rule Parameters:** + +================= =================== +Name Description +================= =================== +location_to_use Specifies which location data will be used in the analysis. Possible options are ``ALL``, ``ALL_EXCEPT_FUSED`` or ``RESAMPLE_FUSED`` +accuracy_limit This is in meters. The sensor drops location coordinates with an accuracy higher than this. This number means there's a 68% probability the true location is within the specified radius. +timezone The timezone used to calculate location. +metrics The different measures that can be retrieved from the Location dataset. See :ref:`Available Location Metrics ` Table below +================= =================== + +.. _location-available-metrics: + +**Available Location Metrics** + +The following table shows a list of the available metrics for the Location dataset. + +================ ========= ============= +Name Units Description +================ ========= ============= +hometime minutes Time at home. Time spent at home in minutes. Home is the most visited significant location between 8 pm and 8 am including any pauses within a 200-meter radius. +disttravelled meters Distance travelled. This is the total distance travelled over a day. +rog meters The Radius of Gyration (RoG). 
It is a measure in meters of the area covered by a person over a day. A centroid is calculated for all the places (pauses) visited during a day and a weighted distance between all the places and the centroid is computed. The weights are proportional to the time spent in each place. +maxdiam meters The Maximum diameter. The largest distance in meters between any two pauses. +maxhomedist meters Max home distance. The maximum distance from home in meters. +siglocsvisited locations Significant locations. The number of significant locations visited during the day. Significant locations are computed using k-means clustering over pauses found in the whole monitoring period. The number of clusters is found iterating from 1 to 200 stopping until the centroids of two significant locations are within 400 meters of one another. +avgflightlen meters Avg flight length. Mean length of all flights +stdflightlen meters Std flight length. The standard deviation of the length of all flights. +avgflightdur meters Avg flight duration. Mean duration of all flights. +stdflightdur meters Std flight duration. The standard deviation of the duration of all flights. +probpause Pause probability. The fraction of a day spent in a pause (as opposed to a flight) +siglocentropy Significant location entropy. Entropy measurement based on the proportion of time spent at each significant location visited during a day. +minsmissing +circdnrtn Circadian routine. A continuous metric that can take any value between 0 and 1, where 0 represents a daily routine completely different from any other sensed days and 1 a routine the same as every other sensed day. +wkenddayrtn Weekend circadian routine. Same as Circadian routine but computed separately for weekends and weekdays. +================ ========= ============= + +**Assumptions/Observations:** + +*Significant Locations Identified* + +(i.e. 
The clustering method used) +Significant locations are determined using K-means clustering on the locations that a patient visits over the course of the data collection period. Clustering is repeated with K=K+1 until two significant locations are within 100 meters of one another; the results from the previous step (K-1) are then used as the total number of significant locations. See `Beiwe Summary Statistics`_. + +*Definition of Stationarity* + +(i.e., The length of time a person has to be not moving to qualify) +This is based on a Pause-Flight model. The parameters used are a minimum pause duration of 300sec and a minimum pause distance of 60m. See the `Pause-Flight Model`_. + +*The Circadian Calculation* + +For a detailed description of how this measure is calculated, see Canzian and Musolesi's 2015 paper in the Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing, titled "Trajectories of depression: unobtrusive monitoring of depressive states by means of smartphone mobility traces analysis." Their procedure was followed using 30-min increments as a bin size. See `Beiwe Summary Statistics`_. + +.. _screen-sensor-doc: Screen ------- +"""""""" -Available epochs: daily, morning, afternoon, evening, and night +See `Screen Config Code`_ -Notes. An unlock episode is considered as the time between an unlock event and a lock event. iOS recorded these episodes reliable (albeit duplicated lock events within milliseconds from each other). However, in Android there are multiple consecutive unlock/lock events so we keep the closest pair. 
This happens because ACTION_SCREEN_OFF and ON are "sent when the device becomes non-interactive which may have nothing to do with the screen turning off" see this link +**Available Epochs:** -- Count on: count of screen on events (only available for Android) -- Count unlock: count of screen unlock events -- Diff count on off: For debug purposes, on and off events should come in pairs, difference should be close to zero then. -- Diff count unlock lock, For debug purposes, unlock and lock events should come in pairs, difference should be close to zero then. -- Sum duration unlock: sum duration of unlock episodes -- Max duration unlock: maximum duration of unlock episodes -- Min duration unlock: minimum duration of unlock episodes -- Average duration unlock: average duration of unlock episodes -- Std duration unlock: standard deviation of the duration of unlock episodes + - daily + - morning + - afternoon + - evening + - night -.. _fitbit-heart-rate: +**Available Platforms:** -Fitbit: heart rate ------------------- + - Android + - iOS -Available epochs: daily, morning, afternoon, evening, and night +**Snakefile entry:** -Notes. eart rate zones contain 4 zones: out_of_range zone, fat_burn zone, cardio zone, and peak zone. Please refer to the [Fitbit documentation](https://help.fitbit.com/articles/en_US/Help_article/1565) for the detailed informations of how to define those zones. +.. 
- Download raw Screen dataset: ``expand("data/raw/{pid}/{sensor}_raw.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` + - Apply readable dateime to Screen dataset: ``expand("data/raw/{pid}/{sensor}_with_datetime.csv", pid=config["PIDS"], sensor=config["SENSORS"]),`` + - Extract the deltas from the Screen dataset: expand("data/processed/{pid}/screen_deltas.csv", pid=config["PIDS"]), + +- Extract Screen Metrics: + + | ``expand("data/processed/{pid}/screen_{day_segment}.csv",`` + | ``pid=config["PIDS"],`` + | ``day_segment = config["SCREEN"]["DAY_SEGMENTS"]),`` + +**Rule Chain:** -- Max hr: maximum heart rate -- Min hr: minimum heart rate -- Avg hr: average heart rate -- Median hr: median heart rate -- Mode hr: mode heart rate -- Std hr: standard deviation of heart rate -- Diff max mode hr: maximum heart rate minus mode heart rate -- Diff min mode hr: mode heart rate minus minimum heart rate -- Entropy hr: entropy of heart rate -- Length out of range: duration of heart rate in out_of_range zone in minute -- Length fat burn: duration of heart rate in fat_burn zone in minute -- Length cardio: duration of heart rate in cardio zone in minute -- Length peak: duration of heart rate in peak zone in minute +- **Rule:** ``rules/preprocessing.snakefile/download_dataset`` - See the download_dataset_ rule. -.. _fitbit-steps: + - **Script:** ``src/data/download_dataset.R`` - See the download_dataset.R_ script. -Fitbit: steps -------------- +- **Rule:** ``rules/preprocessing.snakefile/readable_datetime`` - See the readable_datetime_ rule. -Available epochs: daily, morning, afternoon, evening, and night + - **Script:** ``src/data/readable_datetime.R`` - See the readable_datetime.R_ script. -Notes. If the step count per minute smaller than the THRESHOLD_ACTIVE_BOUT (default value is 10), it is defined as sedentary status. Otherwise, it is defined as active status. One active/sedentary bout is a period during with the user is under active/sedentary status. 
+- **Rule:** ``rules/features.snakefile/screen_deltas`` - See the screen_deltas_ rule. -- Sum all steps: total step count -- Max all steps: maximum step count -- Min all steps: minimum step count -- Avg all steps: average step count -- Std all steps: standard deviation of step count -- Count sedentary bout: count of sedentary bouts -- Max duration sedentary bout: maximum duration of sedentary bouts -- Min duration sedentary bout: minimum duration of sedentary bouts -- Avg duration sedentary bout: average duration of sedentary bouts -- Std duration sedentary bout: standard deviation of the duration of sedentary bouts -- Count active bout: count of active bouts -- Max duration active bout: maximum duration of active bouts -- Min duration active bout: minimum duration of active bouts -- Avg duration active bout: average duration of active bouts -- Std duration active bout: standard deviation of the duration of active bouts + - **Script:** ``src/features/screen_deltas.R`` - See the screen_deltas.R_ script. +- **Rule:** ``rules/features.snakefile/screen_metrics`` - See the screen_metrics_ rule. + - **Script:** ``src/features/screen_metrics.py`` - See the screen_metrics.py_ script. + +.. _screen-parameters: + +**Screen Rule Parameters:** + +=============== =================== +Name Description +=============== =================== +day_segment The particular ``day_segments`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, ``evening``, ``night`` +metrics_events The different measures that can be retrieved from the events in the Screen dataset. See :ref:`Available Screen Events Metrics ` Table below +metrics_deltas The different measures that can be retrieved from the episodes extracted from the Screen dataset. See :ref:`Available Screen Episodes Metrics ` Table below +episodes The action that defines an episode +=============== =================== + +.. _screen-events-available-metrics: + +.. 
+ **Available Screen Events Metrics** + The following table shows a list of the available metrics for Screen Events. + ================= ============== ============= + Name Units Description + ================= ============== ============= + counton `ON` events Count on: A count of screen `ON` events (only available for Android) + countunlock Unlock events Count unlock: A count of screen unlock events. + unlocksperminute Unlock events Unlock events per minute: The average of the number of unlock events that occur in a minute + ================= ============== ============= + +.. _screen-episodes-available-metrics: + +**Available Screen Episodes Metrics** + +The following table shows a list of the available metrics for Screen Episodes. + +============= ========= ============= +Name Units Description +============= ========= ============= +sumduration seconds Sum duration unlock: The total duration of unlock episodes +maxduration seconds Max duration unlock: The maximum duration of unlock episodes +minduration seconds Min duration unlock: The minimum duration of unlock episodes +avgduration seconds Average duration unlock: The average duration of unlock episodes +stdduration seconds Std duration unlock: The standard deviation of the duration of unlock episodes +============= ========= ============= + +**Assumptions/Observations:** + +An ``unlock`` episode is the time between an ``unlock`` event and a ``lock`` event. iOS recorded these episodes reliably (albeit with some duplicated ``lock`` events within milliseconds of each other). However, in Android there are some events unrelated to the screen state because of multiple consecutive ``unlock``/``lock`` events, so we keep the closest pair. In the experiments these are less than 10% of the screen events collected. This happens because ``ACTION_SCREEN_OFF`` and ``ON`` are "sent when the device becomes non-interactive which may have nothing to do with the screen turning off". 
Additionally, in Android it is possible to measure the time spent on the ``lock`` screen up to the ``unlock`` event and the total screen time (i.e. from ``ON`` to ``OFF`` events), but we are only keeping ``unlock`` episodes (``unlock`` to ``OFF``) to be consistent with iOS. + +.. _fitbit-heart-rate-sensor-doc: + +Fitbit: Heart Rate +""""""""""""""""""" + +See `Fitbit: Heart Rate Config Code`_ + +**Available Epochs:** + + - daily + - morning + - afternoon + - evening + - night + +**Available Platforms:** + + - Fitbit + +**Snakefile entry:** + +.. - Download raw Fitbit: Heart Rate dataset: ``expand("data/raw/{pid}/{sensor}_raw.csv", pid=config["PIDS"], sensor=config["FITBIT_TABLE"]),`` + +.. - Apply readable datetime to Fitbit: Heart Rate dataset: + +.. + | ``expand("data/raw/{pid}/fitbit_{fitbit_sensor}_with_datetime.csv",`` + | ``pid=config["PIDS"],`` + | ``fitbit_sensor=config["FITBIT_SENSORS"]),`` + +- Extract Sensor Metrics: + + | ``expand("data/processed/{pid}/fitbit_heartrate_{day_segment}.csv",`` + | ``pid=config["PIDS"],`` + | ``day_segment = config["HEARTRATE"]["DAY_SEGMENTS"]),`` + +**Rule Chain:** + +- **Rule:** ``rules/preprocessing.snakefile/download_dataset`` - See the download_dataset_ rule. + + - **Script:** ``src/data/download_dataset.R`` - See the download_dataset.R_ script. + +- **Rule:** ``rules/preprocessing.snakefile/fitbit_with_datetime`` - See the fitbit_with_datetime_ rule. + + - **Script:** ``src/data/fitbit_readable_datetime.py`` - See the fitbit_readable_datetime.py_ script. + +- **Rule:** ``rules/features.snakefile/fitbit_heartrate_metrics`` - See the fitbit_heartrate_metrics_ rule. + + - **Script:** ``src/features/fitbit_heartrate_metrics.py`` - See the fitbit_heartrate_metrics.py_ script. + + +.. _fitbit-heart-rate-parameters: + +**Fitbit: Heart Rate Rule Parameters:** + +============ =================== +Name Description +============ =================== +day_segment The particular ``day_segments`` that will be analyzed. 
The available options are ``daily``, ``morning``, ``afternoon``, ``evening``, ``night`` +metrics The different measures that can be retrieved from the Fitbit: Heart Rate dataset. + See :ref:`Available Fitbit: Heart Rate Metrics ` Table below +============ =================== + +.. _fitbit-heart-rate-available-metrics: + +**Available Fitbit: Heart Rate Metrics** + +The following table shows a list of the available metrics for the Fitbit: Heart Rate dataset. + +================== =========== ============= +Name Units Description +================== =========== ============= +maxhr beats/min The maximum heart rate. +minhr beats/min The minimum heart rate. +avghr beats/min The average heart rate. +medianhr beats/min The median heart rate. +modehr beats/min The mode heart rate. +stdhr beats/min The standard deviation of heart rate. +diffmaxmodehr beats/min Diff max mode heart rate: The maximum heart rate minus mode heart rate. +diffminmodehr beats/min Diff min mode heart rate: The mode heart rate minus minimum heart rate. +entropyhr Entropy heart rate: The entropy of heart rate. +lengthoutofrange minutes Length out of range: The duration of time the heart rate is in the ``out_of_range`` zone in minutes. +lengthfatburn minutes Length fat burn: The duration of time the heart rate is in the ``fat_burn`` zone in minutes. +lengthcardio minutes Length cardio: The duration of time the heart rate is in the ``cardio`` zone in minutes. +lengthpeak minutes Length peak: The duration of time the heart rate is in the ``peak`` zone in minutes. +================== =========== ============= + +**Assumptions/Observations:** There are 4 heart rate zones: ``out_of_range`` zone, ``fat_burn`` zone, ``cardio`` zone, and ``peak`` zone. Please refer to the `Fitbit documentation`_ for detailed information on how those zones are defined. + +.. 
_fitbit-steps-sensor-doc: + +Fitbit: Steps +""""""""""""""" + +See `Fitbit: Steps Config Code`_ + +**Available Epochs:** + + - daily + - morning + - afternoon + - evening + - night + +**Available Platforms:** + + - Fitbit + +**Snakefile entry:** + +.. - Download raw Fitbit: Steps dataset: ``expand("data/raw/{pid}/{sensor}_raw.csv", pid=config["PIDS"], sensor=config["FITBIT_TABLE"]),`` + +.. + - Apply readable datetime to Fitbit: Steps dataset: + | ``expand("data/raw/{pid}/fitbit_{fitbit_sensor}_with_datetime.csv",`` + | ``pid=config["PIDS"],`` + | ``fitbit_sensor=config["FITBIT_SENSORS"]),`` + +- Extract Fitbit: Steps Metrics: + + | ``expand("data/processed/{pid}/fitbit_step_{day_segment}.csv",`` + | ``pid=config["PIDS"],`` + | ``day_segment = config["STEP"]["DAY_SEGMENTS"]),`` + +**Rule Chain:** + +- **Rule:** ``rules/preprocessing.snakefile/download_dataset`` - See the download_dataset_ rule. + + - **Script:** ``src/data/download_dataset.R`` - See the download_dataset.R_ script. + +- **Rule:** ``rules/preprocessing.snakefile/fitbit_with_datetime`` - See the fitbit_with_datetime_ rule. + + - **Script:** ``src/data/fitbit_readable_datetime.py`` - See the fitbit_readable_datetime.py_ script. + +- **Rule:** ``rules/features.snakefile/fitbit_step_metrics`` - See the fitbit_step_metrics.py_ rule. + + - **Script:** ``src/features/fitbit_step_metrics.py`` - See the fitbit_step_metrics.py_ script. + + +.. _fitbit-steps-parameters: + +**Fitbit: Steps Rule Parameters:** + +======================= =================== +Name Description +======================= =================== +day_segment The particular ``day_segments`` that will be analyzed. The available options are ``daily``, ``morning``, ``afternoon``, ``evening``, ``night`` +metrics The different measures that can be retrieved from the dataset. See :ref:`Available Fitbit: Steps Metrics ` Table below +threshold_active_bout The maximum number of steps per minute necessary for a bout to be ``sedentary``. 
That is, if the step count per minute is greater than this value the bout has a status of ``active``. +======================= =================== + +.. _fitbit-steps-available-metrics: + +**Available Fitbit: Steps Metrics** + +The following table shows a list of the available metrics for the Fitbit: Steps dataset. + +========================= ========= ============= +Name Units Description +========================= ========= ============= +sumallsteps steps Sum all steps: The total step count. +maxallsteps steps Max all steps: The maximum step count +minallsteps steps Min all steps: The minimum step count +avgallsteps steps Avg all steps: The average step count +stdallsteps steps Std all steps: The standard deviation of step count +countsedentarybout bouts Count sedentary bout: A count of sedentary bouts +maxdurationsedentarybout minutes Max duration sedentary bout: The maximum duration of sedentary bouts +mindurationsedentarybout minutes Min duration sedentary bout: The minimum duration of sedentary bouts +avgdurationsedentarybout minutes Avg duration sedentary bout: The average duration of sedentary bouts +stddurationsedentarybout minutes Std duration sedentary bout: The standard deviation of the duration of sedentary bouts +countactivebout bouts Count active bout: A count of active bouts +maxdurationactivebout minutes Max duration active bout: The maximum duration of active bouts +mindurationactivebout minutes Min duration active bout: The minimum duration of active bouts +avgdurationactivebout minutes Avg duration active bout: The average duration of active bouts +stddurationactivebout minutes Std duration active bout: The standard deviation of the duration of active bouts +========================= ========= ============= + +**Assumptions/Observations:** If the step count per minute is smaller than the ``THRESHOLD_ACTIVE_BOUT`` (default value is 10), it is defined as sedentary status. Otherwise, it is defined as active status. 
+One active/sedentary bout is a period during which the user remains in ``active``/``sedentary`` status.
+
+
+.. -------------------------Links ------------------------------------ ..
 .. _SENSORS: https://github.com/carissalow/rapids/blob/f22d1834ee24ab3bcbf051bc3cc663903d822084/config.yaml#L2
 .. _`SMS Config Code`: https://github.com/carissalow/rapids/blob/f22d1834ee24ab3bcbf051bc3cc663903d822084/config.yaml#L38
@@ -522,3 +1143,50 @@ Notes. If the step count per minute smaller than the THRESHOLD_ACTIVE_BOUT (defa
 .. _`Bluetooth Config Code`: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L76
 .. _bluetooth_metric: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L63
 .. _bluetooth_metrics.R: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/src/features/bluetooth_metrics.R
+.. _`Accelerometer Config Code`: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L98
+.. _accelerometer_metrics: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L124
+.. _accelerometer_metrics.py: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/src/features/accelerometer_metrics.py
+.. _`Applications Foreground Config Code`: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L102
+.. _`Application Genres Config`: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L54
+.. _application_genres: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/preprocessing.snakefile#L81
+.. _application_genres.R: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/src/data/application_genres.R
+.. _applications_foreground_metrics: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L135
+.. _applications_foreground_metrics.py: https://github.com/carissalow/rapids/blob/master/src/features/accelerometer_metrics.py
+.. _`Battery Config Code`: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L84
+.. _battery_deltas: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L25
+.. _battery_deltas.R: https://github.com/carissalow/rapids/blob/master/src/features/battery_deltas.R
+.. _battery_metrics: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L86
+.. _battery_metrics.py: https://github.com/carissalow/rapids/blob/master/src/features/battery_metrics.py
+.. _`Google Activity Recognition Config Code`: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L80
+.. _google_activity_recognition_deltas: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L41
+.. _google_activity_recognition_deltas.R: https://github.com/carissalow/rapids/blob/master/src/features/google_activity_recognition_deltas.R
+.. _activity_metrics: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L74
+.. _google_activity_recognition.py: https://github.com/carissalow/rapids/blob/master/src/features/google_activity_recognition.py
+.. _`Light Config Code`: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L94
+.. _light_metrics: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L113
+.. _light_metrics.py: https://github.com/carissalow/rapids/blob/master/src/features/light_metrics.py
+.. _`Location (Barnett’s) Config Code`: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L70
+.. _phone_sensed_bins: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/preprocessing.snakefile#L46
+.. _phone_sensed_bins.R: https://github.com/carissalow/rapids/blob/master/src/data/phone_sensed_bins.R
+.. _resample_fused_location: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/preprocessing.snakefile#L67
+.. _resample_fused_location.R: https://github.com/carissalow/rapids/blob/master/src/data/resample_fused_location.R
+.. _location_barnett_metrics: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L49
+.. _location_barnett_metrics.R: https://github.com/carissalow/rapids/blob/master/src/features/location_barnett_metrics.R
+.. _`Screen Config Code`: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L88
+.. _screen_deltas: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L33
+.. _screen_deltas.R: https://github.com/carissalow/rapids/blob/master/src/features/screen_deltas.R
+.. _screen_metrics: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L97
+.. _screen_metrics.py: https://github.com/carissalow/rapids/blob/master/src/features/screen_metrics.py
+.. _`Fitbit: Heart Rate Config Code`: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L113
+.. _fitbit_with_datetime: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/preprocessing.snakefile#L94
+.. _fitbit_readable_datetime.py: https://github.com/carissalow/rapids/blob/master/src/data/fitbit_readable_datetime.py
+.. _fitbit_heartrate_metrics: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L151
+.. _fitbit_heartrate_metrics.py: https://github.com/carissalow/rapids/blob/master/src/features/fitbit_heartrate_metrics.py
+.. _`Fitbit: Steps Config Code`: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L117
+.. _fitbit_step_metrics: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/rules/features.snakefile#L162
+.. _fitbit_step_metrics.py: https://github.com/carissalow/rapids/blob/master/src/features/fitbit_step_metrics.py
+.. _`Fitbit documentation`: https://help.fitbit.com/articles/en_US/Help_article/1565
+.. _`Custom Catalogue File`: https://github.com/carissalow/rapids/blob/master/data/external/stachl_application_genre_catalogue.csv
+.. _top1global: https://github.com/carissalow/rapids/blob/765bb462636d5029a05f54d4c558487e3786b90b/config.yaml#L108
+.. _`Beiwe Summary Statistics`: http://wiki.beiwe.org/wiki/Summary_Statistics
+.. _`Pause-Flight Model`: https://academic.oup.com/biostatistics/advance-article/doi/10.1093/biostatistics/kxy059/5145908
\ No newline at end of file
diff --git a/docs/usage/introduction.rst b/docs/usage/introduction.rst
index b54afb5e..d897c968 100644
--- a/docs/usage/introduction.rst
+++ b/docs/usage/introduction.rst
@@ -9,20 +9,19 @@ We recommend reading Snakemake_ docs, but the main idea behind the pipeline is t
 
 Available features:
 
-- :ref:`accelerometer`
-- :ref:`applications_foreground`
-- :ref:`battery`
+- :ref:`accelerometer-sensor-doc`
+- :ref:`applications-foreground-sensor-doc`
+- :ref:`battery-sensor-doc`
 - :ref:`bluetooth-sensor-doc`
 - :ref:`call-sensor-doc`
-- :ref:`fitbit-heart-rate`
-- :ref:`fitbit-steps`
-- :ref:`google-activity-recognition`
-- :ref:`light`
-- :ref:`location-features`
-- :ref:`screen`
+- :ref:`fitbit-heart-rate-sensor-doc`
+- :ref:`fitbit-steps-sensor-doc`
+- :ref:`google-activity-recognition-sensor-doc`
+- :ref:`light-doc`
+- :ref:`location-sensor-doc`
+- :ref:`screen-sensor-doc`
 - :ref:`sms-sensor-doc`
-
 
 We are updating these docs constantly, but if you think something needs clarification, feel free to reach out or submit a pull request on GitHub.