Commit Graph

721 Commits (notes)

Author SHA1 Message Date
junos 9f5edf1c2b Revert "Add a rule for model baselines."
The example was for a classification rather than regression problem.

This reverts commit 9ab0c8f289.

# Conflicts:
#	rules/models.smk
2022-04-12 16:59:42 +02:00
junos 4ad261fae5 Rename baseline features AGAIN.
Correct other mistakes.
2022-04-12 16:55:01 +02:00
= 74cf4ada1c Cr-feat window length for all empaticas sensors. 2022-04-12 14:00:44 +00:00
junos 9ab0c8f289 Add a rule for model baselines.
Add baselines and helper functions to main models dir.
2022-04-12 14:23:58 +02:00
junos f5688f6154 Add a rule to merge sensor and baseline features.
And select target as before.
2022-04-08 15:42:04 +02:00
junos b1f356c3f7 Extract a function to be used elsewhere. 2022-04-08 15:36:32 +02:00
junos 7ff3dcf5fc Move and rename target variable. 2022-04-06 18:21:09 +02:00
junos 50c0defca7 Select target columns (no parsing necessary). 2022-04-06 18:16:49 +02:00
junos ac86221662 [WIP] Add a rule to parse targets.
Does nothing for now.
2022-04-06 17:47:03 +02:00
junos d326a1b09d Include the constant directly in main.py. 2022-04-05 19:08:43 +02:00
junos 2e545e81f0 Include feature calculations for different scales. 2022-04-05 19:05:34 +02:00
junos cbc8ae4e03 Add necessary checks for empty data frames. 2022-04-05 18:58:09 +02:00
junos 751b04f3f4 Pass scale names to Snakemake correctly. 2022-04-05 18:14:37 +02:00
junos 99245afca3 Try a different approach for preprocessing ESMs.
It is important that this follows the generic RAPIDS pattern.
In the subsequent step of calculating features,
there is an expected file and folder structure of data/interim.
See rules/common.smk/find_features_files()
2022-04-05 18:02:31 +02:00
junos ed298a9479 Implement the basic feature extraction steps. 2022-04-05 15:46:02 +02:00
Primoz c050174ca3 Various minimal changes. 2022-03-31 09:16:00 +00:00
Primoz a357138f6e Added CF for HRV and shortened test data 2022-03-30 15:01:24 +00:00
Primoz 470993eeb0 Modification of getSampleRate method for all CF scripts. 2022-03-30 15:00:11 +00:00
junos 3af8de6235 Create feature provider script. 2022-03-30 10:40:53 +02:00
junos 9478dc94f2 Add an else.
This is to make sure that in case the reversing fails, we do not get any output items.
Snakemake will inform us of an error in this event.
2022-03-30 10:40:53 +02:00
= ab0b9227d7 Added ACC calculated features and shorter version of ACC data. 2022-03-29 09:41:51 +00:00
= a9244a60fc Corrections for TEMP cf src script. 2022-03-28 14:26:37 +00:00
= 8b76c96e47 Cleaning existing CF mains' and preparing src script for ACC. 2022-03-28 14:18:29 +00:00
= ca59a54d8f Get a sample rate from two sequential timestamps. 2022-03-28 13:50:08 +00:00
= 393dab72f5 Added components for the temperature features extraction. 2022-03-28 12:37:02 +00:00
Primoz f389ac9d89 Delete CF features folder 2022-03-25 16:24:52 +00:00
Primoz 191e53e543 Added cf provider for EDA feature processing. 2022-03-23 15:13:53 +00:00
Primoz 2da0911d4c Skeleton file main.py for EDA CalcFt. integration. 2022-03-22 12:48:43 +00:00
junos c6144f8403 Reverse JCQ items. 2022-03-16 18:55:46 +01:00
junos 23f0aaba3a Get the name of the questionnaire from Snakefile. 2022-03-16 18:28:57 +01:00
junos 679f00dc19 Enable selecting any questionnaire as target. 2022-03-16 17:55:44 +01:00
junos 1374eda171 Flatten questionnaire ID dict. 2022-03-16 17:38:09 +01:00
junos 19b9da0ba3 Separate function definitions from main. 2022-03-16 16:49:28 +01:00
junos ef57103bac Add questionnaire ID key. 2022-03-15 13:41:33 +01:00
junos 5f293211a7 Reformat. 2022-03-15 13:28:51 +01:00
junos d470eef27e Add a rule to preprocess and clean ESM. 2022-03-09 18:38:46 +01:00
junos d4a4bbbff0 Remove unused columns. 2022-03-09 17:58:36 +01:00
junos 085a6d144b Add files to compute and create an empty script. 2022-03-09 17:32:02 +01:00
junos 42d62f16d0 Add RAPIDS mandatory columns for ESM. 2022-03-09 17:31:37 +01:00
junos 2bef86b1da Add a format for ESM and add to config. 2022-03-08 15:43:25 +01:00
junos d8e9a309f7 Rename features and write baseline_interim. 2022-03-08 15:10:36 +01:00
junos a3a4f04ffe Setting with : produces NaNs. 2022-03-01 12:02:57 +01:00
junos aedb8b6785 Write questionnaire data to data/interim. 2022-03-01 12:02:36 +01:00
junos b5a6317f4b Calculate JCQ control and demand control ratio.
Include norms and corresponding quartile.
2022-02-28 18:51:47 +01:00
junos 2fed962644 Calculate JCQ demand score.
Hardcode question IDs to be reversed.
2022-02-28 18:30:41 +01:00
junos 30ac8b1cd5 Start calculating demand control features. 2022-02-23 19:08:10 +01:00
junos 9a74e74d08 Add the baseline features rule to snakefile.
Correct age calculation for a single value instead of dataframe.
2022-02-23 18:15:26 +01:00
junos 07da6be398 Add age, gender, and language as features.
Move calculation of age from merge_baseline_data.py to baseline_features.py.
2022-02-23 18:05:23 +01:00
junos 176367631b Prepare baseline feature rule. 2022-02-23 11:09:33 +01:00
junos bf9c764c97 Split baseline data to participants.
And some csv I/O settings.
2022-02-04 18:37:57 +01:00
junos 16e608db74 First merge baseline datasets. 2022-02-04 18:21:42 +01:00
junos 204f6f50b0 Read the relevant files. 2022-02-04 18:06:02 +01:00
junos 685ed6a546 Set up demographic data download. 2022-02-04 17:37:00 +01:00
Meng Li 463ac0a2aa Fix bug#169 (#174) 2022-01-27 11:27:32 -05:00
Sam 10e896ca1d Add data stream for AWARE Micro server (#173)
* Add data stream for AWARE Micro server

* Fix one documentation typo and one omission
2022-01-27 10:47:50 -05:00
junos afa3b8546f Mutate data in an R script.
The Python script did not read the timestamp correctly for some reason. All timestamps were 0.
2022-01-26 16:34:19 +01:00
Sam e5dbbfce44 Avoid NA problem in barnett location evaluation (#172)
* Avoid occasional issue where does_not_span evaluates to NA, which breaks the if()

* Restored original warning
2022-01-18 10:16:37 -05:00
Sam 8ae26fb845 Fixes issue where 'duration' in the 'ios_calls' dataframe is seen as a character type. (#171) 2022-01-18 10:15:53 -05:00
junos b17a7eff1a Deal with inexplicable snakemake failure. 2022-01-07 18:11:38 +01:00
junos e1499a5ae2 Account for missing device_ids. 2021-12-15 20:41:28 +01:00
junos 5a9252e46e Merge remote-tracking branch 'origin/master' 2021-12-15 18:32:36 +01:00
junos 352598f3da Use absolute path to avoid RuleException. 2021-12-15 17:27:13 +01:00
junos 70cada8bb8 Consider a subset of columns when dropping. 2021-12-15 16:14:33 +01:00
junos d2ed73dccf Debug ValueError for index.
See exploration/debug_heatmap.py for illustration.
2021-12-15 16:03:04 +01:00
junos 6f451e05ac Bring back application_name.
This column still needs to be in the data, so add it in app_add_name.py.
Later, join categories by package hash.
2021-12-15 12:58:27 +01:00
junos 4485c4c95e Delete columns we don't have.
Rename light table.
Correct timesegments.
2021-12-08 20:02:47 +01:00
junos 37b3460b76 Use Empatica wristband numbers as provided in CSV. 2021-12-01 17:20:57 +01:00
junos 22f9e0722d Start preparing the true usernames CSV file. 2021-12-01 11:29:22 +01:00
junos 0be4cd5a8f Remove unnecessary library. 2021-11-30 17:08:07 +01:00
junos 04ad2d0b81 Source specific container script.
It is probably not worth the effort of making this general.
2021-11-29 18:19:47 +01:00
junos da5ff0f36e Correct small errors in settings. 2021-11-29 18:04:06 +01:00
junos 35d9779026 Prepare the tibble in requested format.
Write it to a CSV file.
2021-11-29 17:54:16 +01:00
junos 32025cbd8c Start with a tibble from CSV. 2021-11-29 17:51:07 +01:00
junos 181e4f0118 Add parameters to yaml file.
And use these in the prepare_participants_file function.
2021-11-29 16:57:50 +01:00
junos ab84109d55 Prepare a function to compile participants data.
It combines functions from container.R
2021-11-24 19:07:56 +01:00
junos f9863ec622 Fix small mistakes. 2021-11-24 19:01:30 +01:00
junos c1f56c61e8 Add a function to pull start and end datetimes. 2021-11-24 18:33:06 +01:00
junos 3acf6ece14 Add a function to pull device IDs. 2021-11-24 18:23:53 +01:00
junos 8b2717122d Add a function to get participants' IDs. 2021-11-24 18:05:17 +01:00
Meng Li 5bad3eb8b5 Data cleaning (#166)
* Refactor data cleaning module: move it from example workflow to main directory

* Replace NAs with 0 in selected event-based features

* Add one step to drop highly correlated features

Co-authored-by: Weiyu <weiyuhuang7@gmail.com>
2021-11-19 10:34:36 -05:00
Meng Li 296960f425 Fix the bug of location doryab features when a participant is moving during the whole time segment 2021-11-18 18:42:19 -05:00
Meng Li 3d34036eae Add firststeptime and laststeptime features to FITBIT_STEPS_INTRADAY RAPIDS provider (#168)
* Add firststeptime and laststeptime features to FITBIT_STEPS_INTRADAY RAPIDS provider

* Update test config files
2021-11-18 18:35:27 -05:00
junos ed193d2290 Revert "Correct the name of a field."
This reverts commit b335561a55.

It was actually correct.
2021-11-17 19:16:35 +01:00
junos b335561a55 Correct the name of a field. 2021-11-17 18:50:06 +01:00
junos fcec3e2f93 Implement the necessary functions for PSQL. 2021-11-17 18:49:25 +01:00
junos 7a1e4f7139 Add the format file copied from MySQL. 2021-11-17 18:46:16 +01:00
Meng Li f340b89c58 Temporarily revert PHONE_LOCATIONS BARNETT provider to use R script 2021-09-23 18:16:13 -04:00
Meng Li a3fb718aea Refactor PHONE_LOCATIONS DORYAB provider to compute features based on location episodes 2021-09-23 17:40:06 -04:00
Meng Li a8a178486b Refactor PHONE_CALLS RAPIDS provider to compute features based on call episodes or events 2021-09-15 10:28:37 -04:00
JulioV 2e553dc9e7 Add tqdm package to environment.yaml 2021-08-16 11:04:03 -04:00
Meng Li 3ac12e7dad Fix the bug of step intraday features when INCLUDE_ZERO_STEP_ROWS is False 2021-08-11 12:40:40 -04:00
Weiyu 35eebe8a51 Bug fixed: set ratiovalidyielded mins/hours value to the range 0 to 1 2021-08-09 17:56:29 -04:00
JulioV 3e69966c91 Update error message 2021-08-04 15:33:02 -04:00
Shirley 4ddb2845a6 Update initialize_params 2021-08-04 15:33:02 -04:00
JulioV 834bd3b93d Refactor in Python of Barnett provider
Co-authored-by: Shirley Hayati <sahayati@ucdavis.edu>
Co-authored-by: JulioV <JulioV@users.noreply.github.com>
2021-08-04 15:33:02 -04:00
Weiyu 7f1c502ea0 Fixed bug: Added local_segment column if no data is left after filtering 2021-08-04 11:06:31 -04:00
Hannah Roberts b52059b027 Ensure date/time format is maintained
Within the 'determine which is home' for loop, 'xx' is the midpoint of two datetime objects. When the midpoint is calculated to be midnight, only the date is returned. This can be replicated with:

mydates <- as.POSIXct("2018-01-01 00:00:00", tz = "UTC")
mydates
[1] "2018-01-01 UTC"

This results in 'hourofday' being NA as an hour cannot be found. By adding the suggested format wrapper, the time is maintained and 'hourofday' can be determined. It can then successfully be applied to the embedded if-statement within the loop.

mydates <- format(as.POSIXct("2018-01-01 00:00:00", tz = "UTC"), "%Y-%m-%d %H:%M:%S")
mydates
[1] "2018-01-01 00:00:00"
2021-07-23 10:12:11 -04:00
Weiyu 5a465873c4 Tested fitbit heartrate intraday feature 2021-07-21 10:24:02 -04:00
JulioV 6fa1875bf3 Add app foreground episode count 2021-07-01 16:20:16 -04:00
JulioV bc5c0c9a4f Fix app episode length bug 2021-07-01 16:20:16 -04:00
JulioV 065a926a87 Change own to custom categories name 2021-07-01 16:20:16 -04:00
JulioV e74c745f86 Add own categories to app foreground features 2021-07-01 16:20:16 -04:00
JulioV 5892b6d838 Fix create_participants_files.R to handle numeric PIDs 2021-07-01 16:20:16 -04:00
Meng Li 97ef8a8368 Set color range and avoid SettingWithCopyWarning 2021-06-29 09:50:19 -04:00
Meng Li 1c57320ab3 Update segment labels and fix the bug when we do not have any labels for event segments 2021-06-29 09:49:24 -04:00
Meng Li cefcb0635b Update heatmap of recorded phone sensors 2021-06-29 09:49:24 -04:00
Meng Li bc06477d89 Update heatmap of sensor row count 2021-06-29 09:49:24 -04:00
Meng Li e98a8ff7ca Update histogram of phone data yield 2021-06-29 09:49:24 -04:00
Meng Li f436f1f530 Update heatmap of correlation matrix 2021-06-29 09:49:23 -04:00
Meng Li 4d37696158 Update heatmaps of overall data yield 2021-06-29 09:48:30 -04:00
Weiyu f374c67bd5 Bug fixed: Added unknown activity case 2021-06-23 19:04:55 -04:00
Weiyu 3e4d167adc Bug fixed: sort bt_address alphabetically before picking the most frequent bt_address 2021-06-22 17:40:00 -04:00
Meng Li f248b6c97d Fix bugs of Fitbit mutation scripts 2021-06-11 18:18:33 -04:00
kirtirajk 4b8698a4c6 adding app_episode with the changes as mentioned in the comments 2021-06-10 14:17:56 -04:00
Weiyu 65d5cb7bd4 Bug fixed: countscansmostuniquedevice stays the same for all time segments 2021-06-10 10:49:22 -04:00
JulioV e123a14082 Improve aware_csv msg when CSV files don't exist 2021-06-01 10:57:17 -04:00
Meng Li 9687081fbe Refactor the rule phone_locations_add_doryab_extra_columns 2021-05-28 09:48:36 -04:00
Meng Li 0d6f51be8b Refactor location features from Doryab provider & add a new strategy to infer home location & fix bugs 2021-05-26 17:36:52 -04:00
JulioV 32472461ec - Fix bug when no phone data yield is needed to process location data
- Remove location rows with the same timestamp based on their accuracy
2021-05-26 14:04:29 -04:00
Nikunj Goel 9b21196f35 Fixed `expected_minutes` to account for different time segments. (#136) 2021-05-26 11:44:48 -04:00
Meng Li edf71e055d Add the EXCLUDE_SLEEP module for steps intraday features 2021-05-21 15:23:21 -04:00
Nikunj Goel 5e451f99b0 Added phone keyboard features including docs/tests (#134) 2021-05-21 11:45:27 -04:00
JulioV e9cd9c94d7 Fix PID matching when joining data from participants 2021-05-11 16:49:04 -04:00
JulioV 32818a4802 Fix parse of pids with more than 1 devices 2021-05-11 16:42:20 -04:00
Meng Li 809845143f Test & fix bugs of sleep intraday features 2021-04-27 14:40:14 -04:00
Meng Li 7c7f34ec45 Test & fix bugs of sleep summary features 2021-04-27 14:40:14 -04:00
Meng Li 50fe09cfac Update data streams mutation of fitbit data 2021-04-27 14:40:14 -04:00
Meng Li 66d9a9d640 Update params & docs of sleep features 2021-04-27 14:33:19 -04:00
JulioV 4beafd233d Fix crash when scraping data for an app that does not exist 2021-04-22 14:28:52 -04:00
JulioV ea8094e028 Fix length of periodic segments on days with DLS 2021-04-22 11:32:10 -04:00
JulioV 9c56422529 Add calories intraday features 2021-04-20 12:00:38 -04:00
Meng Li 00a3335623 Add device_id column for sleep intraday episodes 2021-04-08 11:21:28 -04:00
JulioV 286d317af4 Fix crash when there are no periodic segments to assign
This includes a simplification of how periodic segments are computed based on all local dates in the data independently of their time zones
2021-04-07 12:03:25 -04:00
JulioV 9551669d47 Fix periodic segments bug when there are no segments to assign 2021-04-06 20:29:30 -04:00
Meng Li 78173c54ab Convert date time object to string in assign_tz_code() function 2021-04-06 23:28:53 +00:00
JulioV 1025e6d9d8 Fix datetime labels of event segments across multiple tzs 2021-04-06 13:58:58 -04:00
Meng Li 8909876cff Add local_segment column for phone data yield features 2021-04-05 21:13:36 +00:00
Meng Li 68125dc1bf Fix the bug of phone data yield features when the input is empty 2021-04-05 20:57:05 +00:00
JulioV 46f5e24814 Fix Fitbit tz inference from phone data 2021-04-05 11:51:57 -04:00
JulioV 636b64c61a Revert "Added more keyboard features."
This reverts commit 94c72e3172.
2021-04-05 11:25:00 -04:00
Meng Li 68e12a2563 Fix bugs of bluetooth feature extraction when number of unique bt_address is 2 2021-04-05 14:09:50 +00:00
Meng Li 8414977331 Fix the bug of utils.py when one participant has multiple time zones 2021-04-03 21:20:10 -04:00
nikunjgoel95 94c72e3172 Added more keyboard features. 2021-04-01 20:54:13 -04:00
Meng Li 1ea5b74eff Fix the bug of utils.py when one participant has multiple time zones 2021-03-31 19:34:14 -04:00
Meng Li 136dfef56b Fix the bug of Analysis Workflows while parsing targets with updated segments 2021-03-30 16:41:50 -04:00
JulioV 99dae079d5 Add iOS BT and Wifi visible to formats for old devices 2021-03-30 15:32:50 -04:00
JulioV 30ad3cd586 Validate participant files without device ids 2021-03-28 15:29:08 -04:00
JulioV 87fbbbe402 Refactor and simplify time segments 2021-03-28 15:29:07 -04:00
JulioV c48c1c8f24 Optimize Barnett's computation multi-day segments 2021-03-28 15:29:07 -04:00
JulioV d0858f8833 Fix overlapping periodic time segments 2021-03-28 15:29:07 -04:00