Commit Graph

30 Commits (74cf4ada1c20c093648fd9cf5061640c456e5a18)

Author SHA1 Message Date
Meng Li 5bad3eb8b5 Data cleaning (#166) 2021-11-19 10:34:36 -05:00
    * Refactor data cleaning module: move it from example workflow to main directory
    * Replace NAs with 0 in selected event-based features
    * Add one step to drop highly correlated features
    Co-authored-by: Weiyu <weiyuhuang7@gmail.com>
Meng Li a3fb718aea Refactor PHONE_LOCATIONS DORYAB provider to compute features based on location episodes 2021-09-23 17:40:06 -04:00
Meng Li 1c57320ab3 Update segment labels and fix the bug when we do not have any labels for event segments 2021-06-29 09:49:24 -04:00
Meng Li 809845143f Test & fix bugs of sleep intraday features 2021-04-27 14:40:14 -04:00
Meng Li 8414977331 Fix the bug in utils.py when one participant has multiple time zones 2021-04-03 21:20:10 -04:00
Meng Li 1ea5b74eff Fix the bug in utils.py when one participant has multiple time zones 2021-03-31 19:34:14 -04:00
JulioV d0858f8833 Fix overlapping periodic time segments 2021-03-28 15:29:07 -04:00
JulioV 4528ab3641 Replace SRC LANGUAGE and FOLDER with SCRIPT 2021-03-14 22:14:13 -04:00
JulioV fb054b539f Add support for multiple time zones 2021-03-11 14:35:34 -05:00
Meng Li 8377c12efb Add sleep intraday features with RAPIDS provider 2021-02-26 17:47:01 -05:00
Meng Li 25a3492eba Drop rows without "assigned_segments" column before feature extraction 2021-01-21 19:41:17 -05:00
Meng Li 797de54b34 Fix merge bug of fetch_provider_features() function 2021-01-21 14:58:31 -05:00
JulioV 05627296f4 Fix filter_data_by_segment bug 2020-12-12 17:10:59 -05:00
JulioV 614e759551 Refactor day segments to time segments 2020-12-02 18:41:03 -05:00
Meng Li 016bdbfe8c Update Python feature scripts to add sensor and provider names automatically 2020-11-30 14:42:19 -05:00
Meng Li d3241c79f1 Update filter_data_by_segment() function: call chunk_episodes() inside the filter function 2020-11-19 17:27:53 -05:00
Meng Li 25e1f1fbb5 Update Python chunk_episodes 2020-10-26 18:47:57 -04:00
JulioV c78ccfced7 Remove microseconds from chunked date times 2020-10-26 15:03:31 -04:00
Meng Li 2b7bd0ae6e Modify output format of chunk_episodes() function 2020-10-26 13:00:53 -04:00
Meng Li 8c0f6a000d Fix chunk_episodes() bugs: set segment_start_timestamp as int 2020-10-19 19:36:26 -04:00
JulioV 24bf62a7ab Update file names 2020-10-19 15:07:12 -04:00
Meng Li 236b1cd809 Update AR module for segments; Refactor input format 2020-10-07 18:11:06 -04:00
Meng Li bccc9a0697 Move deduplicate_episodes() function into chunk_episodes() function; rename "time_diff" with duration 2020-09-29 18:05:25 -04:00
Meng Li f1717e59e7 Update screen&battery episodes features with different segment format 2020-09-29 17:13:34 -04:00
JulioV a6b99259f7 Fix bug when filtering by day segment and there are no rows belonging to that segment 2020-09-28 15:53:38 -04:00
JulioV 9e15f46fc3 Update day segment format 2020-09-28 11:38:47 -04:00
Meng Li f806cb44ac Fix the bug of screen duration features for different segments 2020-09-18 20:25:29 -04:00
JulioV 681a77f23c Clean utils.py 2020-09-01 12:02:31 -04:00
JulioV 011b9736d5 Refactor the function to fetch provider features 2020-08-28 17:40:23 -04:00
JulioV b0f1477d7e Migrate location providers to new file structure and segments 2020-08-28 13:53:00 -04:00