Primoz
cb351e0ff6
Unnecessary line (rows with no target value will be removed in cleaning script).
2022-09-01 10:06:57 +00:00
Primoz
f78aa3e7b3
Preparation for cleaning & imputation
2022-08-26 10:56:14 +00:00
Primoz
c498ecb742
Include baseline models (+corrections), disable columns drop in cleaning function.
2022-08-23 14:12:14 +00:00
junos
a6a37c7bd9
Drop NaN targets.
...
This mirrors INNER join in merge_features_and_targets_for_individual_model.py:
data = pd.concat([sensor_features, targets[["target"]]], axis=1, join="inner")
2022-04-12 17:01:49 +02:00
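A minimal sketch of the equivalence that commit a6a37c7bd9 describes, assuming both frames share a row index: an inner join on the index keeps only rows that have a target, which is the same as dropping rows with a NaN target after an outer join. The index labels, the "steps" column, and all values are made up for illustration; only the pd.concat call is taken from the commit message.

```python
import pandas as pd

# Hypothetical toy data: three rows of sensor features, targets for only two.
sensor_features = pd.DataFrame(
    {"steps": [100, 200, 300]}, index=["d1", "d2", "d3"]
)
targets = pd.DataFrame({"target": [1.0, 2.0]}, index=["d1", "d3"])

# The call quoted in the commit message: keep index labels present in both.
inner = pd.concat([sensor_features, targets[["target"]]], axis=1, join="inner")

# Equivalent route: keep every row, then drop those with a missing target.
outer = pd.concat([sensor_features, targets[["target"]]], axis=1, join="outer")
dropped = outer.dropna(subset=["target"])

# Both routes keep the same rows (d2 has no target and is removed).
same_rows = sorted(inner.index) == sorted(dropped.index)
```

Here same_rows is True because row d2, which has no target, is excluded by either route.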
junos
9f5edf1c2b
Revert "Add a rule for model baselines."
...
The example was for a classification rather than regression problem.
This reverts commit 9ab0c8f289.
# Conflicts:
# rules/models.smk
2022-04-12 16:59:42 +02:00
junos
9ab0c8f289
Add a rule for model baselines.
...
Add baselines and helper functions to main models dir.
2022-04-12 14:23:58 +02:00
junos
f5688f6154
Add a rule to merge sensor and baseline features.
...
And select target as before.
2022-04-08 15:42:04 +02:00
junos
b1f356c3f7
Extract a function to be used elsewhere.
2022-04-08 15:36:32 +02:00
junos
7ff3dcf5fc
Move and rename target variable.
2022-04-06 18:21:09 +02:00
junos
50c0defca7
Select target columns (no parsing necessary).
2022-04-06 18:16:49 +02:00
junos
ac86221662
[WIP] Add a rule to parse targets.
...
Does nothing for now.
2022-04-06 17:47:03 +02:00
Meng Li
5bad3eb8b5
Data cleaning (#166)
...
* Refactor data cleaning module: move it from example workflow to main directory
* Replace NAs with 0 in selected event-based features
* Add one step to drop highly correlated features
Co-authored-by: Weiyu <weiyuhuang7@gmail.com>
2021-11-19 10:34:36 -05:00
Meng Li
136dfef56b
Fix the bug of Analysis Workflows while parsing targets with updated segments
2021-03-30 16:41:50 -04:00
JulioV
614e759551
Refactor day segments to time segments
2020-12-02 18:41:03 -05:00
Meng Li
5178be585d
Rename modeling.py to modelling.py & Update example_config.yaml
2020-11-25 22:35:38 -05:00
Meng Li
b4a512faf3
Add analysis example workflow
2020-11-25 16:34:05 -05:00
JulioV
86509207ac
Turn off warnings for tidyverse and dplyr
2020-10-23 10:41:00 -04:00
Meng Li
b3fbc79195
Minor fix of metrics section
2020-08-17 17:27:58 -04:00
Meng Li
00385dc54d
Discard useless parameters and related code of example
2020-08-17 10:48:15 -04:00
Meng Li
123b78d438
Fix minor bugs of modeling.py: f1-macro and proba
2020-08-16 16:08:51 -04:00
Meng Li
18d220d6c0
Fix data cleaning bug: days threshold
2020-08-06 17:26:37 -04:00
Meng Li
973e1669fa
Move baselines into a folder; rename column of "num_of_participants" with "num_of_rows" in modeling.py
2020-08-05 07:51:35 -04:00
Meng Li
e13b89b125
Add restore_sql_file rule; notsummarised module; diff platforms for heatmap_days_by_sensors
2020-08-03 13:09:16 -04:00
Meng Li
93157db210
Data cleaning section: replace "day_type" with "day_idx"
2020-07-27 18:27:36 -04:00
Meng Li
34ffe4abaf
Add the rule to merge population model results
2020-05-15 18:49:14 -04:00
Meng Li
8df8a5c2b3
Add baseline
2020-05-15 18:45:45 -04:00
Meng Li
8c8378f74a
Split modeling module into two rules; Add RandomOverSampler for resampling; Add log; Fix bug of AUC
2020-05-15 18:42:03 -04:00
Meng Li
5fab99d8df
Add one rule to calculate the ratio of cells with missing values for cleaned features
2020-05-15 18:25:07 -04:00
Meng Li
78cd8159be
Fix bug of clean_features_for_model
2020-05-15 17:53:43 -04:00
JulioV
f0674122ff
Replace packrat with renv
2020-05-01 19:46:04 -04:00
Meng Li
6ed52e7d1a
Change the method of computing missing value cells
2020-04-30 15:40:55 -04:00
Meng Li
7cbb227214
Add modeling module
2020-04-29 18:53:54 -04:00
Meng Li
9ddb50ed59
Split days threshold of data cleaning into days_before_surgery and days_after_discharge
...
Co-authored-by: JulioV <juliovhz@gmail.com>
2020-04-29 14:37:40 -04:00
Meng Li
5696b4f6d4
Add merge module for demographic features and target
2020-04-16 14:20:16 -04:00
Meng Li
eac721de84
Add demographic_features and targets module; refactor analysis code
...
Co-authored-by: JulioV <juliovhz@gmail.com>
2020-04-16 12:38:28 -04:00
JulioV
cdd10578b3
Ignore NA values for dropping zero variance columns
2020-04-13 16:59:33 -04:00
Meng Li
ac9fb487a6
Fix bug of select_days_to_analyse.py
2020-03-31 14:51:08 -04:00
Meng Li
ea46df63d5
Get targets
...
Co-authored-by: JulioV <juliovhz@gmail.com>
2020-03-26 17:27:23 -04:00
JulioV
0e173872df
Refactor select_days_to_analyse, fix merge bugs, add clean metrics for model
2020-03-17 21:15:53 -04:00
Meng Li
7c240a9613
Select days to analyse
...
Co-authored-by: JulioV <juliovhz@gmail.com>
2020-03-17 17:26:30 -04:00
JulioV
3d4c26754e
Rename merge metrics for models and add filter valid sensed days
...
Co-authored-by: Meng Li <AnnieLM1996@gmail.com>
2020-03-12 17:31:46 -04:00
Meng Li
aba9f13332
Add merge metrics module for analysis rules
2020-03-09 13:32:14 -04:00
JulioV
41c233e4ed
First commit
2019-10-22 13:11:01 -04:00