Deployed e1cfcd46 to dev with MkDocs 1.2.1 and mike 1.0.1
parent f5353b360c
commit a245ad1c4c
File diff suppressed because one or more lines are too long
Binary file not shown.
|
@ -1888,7 +1888,7 @@
|
||||||
</ol>
|
</ol>
|
||||||
<p>Note that you will see many warning messages. You can ignore them; they appear because we ran the ML algorithms on a small fake dataset.</p>
|
<p>Note that you will see many warning messages. You can ignore them; they appear because we ran the ML algorithms on a small fake dataset.</p>
|
||||||
<h2 id="modules-of-our-analysis-workflow-example">Modules of our analysis workflow example<a class="headerlink" href="#modules-of-our-analysis-workflow-example" title="Permanent link">¶</a></h2>
|
<h2 id="modules-of-our-analysis-workflow-example">Modules of our analysis workflow example<a class="headerlink" href="#modules-of-our-analysis-workflow-example" title="Permanent link">¶</a></h2>
|
||||||
<details class="info"><summary>1. Feature extraction</summary><p>We extract daily behavioral features for data yield; received and sent messages; missed, incoming, and outgoing calls; resampled fused location data using the Doryab provider; activity recognition; battery; Bluetooth; screen; light; foreground applications; conversations; Wi-Fi connected; Wi-Fi visible; Fitbit heart rate summary and intraday data; Fitbit sleep summary data; and Fitbit step summary and intraday data, without excluding sleep periods and with an active bout threshold of 10 steps. In total, we obtained 237 daily sensor features over 12 days per participant.</p>
|
<details class="info"><summary>1. Feature extraction</summary><p>We extract daily behavioral features for data yield; received and sent messages; missed, incoming, and outgoing calls; resampled fused location data using the Doryab provider; activity recognition; battery; Bluetooth; screen; light; foreground applications; conversations; Wi-Fi connected; Wi-Fi visible; Fitbit heart rate summary and intraday data; Fitbit sleep summary data; and Fitbit step summary and intraday data, without excluding sleep periods and with an active bout threshold of 10 steps. In total, we obtained 245 daily sensor features over 12 days per participant.</p>
|
||||||
</details>
|
</details>
|
||||||
<details class="info"><summary>2. Extract demographic data.</summary><p>It is common to have demographic data in addition to mobile and target (ground truth) data. In this example, we include participants’ age, gender, and the number of days they spent in hospital after their surgery as features in our model. We extract these three columns from the <code>data/external/example_workflow/participant_info.csv</code> file. Because these three features remain the same within each participant, they are used only in the population model. Refer to the <code>demographic_features</code> rule in <code>rules/models.smk</code>.</p>
|
<details class="info"><summary>2. Extract demographic data.</summary><p>It is common to have demographic data in addition to mobile and target (ground truth) data. In this example, we include participants’ age, gender, and the number of days they spent in hospital after their surgery as features in our model. We extract these three columns from the <code>data/external/example_workflow/participant_info.csv</code> file. Because these three features remain the same within each participant, they are used only in the population model. Refer to the <code>demographic_features</code> rule in <code>rules/models.smk</code>.</p>
|
||||||
</details>
|
</details>
|
||||||
|
@ -1899,7 +1899,7 @@
|
||||||
<details class="info"><summary>5. Data visualization.</summary><p>At this point the user can use the five plots RAPIDS provides (or implement new ones) to explore and understand the quality of the raw data and extracted features and decide what sensors, days, or participants to include and exclude. Refer to <code>rules/reports.smk</code> to find the rules that generate these plots.</p>
|
<details class="info"><summary>5. Data visualization.</summary><p>At this point the user can use the five plots RAPIDS provides (or implement new ones) to explore and understand the quality of the raw data and extracted features and decide what sensors, days, or participants to include and exclude. Refer to <code>rules/reports.smk</code> to find the rules that generate these plots.</p>
|
||||||
</details>
|
</details>
|
||||||
<details class="info"><summary>6. Feature cleaning.</summary><p>In this stage we perform four steps to clean our sensor feature file. First, we discard days with a data yield hour ratio less than or equal to 0.75, i.e. we include days with at least 18 hours of data. Second, we drop columns (features) with more than 30% of missing rows. Third, we drop columns with zero variance. Fourth, we drop rows (days) with more than 30% of missing columns (features). In this cleaning stage several parameters are created and exposed in <code>example_profile/example_config.yaml</code>. </p>
|
<details class="info"><summary>6. Feature cleaning.</summary><p>In this stage we perform four steps to clean our sensor feature file. First, we discard days with a data yield hour ratio less than or equal to 0.75, i.e. we include days with at least 18 hours of data. Second, we drop columns (features) with more than 30% of missing rows. Third, we drop columns with zero variance. Fourth, we drop rows (days) with more than 30% of missing columns (features). In this cleaning stage several parameters are created and exposed in <code>example_profile/example_config.yaml</code>. </p>
|
||||||
<p>After this step, we kept 161 features over 11 days for the individual model of p01, 101 features over 12 days for the individual model of p02, and 109 features over 20 days for the population model. Note that the difference in the number of features between p01 and p02 is mostly due to iOS restrictions that stop researchers from collecting data from as many sensors as on Android phones.</p>
|
<p>After this step, we kept 163 features over 11 days for the individual model of p01, 101 features over 12 days for the individual model of p02, and 109 features over 20 days for the population model. Note that the difference in the number of features between p01 and p02 is mostly due to iOS restrictions that stop researchers from collecting data from as many sensors as on Android phones.</p>
|
||||||
<p>Feature cleaning for the individual models is done in the <code>clean_sensor_features_for_individual_participants</code> rule and for the population model in the <code>clean_sensor_features_for_all_participants</code> rule in <code>rules/models.smk</code>.</p>
|
<p>Feature cleaning for the individual models is done in the <code>clean_sensor_features_for_individual_participants</code> rule and for the population model in the <code>clean_sensor_features_for_all_participants</code> rule in <code>rules/models.smk</code>.</p>
|
||||||
</details>
|
</details>
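<p>The four cleaning steps described in step 6 can be sketched with pandas as follows. This is only an illustration, not the code behind the <code>clean_sensor_features_for_*</code> rules: the function name, the column names (<code>local_date</code>, <code>yield_ratio</code>), and the demo data are hypothetical, and treating a day with a ratio of exactly 0.75 as kept is an assumption, since the text is ambiguous at that boundary.</p>

```python
import numpy as np
import pandas as pd


def clean_sensor_features(features, yield_column, min_yield_ratio=0.75,
                          max_missing_rows=0.30, max_missing_cols=0.30):
    """Apply the four cleaning steps, in order, to a day-by-feature table."""
    # 1. Keep days with enough data (keeping the exact boundary is an assumption).
    cleaned = features[features[yield_column] >= min_yield_ratio]
    # 2. Drop features that are missing in more than 30% of the remaining days.
    cleaned = cleaned.loc[:, cleaned.isna().mean(axis=0) <= max_missing_rows]
    # 3. Drop numeric features that are constant, i.e. have zero variance.
    numeric = cleaned.select_dtypes(include="number")
    constant = numeric.columns[numeric.nunique(dropna=True) <= 1]
    cleaned = cleaned.drop(columns=constant)
    # 4. Drop days that are missing more than 30% of the remaining features.
    cleaned = cleaned[cleaned.isna().mean(axis=1) <= max_missing_cols]
    return cleaned


# Hypothetical four-day table with one low-yield day, one constant
# feature, and one mostly-missing feature.
demo = pd.DataFrame({
    "local_date": ["d1", "d2", "d3", "d4"],
    "yield_ratio": [0.9, 0.5, 0.8, 1.0],
    "steps": [100.0, 200.0, 300.0, 400.0],
    "constant": [1.0, 1.0, 1.0, 1.0],
    "sparse": [np.nan, 1.0, np.nan, np.nan],
})
cleaned = clean_sensor_features(demo, "yield_ratio")
print(cleaned)
```

<p>In this sketch the thresholds are function arguments; in the example workflow the corresponding parameters are exposed in <code>example_profile/example_config.yaml</code>.</p>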
|
||||||
<details class="info"><summary>7. Merge features and targets.</summary><p>In this step we merge the cleaned features and target labels for our individual models in the <code>merge_features_and_targets_for_individual_model</code> rule in <code>rules/models.smk</code>. Additionally, we merge the cleaned features, target labels, and demographic features of our two participants for the population model in the <code>merge_features_and_targets_for_population_model</code> rule in <code>rules/models.smk</code>. These two merged files are the input for our individual and population models. </p>
|
<details class="info"><summary>7. Merge features and targets.</summary><p>In this step we merge the cleaned features and target labels for our individual models in the <code>merge_features_and_targets_for_individual_model</code> rule in <code>rules/models.smk</code>. Additionally, we merge the cleaned features, target labels, and demographic features of our two participants for the population model in the <code>merge_features_and_targets_for_population_model</code> rule in <code>rules/models.smk</code>. These two merged files are the input for our individual and population models. </p>
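<p>As a rough sketch of the two merges described in this step, the pandas code below joins features and targets per participant and then stacks participants and attaches demographic columns for the population model. It is not the implementation of the <code>merge_features_and_targets_for_*</code> rules: the function names, the column names (<code>pid</code>, <code>local_date</code>, <code>target</code>, <code>age</code>), and the demo values are hypothetical.</p>

```python
import pandas as pd


def merge_for_individual_model(features, targets):
    """Join one participant's cleaned features with their target labels by day."""
    return features.merge(targets, on="local_date", how="inner")


def merge_for_population_model(per_participant, demographics):
    """Stack every participant's merged table and attach demographic features."""
    frames = []
    for pid, (features, targets) in per_participant.items():
        merged = merge_for_individual_model(features, targets)
        merged.insert(0, "pid", pid)  # remember which participant each day belongs to
        frames.append(merged)
    population = pd.concat(frames, ignore_index=True)
    # Demographic values are constant within a participant, so join on pid only.
    return population.merge(demographics, on="pid", how="left")


# Hypothetical data for two participants.
p01_features = pd.DataFrame({"local_date": ["d1", "d2"], "steps": [10, 20]})
p01_targets = pd.DataFrame({"local_date": ["d1", "d2"], "target": [1, 0]})
p02_features = pd.DataFrame({"local_date": ["d1"], "steps": [5]})
p02_targets = pd.DataFrame({"local_date": ["d1", "d3"], "target": [1, 0]})
demographics = pd.DataFrame({"pid": ["p01", "p02"], "age": [30, 40]})

population = merge_for_population_model(
    {"p01": (p01_features, p01_targets), "p02": (p02_features, p02_targets)},
    demographics,
)
print(population)
```

<p>The inner join keeps only days that have both features and a target label, which is an assumption of this sketch; a day with a label but no cleaned features (d3 for p02 above) is dropped.</p>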
|
||||||