Document individual functions for event extraction.

communication
junos 2021-07-27 20:56:27 +02:00
parent 9bd42afa02
commit b99136a181
1 changed files with 91 additions and 0 deletions

View File

@ -77,6 +77,21 @@ def extract_stressful_events(df_esm: pd.DataFrame) -> pd.DataFrame:
def calculate_threat_challenge_means(df_esm_sam_clean: pd.DataFrame) -> pd.DataFrame:
"""
This function calculates challenge and threat (two Stress Appraisal Measure subscales) means,
for each ESM session (within participants and devices).
It creates a grouped dataframe with means in two columns.
Parameters
----------
df_esm_sam_clean: pd.DataFrame
A cleaned up dataframe of Stress Appraisal Measure items.
Returns
-------
df_esm_event_threat_challenge_mean_wide: pd.DataFrame
A dataframe of unique ESM sessions (by participants and devices) with threat and challenge means.
"""
# Select only threat and challenge assessments for events
df_esm_event_threat_challenge = df_esm_sam_clean[
(
@ -101,6 +116,30 @@ def calculate_threat_challenge_means(df_esm_sam_clean: pd.DataFrame) -> pd.DataF
def detect_stressful_event(df_esm_sam_clean: pd.DataFrame) -> pd.DataFrame:
"""
Participants were asked: "Was there a particular event that created tension in you?"
The following options were available:
0 - No,
1 - Yes, slightly,
2 - Yes, moderately,
3 - Yes, considerably,
4 - Yes, extremely.
This function indicates whether there was a stressful event (True/False)
and how stressful it was on a scale of 1 to 4.
Parameters
----------
df_esm_sam_clean: pd.DataFrame
A cleaned up dataframe of Stress Appraisal Measure items.
Returns
-------
df_esm_event_stress: pd.DataFrame
The same dataframe with two new columns:
- event_present, indicating whether there was a stressful event at all,
- event_stressfulness, a numeric answer (1-4) to the single item question.
"""
df_esm_event_stress = df_esm_sam_clean[
df_esm_sam_clean["questionnaire_id"] == QUESTIONNAIRE_ID_SAM.get("event_stress")
]
@ -112,6 +151,21 @@ def detect_stressful_event(df_esm_sam_clean: pd.DataFrame) -> pd.DataFrame:
def detect_event_work_related(df_esm_sam_clean: pd.DataFrame) -> pd.DataFrame:
"""
This function simply adds a column indicating the answer to the question:
"Was/is this event work-related?"
Parameters
----------
df_esm_sam_clean: pd.DataFrame
A cleaned up dataframe of Stress Appraisal Measure items.
Returns
-------
df_esm_event_stress: pd.DataFrame
The same dataframe with a new column event_work_related (True/False).
"""
df_esm_event_stress = df_esm_sam_clean[
df_esm_sam_clean["questionnaire_id"]
== QUESTIONNAIRE_ID_SAM.get("event_work_related")
@ -123,6 +177,22 @@ def detect_event_work_related(df_esm_sam_clean: pd.DataFrame) -> pd.DataFrame:
def convert_event_time(df_esm_sam_clean: pd.DataFrame) -> pd.DataFrame:
"""
This function only serves to convert the string datetime answer into a real datetime type.
Errors during this conversion are coerced, meaning that non-datetime answers are assigned Not a Time (NaT).
NOTE: Since the only available non-datetime answer to this question was "0 - I do not remember",
the NaTs can be interpreted to mean this.
Parameters
----------
df_esm_sam_clean: pd.DataFrame
A cleaned up dataframe of Stress Appraisal Measure items.
Returns
-------
df_esm_event_time: pd.DataFrame
The same dataframe with a new column event_time of datetime type.
"""
df_esm_event_time = df_esm_sam_clean[
df_esm_sam_clean["questionnaire_id"] == QUESTIONNAIRE_ID_SAM.get("event_time")
].assign(
@ -134,6 +204,27 @@ def convert_event_time(df_esm_sam_clean: pd.DataFrame) -> pd.DataFrame:
def extract_event_duration(df_esm_sam_clean: pd.DataFrame) -> pd.DataFrame:
"""
If participants indicated a stressful events, they were asked:
"How long did this event last? (Answer in hours and minutes)"
This function extracts this duration time and saves additional answers:
0 - I do not remember,
1 - It is still going on.
Parameters
----------
df_esm_sam_clean: pd.DataFrame
A cleaned up dataframe of Stress Appraisal Measure items.
Returns
-------
df_esm_event_duration: pd.DataFrame
The same dataframe with two new columns:
- event_duration, a time part of a datetime,
- event_duration_info, giving other options to this question:
0 - I do not remember,
1 - It is still going on
"""
df_esm_event_duration = df_esm_sam_clean[
df_esm_sam_clean["questionnaire_id"]
== QUESTIONNAIRE_ID_SAM.get("event_duration")