207 lines
11 KiB
Markdown
207 lines
11 KiB
Markdown
# Calculating features
|
|
|
|
## Usage
|
|
- Install the library with:
|
|
```
|
|
pip install pep517
|
|
python -m pep517.build .
|
|
|
|
Alternative:
|
|
pip install build
|
|
python -m build
|
|
```
|
|
- Basic usage is:
|
|
```
|
|
from calculatingfeatures.CalculatingFeatures.helper_functions import convert1DEmpaticaToArray, convertInputInto2d, frequencyFeatureNames, hrvFeatureNames
|
|
from calculatingfeatures.CalculatingFeatures.calculate_features import calculateFeatures
|
|
import pandas as pd
|
|
|
|
pathToHrvCsv = "example_data/S2_E4_Data/BVP.csv"
|
|
windowLength = 500
|
|
|
|
# get an array of values from HRV empatica file
|
|
hrv_data, startTimeStamp, sampleRate = convert1DEmpaticaToArray(pathToHrvCsv)
|
|
|
|
# Convert the HRV data into 2D array
|
|
hrv_data_2D = convertInputInto2d(hrv_data, windowLength)
|
|
|
|
# Create a list with feature names
|
|
featureNames = []
|
|
featureNames.extend(hrvFeatureNames)
|
|
featureNames.extend(frequencyFeatureNames)
|
|
|
|
pd.set_option('display.max_columns', None)
|
|
|
|
# Calculate features
|
|
calculatedFeatures = calculateFeatures(hrv_data_2D, fs=int(sampleRate), featureNames=featureNames)
|
|
```
|
|
- More usage examples are located in **usage_examples.ipynb** file
|
|
|
|
## Features
|
|
- Features are returned (from calculateFeatures() function) in a Pandas DataFrame object.
|
|
- In the case if a feature couldn't be calculated (for example, if input signal is invalid), NaN value is returned.
|
|
- Further in this section, the list with descriptions of all possible features is presented.
|
|
|
|
### GSR features:
|
|
These features are useful for 1D GSR(EDA) signals
|
|
- `mean`: mean of the signal
|
|
- `std`: standard deviation of signal
|
|
- `q25`: 0.25 quantile
|
|
- `q75`: 0.75 quantile
|
|
- `qd`: q75 - q25
|
|
- `deriv`: sum of gradients of the signal
|
|
- `power`: power of the signal (mean of squared signal)
|
|
- `numPeaks`: number of EDA peaks
|
|
- `ratePeaks`: average number of peaks per second
|
|
- `powerPeaks`: power of peaks (mean of signal at indexes of peaks)
|
|
- `sumPosDeriv`: sum of positive derivatives divided by number of all derivatives
|
|
- `propPosDeriv`: proportion of positive derivatives per all derivatives
|
|
- `derivTonic`: sum of gradients of the tonic
|
|
- `sigTonicDifference`: mean of tonic subtracted from signal
|
|
- `freqFeats`:
|
|
- `maxPeakAmplitudeChangeBefore`: maximum peak amplitude change before peak
|
|
- `maxPeakAmplitudeChangeAfter`: maximum peak amplitude change after peak
|
|
- `avgPeakAmplitudeChangeBefore`: average peak amplitude change before peak
|
|
- `avgPeakAmplitudeChangeAfter`: average peak amplitude change after peak
|
|
- `avgPeakChangeRatio`: avg_peak_increase_time / avg_peak_decrease_time
|
|
- `maxPeakIncreaseTime`: maximum peak increase time
|
|
- `maxPeakDecreaseTime`: maximum peak decrease time
|
|
- `maxPeakDuration`: maximum peak duration
|
|
- `maxPeakChangeRatio`: max_peak_increase_time / max_peak_decrease_time
|
|
- `avgPeakIncreaseTime`: average peak increase time
|
|
- `avgPeakDecreaseTime`: average peak decreade time
|
|
- `avgPeakDuration`: average peak duration
|
|
- `maxPeakResponseSlopeBefore`: maximum peak response slope before peak
|
|
- `maxPeakResponseSlopeAfter`: maximum peak response slope after peak
|
|
- `signalOverallChange`: maximum difference between samples (max(sig)-min(sig))
|
|
- `changeDuration`: duration between maximum and minimum values
|
|
- `changeRate`: change_duration / signal_overall_change
|
|
- `significantIncrease`:
|
|
- `significantDecrease`:
|
|
|
|
### HRV features:
|
|
These features are useful for 1D HRV(BVP) signals.
|
|
|
|
If number of RR intervals (numRR) is less than `length of sample / (2 * sampling rate)` (30 BPM) or greater than `length of sample / (sampling rate / 4)` (240 BPM), BPM value is incorrect and thus, all other HRV features are set to NaN.
|
|
|
|
- `meanHr`: mean heart rate
|
|
- `ibi`: mean interbeat interval
|
|
- `sdnn`: standard deviation of the ibi
|
|
- `sdsd`: standard deviation of the differences between all subsequent R-R intervals
|
|
- `rmssd`: root of the mean of the list of squared differences
|
|
- `pnn20`: the proportion of NN20 intervals to all intervals
|
|
- `pnn50`: the proportion of NN50 intervals to all intervals
|
|
- `sd`:
|
|
- `sd2`:
|
|
- `sd1/sd2`: sd / sd2 ratio
|
|
- `numRR`: number of RR intervals
|
|
|
|
### Accelerometer features:
|
|
These features are useful for 3D signals from accelerometer
|
|
- `meanLow`: mean of low-pass filtered signal
|
|
- `areaLow`: area under the low-pass filtered signal
|
|
- `totalAbsoluteAreaBand`: sum of absolute areas under the band-pass filtered x, y and z signal
|
|
- `totalMagnitudeBand`: square root of sum of squared band-pass filtered x, y and z components
|
|
- `entropyBand`: entropy of band-pass filtered signal
|
|
- `skewnessBand`: skewness of band-pass filtered signal
|
|
- `kurtosisBand`: kurtosis of band-pass filtered signal
|
|
- `postureDistanceLow`: calculates difference between mean values for a given sensor (low-pass filtered)
|
|
- `absoluteMeanBand`: mean of band-pass filtered signal
|
|
- `absoluteAreaBand`: area under the band-pass filtered signal
|
|
- `quartilesBand`: quartiles of band-pass filtered signal
|
|
- `interQuartileRangeBand`: inter quartile range of band-pass filtered signal
|
|
- `varianceBand`: variance of band-pass filtered signal
|
|
- `coefficientOfVariationBand`: dispersion of band-pass filtered signal
|
|
- `amplitudeBand`: difference between maximum and minimum sample of band-pass filtered signal
|
|
- `totalEnergyBand`: total magnitude of band-pass filtered signal
|
|
- `dominantFrequencyEnergyBand`: ratio of energy in dominant frequency
|
|
- `meanCrossingRateBand`: the number of signal crossings with mean of band-pass filtered signal
|
|
- `correlationBand`: Pearson's correlation between band-pass filtered axis
|
|
- `quartilesMagnitudesBand`: quartiles at 25%, 50% and 75% per band-pass filtered signal
|
|
- `interQuartileRangeMagnitudesBand`: interquartile range of band-pass filtered signal
|
|
- `areaUnderAccelerationMagnitude`: area under acceleration magnitude
|
|
- `peaksDataLow`: number of peaks, sum of peak values, peak avg, amplitude avg
|
|
- `sumPerComponentBand`: sum per component of band-pass filtered signal
|
|
- `velocityBand`: velocity of the band-pass filtered signal
|
|
- `meanKineticEnergyBand`: mean kinetic energy 1/2*mV^2 of band-pass filtered signal
|
|
- `totalKineticEnergyBand`: total kinetic energy 1/2*mV^2 for all axes (band-pass filtered)
|
|
- `squareSumOfComponent`: squared sum of component
|
|
- `sumOfSquareComponents`: sum of squared components
|
|
- `averageVectorLength`: mean of magnitude vector
|
|
- `averageVectorLengthPower`: square mean of magnitude vector
|
|
- `rollAvgLow`: maximum difference of low-pass filtered roll samples
|
|
- `pitchAvgLow`: maximum difference of low-pass filtered pitch samples
|
|
- `rollStdDevLow`: standard deviation of roll (calculated from low-pass filtered signal)
|
|
- `pitchStdDevLow`: standard deviation of pitch (calculated from low-pass filtered signal)
|
|
- `rollMotionAmountLow`: amount of wrist roll (from low-pass filtered signal) motion
|
|
- `rollMotionRegularityLow`: regularity of wrist roll motion
|
|
- `manipulationLow`: manipulation of low-pass filtered signals
|
|
- `rollPeaks`: number of roll peaks, sum of roll peak values, roll peak avg, roll amplitude avg
|
|
- `pitchPeaks`: number of pitch peaks, sum of pitch peak values, pitch peak avg, pitch amplitude avg
|
|
- `rollPitchCorrelation`: correlation between roll and peak (obtained from low-pass filtered signal)
|
|
|
|
### Gyroscope features:
|
|
These features are useful for 3D signals from gyroscope
|
|
- `meanLow`: mean of low-pass filtered signal
|
|
- `areaLow`: area under the low-pass filtered signal
|
|
- `totalAbsoluteAreaLow`: sum of absolute areas under the low-pass filtered x, y and z signal
|
|
- `totalMagnitudeLow`: square root of sum of squared band-pass filtered x, y and z components
|
|
- `entropyLow`: entropy of low-pass filtered signal
|
|
- `skewnessLow`: skewness of low-pass filtered signal
|
|
- `kurtosisLow`: kurtosis of low-pass filtered signal
|
|
- `quartilesLow`: quartiles of low-pass filtered signal
|
|
- `interQuartileRangeLow`: inter quartile range of low-pass filtered signal
|
|
- `varianceLow`: variance of low-pass filtered signal
|
|
- `coefficientOfVariationLow`: dispersion of low-pass filtered signal
|
|
- `amplitudeLow`: difference between maximum and minimum sample of low-pass filtered signal
|
|
- `totalEnergyLow`: total magnitude of low-pass filtered signal
|
|
- `dominantFrequencyEnergyLow`: ratio of energy in dominant frequency
|
|
- `meanCrossingRateLow`: the number of signal crossings with mean of low-pass filtered signal
|
|
- `correlationLow`: Pearson's correlation between low-pass filtered axis
|
|
- `quartilesMagnitudeLow`: quartiles at 25%, 50% and 75% per low-pass filtered signal
|
|
- `interQuartileRangeMagnitudesLow`: interquartile range of band-pass filtered signal
|
|
- `areaUnderMagnitude`: area under magnitude
|
|
- `peaksCountLow`: number of peaks in low-pass filtered signal
|
|
- `averageVectorLengthLow`: mean of low-pass filtered magnitude vector
|
|
- `averageVectorLengthPowerLow`: square mean of low-pass filtered magnitude vector
|
|
|
|
|
|
### Generic features:
|
|
These are generic features, useful for many different types of signals
|
|
- `autocorrelations`: autocorrelations of the given signal with lags 5, 10, 20, 30, 50, 75 and 100
|
|
- `countAboveMean`: number of values in signal that are higher than the mean of signal
|
|
- `countBelowMean`: number of values in signal that are lower than the mean of signal
|
|
- `maximum`: maximum value of the signal
|
|
- `minimum`: minimum value of the signal
|
|
- `meanAbsChange`: the mean of absolute differences between subsequent time series values
|
|
- `longestStrikeAboveMean`: longest part of signal above mean
|
|
- `longestStrikeBelowMean`: longest part of signal below mean
|
|
- `stdDev`: standard deviation of the signal
|
|
- `median`: median of the signal
|
|
- `meanChange`: the mean over the differences between subsequent time series values
|
|
- `numberOfZeroCrossings`: number of crossings of signal on 0
|
|
- `absEnergy`: the absolute energy of the time series which is the sum over the squared values
|
|
- `linearTrendSlope`: a linear least-squares regression for the values of the time series versus the sequence from 0 to length of the time series minus one
|
|
- `ratioBeyondRSigma`: ratio of values that are more than r*std(x) (so r sigma) away from the mean of signal. r in this case is 2.5
|
|
- `binnedEntropy`: entropy of binned values
|
|
- `numOfPeaksAutocorr`: number of peaks of autocorrelations
|
|
- `numberOfZeroCrossingsAutocorr`: number of crossings of autocorrelations on 0
|
|
- `areaAutocorr`: area under autocorrelations
|
|
- `calcMeanCrossingRateAutocorr`: the number of autocorrelation crossings with mean
|
|
- `countAboveMeanAutocorr`: umber of values in signal that are higher than the mean of autocorrelation
|
|
- `sumPer`: sum per component
|
|
- `sumSquared`: squared sum per component
|
|
- `squareSumOfComponent`: square sum of component
|
|
- `sumOfSquareComponents`:sum of square components
|
|
|
|
### Frequency features:
|
|
These are frequency features, useful for many different types of signals. The signal is converted to power spectral density signal and features are calculated on this signal
|
|
- `fqHighestPeakFreqs`: three frequencies corresponding to the largest peaks added to features
|
|
- `fqHighestPeaks`: three largest peaks added to features
|
|
- `fqEnergyFeat`: energy calculated as the sum of the squared FFT component magnitudes, and normalized
|
|
- `fqEntropyFeat`: entropy of the FFT of the signal
|
|
- `fqHistogramBins`: Binned distribution (histogram)
|
|
- `fqAbsMean`: absolute mean of the raw signal
|
|
- `fqSkewness`: skewness of the power spectrum of the data
|
|
- `fqKurtosis`: kurtosis of the power spectrum of the data
|
|
- `fqInterquart`: inter quartile range of the raw signal |