Extracting Classification Features from Physiological Signals

Open Live Script

This example shows how to use the functions midcross and dtw to extract features from gait signal data. Gait signals are used to study the walking patterns of patients with neurodegenerative disease. The time between strides has been reported to differ between healthy and sick individuals. midcross offers a convenient way to calculate these times. People also change their walking speed over time. dtw provides a convenient way to quantitatively compare the shape of gait signals by warping to align them in time. This example uses midcross to locate each step in a gait signal and dtw to compute distances between gait signal segments. These results are examined as potential features for signal classification. While this example is specific to gait signals, other physiological signals, such as electrocardiogram (ECG) or photoplethysmogram (PPG), can also be analyzed using these functions.

Measure Inter-Stride Time Intervals

The dataset being analyzed contains force data collected during walking for patients with Amyotrophic Lateral Sclerosis (ALS) and a control group. ALS is a disease made famous by Lou Gehrig, Stephen Hawking, and the 2014 'Ice Bucket Challenge'.

Load and plot the first 30 seconds of gait signal data for one patient.

helperGaitPlot('als1m');
xlim([0 30])

This dataset represents the force exerted by a foot on a force sensitive resistor. The force is measured in millivolts. Each record is one minute in length and contains separate channels for the left and right foot of a subject. Each step in the dataset is characterized by a sharp change in force as the foot impacts and leaves the ground. Use midcross to find these sharp changes for an ALS patient.

Use midcross to find and plot the location of each crossing for the left foot of an ALS patient. Chose a tolerance of 25% to ensure that every crossing is detected.

Fs = 300;
gaitSignal = helperGaitImport('als1m');
midcross(gaitSignal(1,:),Fs,'tolerance',25);
xlim([0 30])
xlabel('Sample Number')
ylabel('mV')

midcross correctly identifies the crossings. Now use it to calculate the inter-stride times for a group of ten patients. Five patients are control subjects, and five patients have ALS. Use the left foot record for each patient and exclude the first eight crossings to remove transients.

pnames = helperGaitImport();
for i = 1:10
  gaitSignal = helperGaitImport(pnames{i});
  IND2 = midcross(gaitSignal(1,:),Fs,'Tolerance',25);
  IST{i} = diff(IND2(9:2:end));   
  varIST(i) = var(IST{i});
end

Plot the inter-stride times.

figure
hold on
for i = 1:5
  plot(1:length(IST{i}),IST{i},'.-r')
  plot(1:length(IST{i+5}),IST{i+5},'.-b')
end
xlabel('Stride Number')
ylabel('Time Between Strides (sec)')
legend('ALS','Control')

The variance of the inter-stride times is higher overall for the ALS patients.

Measure Similarity of Walking Patterns

Having quantified the distance between steps, proceed to analyze the shape of the gait signal data independent of these inter-step variations. Compare two segments of the signal using dtw. Ideally, one would compare the shape of the gait signal over time as treatment or disease progresses. Here, we compare two segments of the same record, one segment taken early in the recording (sigsInitialLeft), and the second towards the end (sigsFinalLeft). Each segment contains six steps.

Load the gait signal data segments.

load PNGaitSegments.mat

The patient does not walk at the same rate throughout the record. dtw provides a measure of the distance between segments by warping them to align them in time. Compare the two segments using dtw.

figure
dtw(sigsInitialLeft{1},sigsFinalLeft{1});
legend('Early segment','Later segment','location','southeast')

The two segments are aligned in time. Although the step rate of the patient appears to change over time, as can be seen in the offset of the original signals, dtw matches the two segments by allowing samples of either segment to repeat. The distance via dtw, along with the variance of the inter-stride times, will be explored as features for a gait signal classifier.

Construct a Feature Vector to Classify Signals

Suppose you are building a classifier to decide whether or not a patient is healthy based on a gait signals. Investigate the variance of inter-stride times, feature1, and the distance via dtw between initial and final signal segments, feature2, as classification features.

Feature 1 was previously computed using midcross.

feature1 = varIST;

Extract Feature 2 for the ALS patients and the control group.

feature2 = zeros(10,1);
for i = 1:length(sigsInitialLeft)
  feature2(i) = dtw(sigsInitialLeft{i},sigsFinalLeft{i});
end

Plot the features for ALS subjects and control subjects.

figure
plot(feature1(1:5),feature2(1:5),'r*',...
    feature1(6:10),feature2(6:10),'b+',...
    'MarkerSize',10,'LineWidth',1)
xlabel('Variance of Inter-Stride Times')
ylabel('Distance Between Segments')
legend('ALS','Control')

ALS patients seem to have a larger variance in their inter-stride times, but a smaller distance via dtw between segments. These features compliment each other and can be explored for use in a classifier such as a Neural Network or Support Vector Machine.

Conclusions

midcross and dtw provide a convenient way to compare gait signals and other physiological data which repeat irregularly over time due to different rates of motion or activity. In this example, step times were located using midcross and segment distances were computed using dtw. These were complimentary measures, as dtw removed any time variation that midcross distances would measure. As features, these two metrics showed separation between control and ALS patients for this dataset. midcross and dtw could likewise be used to examine other physiological signals whose shape varies as a function of activity.

References

[1] Goldberger, A. L., L. A. N. Amaral, L. Glass, J. M. Hausdorff, P. Ch. Ivanov, R. G. Mark, R. G. Mietus, G. B. Moody, C.-K. Peng, and H. E. Stanley. "PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals." Circulation. Vol. 101, Number 23, 2000, pp. e215-e200.