Is there a way to use a median style filter to reduce spikes over a certain amplitude?

14 views (last 30 days)
I have a vector with timecourse (1 Hz) data that gradually rises then levels over time. The data has some natural variability and occasional artifacts that appear as one or two sample peaks or valleys. The variability is useful, but I want to get rid of the artifacts. A 'rule' of the data is that changes of more than 2.6 between adjacent samples cannot possibly be valid, and must therefore be artifacts. I have tried a first order Butterworth at .2, and median filter with 3 samples, and they both get rid of the spikes but flatten the variability. In this instance, the median gives a better result. As I understand things, band-pass filters don't make sense since specific amplitudes are not a problem per se, but only if they differ too greatly from their immediate neighbours.
What I would like is some kind of moving window filter that would apply only to amplitude changes more than the 2.6 allowable, but not flatten smaller spikes.
so in the following example, the 160 in the middle and the 148 at the end are suspects, but the rest of the data should just be left alone.
HR=[136 137 138 140 141 142 143 143 144 144 145 146 160 144 143 143 142 143 143 144 145 146 145 148];
[b,a]=butter(1,.2); %1st order butterworth,
butterHR=filter(b,a,HR);
gives
butterHR=[33.35 83.94 110.21 124.33 132.26 136.79 139.59 141.26 142.36 143.16 143.82 144.64 148.74 150.34 146.99 145.03 143.79 143.16 143.08 143.29 143.88 144.68 145.08 145.78]
and
medHR=medfilt1(HR,3); % 3 points averaged
gives
medHR=[136 137 138 140 141 142 143 143 144 144 145 146 146 144 143 143 143 143 143 144 145 145 146 145];
The butter smooths everything a little too much, and the median does a great job of getting rid of the 160 and 148, but it also changes samples 18, 22 and 23. In my perfect world, I would leave those alone.
Any thoughts?

Accepted Answer

Star Strider
Star Strider on 9 Jul 2015
Another option is to delete the out-of-range values and interpolate (or extrapolate) to fill them:
HR=[136 137 138 140 141 142 143 143 144 144 145 146 160 144 143 143 142 143 143 144 145 146 145 148];
eHR = HR; % ‘HR’ Vector For Interpolation
ix1 = 1:length(HR); % Original Index Vector
eix1 = ix1; % Index Vector For Interpolation
dHR = [0 diff(HR)]; % Differences Between Adjacent Elements
Q1 = (dHR > 2.6);
eHR(dHR > 2.6) = []; % Delete Data With Differences > 2.6
eix1(dHR > 2.6) = []; % Delete Corresponding Indices
HR_new = interp1(eix1, eHR, ix1, 'linear', 'extrap'); % Interpolate To Fill Deleted ‘HR’ Values
new_ix1 = setdiff(ix1, eix1); % To Plot Interpolated Points
figure(1)
plot(ix1, HR, 'bp')
hold on
plot(new_ix1, HR_new(new_ix1), 'gp') % Interpolated Points
plot(ix1, HR_new, '-g')
hold off
grid
legend('Original Data', 'Interpolated Data')
This is my preference with my data, but your requirements may be different. The HR_new assignment contains the original data and interpolated values. (The ‘new_ix1’ assignment and its associated plot call are simply there to display the interpolated values, and are not necessary for the code.)
  2 Comments
Blake
Blake on 23 Jul 2015
Star Strider, thanks. That is pretty much the answer I eventually came up with too, but unlike your elegant solution in section two, I used a loop to assemble the values to eliminate and interpolate, since there are a few conditionals that I needed to account for. (I may go back and see if there is a way I can use your idea to achieve the same end without a loop.)
I used cubic interpolation in my case because it fit the expected data much more exactly.
Thanks again!
Star Strider
Star Strider on 23 Jul 2015
As always, my pleasure!
You might well be able to vectorise your conditionals, since MATLAB makes that relatively straightforward. I’ll help as I can.

Sign in to comment.

More Answers (1)

Thorsten
Thorsten on 8 Jul 2015
HR=[136 137 138 140 141 142 143 143 144 144 145 146 160 144 143 143 142 143 143 144 145 146 145 148];
dHR = diff([HR(1) HR])
plot(HR(dHR<2.6))
  2 Comments
Blake
Blake on 8 Jul 2015
Thorsten, that is cool and very elegant, however it removes the samples, which I can't do because each sample represents time. So the 160 and 148 should be changed to a median of their neighbouring values.
Thorsten
Thorsten on 9 Jul 2015
First find the outliers and set them to NaN
dHR = diff([HR(1) HR])
HR(dHR >= 2.6) = NaN;
You could only plot HR now; the size remains intact, but the there are white holes at the outliers. To fill these with the median of the neighboring values, we compute the position of the neighbors:
ind = find(isnan(HR))';
indx = [ind-1, ind+1]; % neighbor indices
indx(indx<1) = 1; % make sure the indices are > 0
indx(indx>numel(HR)) = numel(HR); % ... and not higher than the number of elements in HR
Replace the NaN with the median of the neighbors.
HR(ind) = nanmedian(HR(indx), 2);

Sign in to comment.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!