How to remove jumps and bring the data down?

I need a way to deal with spikes and jumps in the data.
The spikes should be replaced with NaNs and the data after the jump should be brought down to the earlier data level.
Neither the 'hampel' function nor 'deleteoutliers' worked.
I have attached a sample data set.
Please let me know how to deal with such a data set.
Thank you.

1 Comment

My favorite method of dealing with jumps is to
1. Standardize to zero-mean/unit-variance
2. Check points with first-differences greater than a threshold
3. Replace the outliers using the mean of the surrounding data points.
Hope this is helpful
Greg
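
The three steps above could be sketched roughly as follows. This is only an illustration on synthetic data, not the poster's signal; the threshold and neighborhood width (threshold, halfWidth) are hypothetical values chosen for the example. A point is treated as a spike only when the first difference is large on both sides of it, so a sustained jump is deliberately left alone here.

```matlab
% Synthetic stand-in for the signal (assumption: not the attached data).
rng(0);
x = 0.1 * randn(200, 1);
x(50) = 8;             % isolated spike
x(120) = -6;           % isolated spike

% 1. Standardize to zero mean and unit variance.
z = (x - mean(x)) / std(x);

% 2. Flag points whose first differences on BOTH sides exceed a threshold.
threshold = 3;         % hypothetical value; tune for your data
d = abs(diff(z));
isSpike = false(size(z));
isSpike(2:end-1) = d(1:end-1) > threshold & d(2:end) > threshold;

% 3. Replace each flagged point with the mean of its unflagged neighbors.
halfWidth = 3;         % hypothetical neighborhood half-width
y = x;
for k = find(isSpike).'
    lo = max(1, k - halfWidth);
    hi = min(numel(x), k + halfWidth);
    neighbors = lo:hi;
    neighbors = neighbors(~isSpike(neighbors));
    y(k) = mean(x(neighbors));
end
```

To get NaNs instead of replacement values, as asked in the question, step 3 can be reduced to y(isSpike) = NaN.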


Answers (2)

You can try a modified median filter. It replaces the signal with the median value of the signal in a window around any point where the difference between the signal and the median-filtered version of the signal exceeds some threshold. Here is a demo:
clc; % Clear the command window.
close all; % Close all figures (except those of imtool.)
clear; % Erase all existing variables. Or clearvars if you want.
workspace; % Make sure the workspace panel is showing.
format short g;
format compact;
fontSize = 13;
% Read in signal.
s = load('spike_jump.mat')
spike_jump = s.spike_jump;
% Plot the signal.
subplot(4, 1, 1);
plot(spike_jump, 'b*-', 'LineWidth', 2);
grid on;
xlabel('x', 'FontSize', fontSize);
ylabel('spike_jump', 'FontSize', fontSize, 'Interpreter', 'none'); % 'none' keeps the underscore from being rendered as a subscript.
%------------------------------------------------------------------------------
% Set up figure properties:
% Enlarge figure to full screen.
set(gcf, 'Units', 'Normalized', 'OuterPosition', [0, 0.04, 1, 0.96]);
% Get rid of tool bar and pulldown menus that are along top of figure.
% set(gcf, 'Toolbar', 'none', 'Menu', 'none');
% Give a name to the title bar.
set(gcf, 'Name', 'Demo by ImageAnalyst', 'NumberTitle', 'Off')
% Take the median filter of the signal
windowWidth = 75;
filteredSignal = medfilt1(spike_jump, windowWidth);
% Plot the signal.
subplot(4, 1, 2);
plot(filteredSignal, 'b-', 'LineWidth', 2);
grid on;
xlabel('x', 'FontSize', fontSize);
ylabel('filteredSignal', 'FontSize', fontSize);
% Compute the absolute deviation from the median-filtered signal.
% (Named absDev rather than mad to avoid shadowing the mad() function.)
absDev = abs(spike_jump - filteredSignal);
% Plot the signal.
subplot(4, 1, 3);
plot(absDev, 'b-', 'LineWidth', 2);
grid on;
xlabel('x', 'FontSize', fontSize);
ylabel('Absolute Deviation from Median', 'FontSize', fontSize);
% h = histogram(absDev)
% Replace elements with absDev > 20 by the median.
outputSignal = spike_jump; % Initialize
mask = absDev > 20;
outputSignal(mask) = filteredSignal(mask);
% Plot the signal.
subplot(4, 1, 4);
plot(outputSignal, 'b-', 'LineWidth', 2);
grid on;
xlabel('x', 'FontSize', fontSize);
ylabel('Repaired Signal', 'FontSize', fontSize);

2 Comments

Venkata on 26 Dec 2017
Edited: Venkata on 26 Dec 2017
Thank you so much for your detailed explanation.
Is there any possibility to remove the jump and bring the later data down to their earlier data range?
Any approach/method that can be applied only for the data around the jump, if not the entire data set..?
Thank you.
Add these lines to the code:
% Find out indexes where the signal is more than 500.
stepIndexes = outputSignal > 500;
% Find the mean of those values.
meanStep = mean(outputSignal(stepIndexes))
% Assume the mean should be subtracted from all indexes where the value is more than 500.
outputSignal(stepIndexes) = outputSignal(stepIndexes) - meanStep;
% Plot the signal.
subplot(4, 1, 4);
plot(outputSignal, 'b-', 'LineWidth', 2);
grid on;
xlabel('x', 'FontSize', fontSize);
ylabel('Repaired Signal', 'FontSize', fontSize);
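
Subtracting the mean of the upper segment centers that segment on zero. If the earlier data sit at some other level, one possible variant is to subtract only the offset between the two levels. The sketch below uses synthetic data and an assumed fixed threshold (levelThreshold); both are illustrative and not taken from the attached file.

```matlab
% Synthetic stand-in (assumption): baseline near 100, jump of 600 at sample 301.
rng(1);
signal = 100 + randn(500, 1);
signal(301:end) = signal(301:end) + 600;

% Split the samples into "before the step" and "on the step" with a fixed threshold.
levelThreshold = 500;   % hypothetical value, analogous to the 500 used above
stepIndexes = signal > levelThreshold;

% Estimate the offset between the two levels. Medians are used so that any
% leftover spikes influence the estimate less than they would a mean.
offset = median(signal(stepIndexes)) - median(signal(~stepIndexes));

% Bring the stepped portion down to the earlier level.
leveled = signal;
leveled(stepIndexes) = leveled(stepIndexes) - offset;
```

This only works when a single threshold separates the two levels; with several jumps, each segment would need its own offset.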


I tried this with hampel and findchangepts. It seemed to work okay.
% load data attached at top of question
load spike_jump
% get a crude estimate
yest = hampel(spike_jump,40,.5);
plot(yest)
% get breakpoint of shelf
idx = findchangepts(yest);
% correct baseline of left and right portions
% The transition takes a few samples so we will guard by 10 samples
yleft = spike_jump(1:idx) - mean(yest(1:idx-10));
yright = spike_jump(idx+1:end) - mean(yest(idx+10:end));
y = [yleft; yright];
% re-filter and plot
yf = hampel(y,120,.2);
plot(yf);

7 Comments

@Greg Dionne, thank you very much. Yes, it seems to be successful in removing the jumps.
But in places it is altering the data pattern too, I think.
I attached a screenshot showing both the original (left) and corrected (right) data, for your perusal.
Could you please let me know if this is normal?
But Venkata, that is a 100% completely, totally different type of signal!!! That looks nothing at all like the signal you asked us to design the filter for, so why are you surprised?
Hampel filtering works particularly well on high-frequency perturbations of short durations about an otherwise constant signal. If you can somehow come up with a reasonable baseline, you can subtract it off your original signal, run the filter, then add your baseline back.
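
A minimal sketch of that subtract, filter, add-back idea, on synthetic data. The moving-median baseline and all window lengths here are assumptions made for illustration; any reasonable baseline estimate could be substituted. hampel requires the Signal Processing Toolbox.

```matlab
% Synthetic stand-in (assumption): slow trend plus noise plus two short spikes.
rng(2);
t = (1:1000).';
trend = 5 * sin(2*pi*t/500);
x = trend + 0.05 * randn(size(t));
x([200 700]) = x([200 700]) + 4;

% Rough baseline from a long moving median (window length is a guess).
baseline = movmedian(x, 101);

% Remove the baseline so the remainder is roughly constant, which is the
% situation where the Hampel filter works best.
residual = x - baseline;
cleanResidual = hampel(residual, 10, 3);   % 10 neighbors per side, 3 sigma

% Add the baseline back.
y = cleanResidual + baseline;
```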
Perhaps if you gave a little background on your data and what you are trying to do we could suggest something else to try.
@Image Analyst: The signal shown above is part of the same signal. I was checking the effects of the filtering when I observed this.
@Greg Dionne: This is the time series of a potential field measurement at a static location. Those jumps seem to be instrumental errors, and there are many of them. That is why I wanted an algorithm to mitigate their effect, as they obscure the actual signal.
@Image Analyst and @Greg Dionne: Thank you both so much for your valuable time, concern and suggestions. Let me try to come up with a combined approach based on your suggestions.
Up above, Greg Heath recommended a method like mine (modified median filter) but mentioned mean instead of median. For the second type of signal you showed here, you can use a moving polynomial fit, called the Savitzky-Golay filter, done in MATLAB by sgolayfilt() in the Signal Processing Toolbox. For order 1, it's like conv() and a moving mean. This type of filter can "hug" the signal or "smooth" the signal as much or as little as you want. To keep the "good" elements unaltered, you can replace only those elements where the fit is far away from the actual signal, like what Professor Heath recommended. Using the median, though, is less susceptible to outliers. With the mean, the farther away the outlier is, the more it affects the mean, but it won't affect the median, which is an advantage of the median in my opinion.
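
As a rough illustration of replacing only the elements that lie far from a Savitzky-Golay fit, here is a sketch on synthetic data. The polynomial order, frame length, and deviation threshold are hypothetical values picked for this example.

```matlab
% Synthetic stand-in (assumption): slow trend plus noise plus two spikes.
rng(3);
t = (1:1000).';
trend = 5 * sin(2*pi*t/500);
x = trend + 0.05 * randn(size(t));
x([200 700]) = x([200 700]) + 4;

% Moving polynomial fit (Signal Processing Toolbox).
polyOrder = 2;     % hypothetical choice
frameLength = 21;  % must be odd and larger than polyOrder
fitted = sgolayfilt(x, polyOrder, frameLength);

% Replace only the samples that sit far from the fit; leave the rest unaltered.
deviationThreshold = 1;   % hypothetical value; tune for your data
mask = abs(x - fitted) > deviationThreshold;
y = x;
y(mask) = fitted(mask);
```

Because the fit is a weighted mean, the spike itself pulls the fitted value toward it slightly, so the replaced samples are close to the trend but not exactly on it. That is the susceptibility to outliers mentioned above.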
Hopefully the error signal introduced in the instrumentation will be separable enough from the field you are trying to measure. I think Image Analyst's sgolayfilt approach to first remove the trend is worth a try. Let us know how it goes.
If you have a recent copy of MATLAB (R2017a or later) you can also try using filloutliers (which implements part of Greg Heath's approach as well).
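
A short sketch of how filloutliers and its companion isoutlier might be used, on synthetic data. The moving-median window of 25 samples is an assumed value; both functions use their default detection threshold here.

```matlab
% Synthetic stand-in (assumption): noise with two isolated spikes.
rng(4);
x = 0.1 * randn(300, 1);
x([40 180]) = [5; -5];

% Detect outliers relative to a moving median and fill them by linear
% interpolation between neighboring non-outlier samples.
windowLength = 25;   % hypothetical window; tune for your data
filled = filloutliers(x, 'linear', 'movmedian', windowLength);

% Alternatively, mark the outliers as NaN, as requested in the question.
isBad = isoutlier(x, 'movmedian', windowLength);
withNaN = x;
withNaN(isBad) = NaN;
```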
I agree with Image Analyst that the sliding median is more stable than the sliding mean. HOWEVER, the replacement mean to which I referred is the mean computed after outlier removal.
I didn't invent the technique. It was common back in the pre-desktop-computer days, when calculating a median was much more of a pain in the butt than calculating an average.
Greg ( Another wannabe stable genius !)
( WHAAT? ... OPRAH for president? ... SHEESH ! )


Asked: 24 Dec 2017

Edited: 9 Jan 2018
