How to remove artifacts from a signal
Show older comments
Hello,
I would like to remove artifacts from a signal, typically a lack of information in a thermocouple signal due to a shortcut. In fact I don't want to calculate the moving average because it would alter the "good information" and extrapolate data that don't exist. But I want to detect wrong information (artifacts) based on this moving average, and replace it by NaN or 0.

I tried to code it, but it is not efficient at all (based on for loops), and I am sure I am not the first one to do this ;)
Thanks a lot for your help!
Max
Accepted Answer
More Answers (4)
Why not simply:
d = [0; diff( signal )] ;
isValid = ~logical( cumsum( -sign( d ) .* (abs( d ) > 10) )) ;
then e.g. set invalid entries to NaN
signalCut = signal ;
signalCut(~isValid) = NaN ;
or interpolate:
t = 1 : numel( signal ) ;
signalClean = interp1( t(isValid), signal(isValid), t ) ;
This is a quick technical trick that you have to tweak and check visually. It is not robust (in comparison to the two other answers), but if all you want to do is a one shot elimination of the glitches ..
With that you get:
subplot( 3, 1, 1) ; plot( signal, 'b', 'LineWidth', 2 ) ;
grid ; title( 'Signal' ) ;
subplot( 3, 1, 2 ) ; plot( isValid, 'r', 'LineWidth', 2 ) ;
grid ; title( '"is valid"' ) ;
subplot( 4, 1, 3) ; plot( signalCut, 'g', 'LineWidth', 2 ) ;
grid ; title( 'Cut signal' ) ;
subplot( 4, 1, 4) ; plot( signalClean, 'm', 'LineWidth', 2 ) ;
grid ; title( 'Cleaned signal' ) ;

Star Strider
on 23 Sep 2017
Try this:
D = load('signal.mat');
signal = D.signal;
t = (0:length(signal)-1)'; % Create Time Vector
p = polyfit(t, signal, 2); % Polynomial Fit For Detrending
dtrnd_sig = signal - polyval(p, t); % Detrend
t2 = t; % Temporary Time Vector
signal(dtrnd_sig < -100) = []; % Set Signal ‘Dips’ To ‘Empty’
t2(dtrnd_sig < -100) = []; % Set Corresponding Time Of ‘Dips’ To ‘Empty’
signal_new = interp1(t2, signal, t, 'linear'); % Interpolate To Fill Missing Data
figure(1)
plot(t,signal_new,'-r')
There may be other ways to accomplish this objective. I used linear interpolation, 'spline' is another option you may want to experiment with, as is the Signal Processing Toolbox resample function.
7 Comments
Star Strider
on 23 Sep 2017
My usual approach to problems such as yours is to design a frequency-selective filter, probably a lowpass filter. That is probably the best solution for your signal, since the signal itself appears to have a very low frequency component, with the ‘dropouts’ being high frequency.
I would have done this for your signal, however discrete filter design requires either a corresponding time vector, or a known sampling frequency (assuming a regularly-sampled signal).
The filter approach is likely the best option.
I can design and test a filter for your signal if you provide the sampling frequency of your signal, and a typical complete signal rather than a sample.
maxroucool
on 24 Sep 2017
Star Strider
on 24 Sep 2017
This was something of an adventure! My filter idea didn’t work as well as I would have liked, so I went with the interpolation strategy.
The Code —
filename = 'signalUpdate.mat';
D = load(filename);
signal = D.signal;
N = size(signal,1);
t = (0:N-1)'; % Create Time Vector (sec)
nnan = nnz(isnan(signal)); % Number Of ‘NaN’ Values (Diagnostic Only)
nzeo = nnz(signal == 0); % Number Of Zero Values (Diagnostic Only)
figure(1)
plot(t, signal) % Time Domain Plot
grid
nr_sig = size(signal,2);
for k1 = 1:nr_sig
Csignal{k1} = signal(~isnan(signal(:,k1)),k1); % Create Cell Arrays For Signals
Ct{k1} = t(~isnan(signal(:,k1))); % Time Vector Cell Array
end
Threshold = 50;
for k1 = 1:nr_sig
Msignal = Csignal{k1}; % Create Copy
MTime = Ct{k1}; % Create Copy
Mask = Msignal < Threshold;
Msignal(Mask) = []; % Set Signal ‘<Threshold’ To ‘[]’
MTime(Mask) = []; % Set Time ‘<Threshold’ To ‘[]’
CIsignal{k1} = interp1(MTime, Msignal, Ct{k1}, 'linear'); % Interpolate
end
figure(2)
for k1 = 1:nr_sig
subplot(nr_sig, 1, k1)
plot(Ct{k1}, CIsignal{k1})
grid
end
This creates a copy of the original data in cell array ‘Csignal’, and then interpolates it to create a new cell array ‘CIsignal’ with the discontinuities removed. The result in ‘CIsignal’ appears to be what you want. The ‘Ct’ cell arrays are the corresponding time vectors for ‘Csignal’ and ‘CIsignal’.
The cell arrays have the advantage here of containing only the data, and not the NaN values at the end (apparently to create equal-length vectors for the matrix).
My code is robust to the data you provided. I assume they are representative of all your data. If so, it should be robust to your other data as well.
Note that the ‘nnan’ and ‘nzro’ are for information only and are not necessary for the code to run. I used them to learn some of the characteristics of your data. You can delete them. I left them in my posted code in the event you need them.
You can create a function from my code if you want to. See the documentation on Function Basics (link) for details. I will help as needed.
Image Analyst
on 24 Sep 2017
Max, let's remove the ambiguity. Do you want to
- "replace it by NaN or 0" which you asked for, OR
- fix/repair the bad places by interpolation, which might be more useful?
#1 will either show gaps (no plotting) if you replaced by nan, or go (plot) all the way down to the x axis if you replaced by 0. #2 will plot in the "bad regions" a smooth signal. I did #1 while Star did #2. Which way do you want it, Max?
Star Strider
on 24 Sep 2017
Actually, my code eliminates the artefacts first, (creating ‘Msignal’ and the corresponding ‘MTime’ that are temporary and could be saved if desired), then uses those to do the interpolation.
There is nothing statistically ‘wrong’ with interpolating data if there are abrupt discontinuities due to ‘dropped’ data, and the underlying trend is neither significantly noisy nor changing rapidly. Those are true for the data presented, so interpolation is acceptable here.
maxroucool
on 24 Sep 2017
Star Strider
on 24 Sep 2017
Edited: Star Strider
on 25 Sep 2017
Our pleasure!
If you do not want to interpolate, change this part of my code to output the non-interpolated data and time vectors as ‘SplitSignal’:
for k1 = 1:nr_sig
Msignal = Csignal{k1}; % Create Copy
MTime = Ct{k1}; % Create Copy
Mask = Msignal < Threshold;
Msignal(Mask) = []; % Set Signal ‘<Threshold’ To ‘[]’
MTime(Mask) = []; % Set Time ‘<Threshold’ To ‘[]’
CIsignal{k1} = interp1(MTime, Msignal, Ct{k1}, 'linear'); % Interpolate
SplitSignal{k1} = [MTime, Msignal]; % Store Time & Signal Without Interpolation
end
Image Analyst
on 23 Sep 2017
Edited: Image Analyst
on 23 Sep 2017
You can detect "bad" elements either by computing the MAD and then thresholding to identify them, and then replacing them with zero or NAN (whichever you want)
badIndexes = abs(signal - conv(signal, ones(1, windowWidth)/windowWidth, 'same')) > 10;
signal(badIndexes) = nan; % or 0
2 Comments
Image Analyst
on 23 Sep 2017
Two lines where I blur and subtract are too complex? Let me make it longer where I make intermediate signals where you can see what's going on. Bad locations are nans, as you requested, so those points are not plotted so that's why there are "gaps" in the curve, but that's what you asked for.
s = load('signal.mat')
signal = s.signal;
subplot(4, 1, 1);
plot(signal, 'b-');
grid on;
% Now fix the signal.
windowWidth = 231;
kernel = ones(1, windowWidth) / windowWidth;
blurredSignal = conv(signal, kernel, 'same');
subplot(4, 1, 2);
plot(blurredSignal, 'b-');
grid on;
mad = abs(signal - blurredSignal);
subplot(4, 1, 3);
plot(mad, 'b-');
grid on;
badIndexes = mad > 10;
fixedSignal = signal; % Initialize.
fixedSignal(badIndexes) = nan; % or 0
subplot(4, 1, 4);
plot(fixedSignal, 'b-');
grid on;

maxroucool
on 24 Sep 2017
maxroucool
on 23 Sep 2017
0 votes
1 Comment
Jan
on 27 Sep 2017
@maxroucool: Please post comments in the section for comments. It is not clear, to which answer this comment belongs.
Categories
Find more on Vibration Analysis in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!