Reduce density of a time series

I have a time series a of river discharge with over 700,000 points. I'd like to "clean" the data and delete all irrelevant points (points which share almost the same value with their neighbouring points). I imagined a code which compares two values (e.g. a_i - a_i+1) and if the result is below a certain threshold the second value gets deleted.
I already tried functions like downsampling, but I am afraid that I will lose important information about minima and maxima. Is there perhaps a link somewhere that addresses this question?
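For illustration, here is a minimal sketch of the thresholding idea described in the question. It is one possible variant, not an established method: it compares each sample against the last kept sample rather than its immediate neighbour, so a slow drift made of many tiny steps is not deleted entirely. The example values and the tolerance tol are made up.

```matlab
% Sketch only: keep a sample when it differs from the last KEPT sample
% by at least tol. The data and the value of tol are made up.
a   = [10 10.01 10.02 10.5 12 12.01 11.99 9 9.005 9.01]';  % example discharge values
tol = 0.1;                       % hypothetical threshold, same units as a

keep    = false(size(a));
keep(1) = true;                  % always keep the first sample
last    = a(1);                  % value of the most recently kept sample
for k = 2:numel(a)
    if abs(a(k) - last) >= tol
        keep(k) = true;
        last    = a(k);
    end
end
keep(end) = true;                % keep the final sample so the end of the record survives

aThin = a(keep);                 % thinned values
idx   = find(keep);              % original indices (the spacing is no longer uniform)
```

With this rule every removed sample lies within tol of the kept sample before it, so minima and maxima are preserved to within tol. The result is non-uniformly sampled, so the indices (or time stamps) have to be stored alongside the values.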

3 Comments

There are lots of ways to do this, but they depend on the characteristics of your signal. It may be the case that downsampling, smoothing, or other time-domain filtering is appropriate. Or it could be frequency-domain filtering. I wouldn't use an algorithm like the one you suggested, where arbitrary data points get deleted, because that would lead to non-uniform sampling, and it's always nice to have uniform sampling if possible.
Could you post the results of the following code using your data? X is your time series and fs is the sampling rate. If you don't know the sampling rate, but you have a uniform time vector, set fs = 1/mean(diff(timevector));
function plotFFT(X,fs)
T = 1/fs; % sampling period
Y = fft(X); % fourier transform
L = length(X); % length of signal
t = (0:L-1)*T; % time vector
f = (0:L-1)/L * fs; % frequency vector
f(ceil(L/2)+1:end) = f(ceil(L/2)+1:end)-fs;
% get positive frequencies only
ff = f(1:ceil(L/2));
YY = abs(Y(1:ceil(L/2)));
figure
subplot(2,1,1)
plot(t,X)
xlabel('Time [s]')
ylabel('Signal')
axis tight
subplot(2,1,2)
plot(ff,YY)
xlabel('Frequency [Hz]')
ylabel('|P(f)|')
axis tight
end
Hello,
thanks for the help. The plotted result is attached (figure.png).
The values in the second plot range from about 10^10 (at the beginning) down to about 10^5 (shortly after).
Daniel M on 12 Nov 2019
Edited: Daniel M on 12 Nov 2019
Actually, if you don't mind, can you run it again, but this time change the bottom plot to semilogy and detrend your data when you input it, using detrend(X)?
Also, it wouldn't hurt if you could draw or otherwise indicate what you would like your output signal to look like. Is it just a smoother version of the original?
(Or I could look at your data if you uploaded it.)
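The requested change could look roughly like the sketch below. It uses a synthetic signal (an offset, a linear trend, and a 0.05 Hz oscillation, all made up) in place of the real discharge data, and the sampling rate fs = 1 is likewise an assumption. The zero-frequency bin of the FFT is the sum of all samples, so a large mean or trend can dominate a linear-scale plot. Removing the trend and using a logarithmic axis makes the rest of the spectrum easier to read.

```matlab
% Sketch with synthetic data; replace X and fs with the real series and rate.
fs = 1;                                   % assumed sampling rate [Hz]
L  = 1000;                                % number of samples
t  = (0:L-1)'/fs;                         % time vector [s]
X  = 5 + 0.01*t + sin(2*pi*0.05*t);       % offset + linear trend + 0.05 Hz oscillation

Xd    = detrend(X);                       % remove the best-fit straight line
Y     = fft(Xd);                          % Fourier transform of the detrended data
nHalf = ceil(L/2);                        % number of non-negative frequency bins
ff    = (0:nHalf-1)'/L*fs;                % frequencies [Hz]
YY    = abs(Y(1:nHalf));                  % magnitude spectrum

figure
semilogy(ff, YY)                          % logarithmic y-axis
xlabel('Frequency [Hz]')
ylabel('|Y(f)|')
axis tight
```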


Answers (1)

Adam Danz on 11 Nov 2019
Edited: Adam Danz on 12 Nov 2019
" I imagined a code which compares two values (e.g. a_i - a_i+1) and if the result is below a certain threshold the second value gets deleted."
The diff() function does the first half of your description. I'll assume your dates are stored in a column vector named "dates" and they are in datetime format. If they are not in datetime format some very small modifications will need to be made.
threshold = hours(1); % can be any duration: minutes(30), hours(12), days(2), etc.
rm = [false; abs(diff(dates)) < threshold]; % logical vector of datetimes to remove
datesClean = dates(~rm);
% OR
% dates(rm) = []; % to keep variable name
If you're working with a timetable, you'll apply the rm vector to the rows of the table.
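As a rough sketch of the timetable case, assuming a timetable with a single data variable: the names tt, discharge, and ttClean and all values below are hypothetical. The row times are read through tt.Properties.RowTimes, so the code does not depend on what the time dimension is called.

```matlab
% Sketch with made-up data; variable names are hypothetical.
dates     = datetime(2019,11,11,0,0,0) + minutes([0 10 90 100 200])';
discharge = [3.1 3.1 3.4 3.4 2.9]';
tt        = timetable(dates, discharge);

threshold = hours(1);                            % minimum spacing to the previous row
rowTimes  = tt.Properties.RowTimes;              % datetime vector of row times
rm = [false; abs(diff(rowTimes)) < threshold];   % rows closer than threshold to the previous row
ttClean = tt(~rm, :);                            % keep the remaining rows, all variables
```

Here the second and fourth rows are removed because each follows its predecessor by only 10 minutes.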

2 Comments

This idea should work even if OP is not talking about a timeseries object, but just regular data as a function of time.
Adam Danz on 11 Nov 2019
Edited: Adam Danz on 11 Nov 2019
Yeah, it's been my experience that folks in this forum use the term "time series" to describe their data in general far more often than to refer to MATLAB's timeseries objects. Unless they specifically indicate one or the other, I assume they are speaking more generally. Good point!


Release

R2019b

Asked:

on 11 Nov 2019

Edited:

on 12 Nov 2019
