Comparing Timeseries to get similar Timeseries based on Euclidean Distance

I have timeseries data in an array which I want to compare in order to build clusters of similar time series.
Generate sample data using the following piece of code:
timeseries = [1, 2, 3, 4; 1, 2, 3, 4; 1, 2, 3, 4; 4, 5, 6, 7; 4, 5, 6, 8; 4, 5, 6, 9; 4, 5, 6, 10];
Here we have 7 time series, where each row represents a time series and each column represents a timestamp.
First, I compute the pairwise Euclidean distances between the time series generated above. This can be done with:
distance = squareform(pdist(timeseries));
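For reference, here is a rough sketch of what that call computes, written without pdist (which needs the Statistics and Machine Learning Toolbox). It assumes R2016b or later for implicit expansion; the names diffs and distance_manual are only illustrative and are not part of the question's code. The result should agree with squareform(pdist(timeseries)) up to floating-point rounding.

```matlab
timeseries = [1, 2, 3, 4; 1, 2, 3, 4; 1, 2, 3, 4; 4, 5, 6, 7; 4, 5, 6, 8; 4, 5, 6, 9; 4, 5, 6, 10];

% Put the rows along dimension 1 in one copy and along dimension 2 in the
% other, keeping the timestamps along dimension 3. Subtracting the two
% copies expands to an n-by-n-by-T array of pairwise differences.
diffs = permute(timeseries, [1 3 2]) - permute(timeseries, [3 1 2]);

% Euclidean distance: square root of the summed squared differences
% over the timestamp dimension, giving an n-by-n matrix.
distance_manual = sqrt(sum(diffs.^2, 3));
```

Entry (i, j) of distance_manual is the distance between time series i and time series j, so the diagonal is zero and the matrix is symmetric.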
From the distance matrix above, the unique distances can be found with the code below:
unique_distances = unique(distance);
I want to create an n-by-m matrix, where n is the number of time series (i.e. 7) and m is the number of unique distances (i.e. 8).
In that matrix, t1, t2, ... represent time series 1, 2 and so on.
The entry in the first row and first column shows how many time series have zero distance to the first time series, and so on.
The entry in the first row and second column shows how many time series have a distance of 1 to the first time series, and so on.
I am new to MATLAB. I got the desired result using the code below:
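% Preallocate: one row per time series, one column per unique distance
dist = nan(size(timeseries, 1), numel(unique_distances));
for i = 1:size(timeseries, 1)
    for j = 1:numel(unique_distances)
        % Count the series whose distance to series i equals the j-th unique distance
        dist(i, j) = sum(distance(i, :) == unique_distances(j));
    end
end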
I am looking for a vectorised approach for above code.
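One possible vectorised sketch of the same counting, assuming R2016b or later (implicit expansion) and the Statistics and Machine Learning Toolbox for pdist and squareform. The intermediate name matches is only illustrative. It is meant as a starting point, not as the definitive way to do this.

```matlab
timeseries = [1, 2, 3, 4; 1, 2, 3, 4; 1, 2, 3, 4; 4, 5, 6, 7; 4, 5, 6, 8; 4, 5, 6, 9; 4, 5, 6, 10];
distance = squareform(pdist(timeseries));   % n-by-n pairwise distances
unique_distances = unique(distance);        % sorted column vector of m values

% Place the unique values along the third dimension so that the
% comparison expands to an n-by-n-by-m logical array:
% matches(i, k, j) is true when distance(i, k) equals unique_distances(j).
matches = distance == reshape(unique_distances, 1, 1, []);

% Count the matches along each row (dimension 2), then move the third
% dimension into the columns to get the n-by-m count matrix.
dist = permute(sum(matches, 2), [1 3 2]);
```

The intermediate array holds n*n*m logical values, so for a large number of time series the loop (or an accumarray-based count) may use less memory. Like the loop, this relies on exact floating-point equality, which is safe here only because unique_distances is taken from the same matrix.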
I also need to cluster around the time series that has zero distance to the largest number of other time series, so the matrix has to be sorted by that count as well. In this example it is already sorted: t1 has a distance of zero to 3 time series, as can be seen from the matrix, and 3 is also the maximum value.
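The sorting step could be sketched as follows, assuming the same data and toolbox as above. The names zero_counts, order, dist_sorted and timeseries_sorted are illustrative. Since unique returns its values in ascending order and the diagonal of the distance matrix is zero, the zero-distance counts are the first column of the count matrix; here they are computed directly from distance to keep the example short.

```matlab
timeseries = [1, 2, 3, 4; 1, 2, 3, 4; 1, 2, 3, 4; 4, 5, 6, 7; 4, 5, 6, 8; 4, 5, 6, 9; 4, 5, 6, 10];
distance = squareform(pdist(timeseries));
unique_distances = unique(distance);

% n-by-m count matrix, as in the question
dist = permute(sum(distance == reshape(unique_distances, 1, 1, []), 2), [1 3 2]);

% Number of series at zero distance from each series (itself included)
zero_counts = sum(distance == 0, 2);

% Row order that puts the largest zero-distance count first
[~, order] = sort(zero_counts, 'descend');

dist_sorted = dist(order, :);               % count matrix in the new order
timeseries_sorted = timeseries(order, :);   % the data in the same order
```

Keeping order around makes it possible to map each sorted row back to its original time series.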
  2 Comments
Ameer Hamza on 26 Dec 2020
You mentioned, "I want to create a n (number of time series i.e 4) by m (number of unique distances i.e. 8)." But the matrix you create has seven rows. Are the time series arranged along the columns or the rows? I think you intend to take the transpose of the matrix before passing it to pdist():
distance = squareform(pdist(timeseries.'));
Furqan Hashim on 27 Dec 2020
You've correctly pointed out the mistake. I've edited my question and rephrased
"Here we have 4 timeseries where each column represents a timeseries and each row represents the timestamp."
to
"Here we have 7 timeseries where each row represents a timeseries and each column represents the timestamp."
Now we do not need to take the transpose. For simplicity, we can consider each row, rather than each column, to be a time series.


Answers (0)

Release

R2017b
