MATLAB Answers

Martin
0

Can this for-loop code get faster in some way?

Asked by Martin
on 5 Oct 2019
Latest activity Commented on by darova
on 6 Oct 2019
I got a big result table named resTbl.
There I need for each row to grab a timestamp (posix time) and construct a period interval vector (I use a constant, ts_length, to construct this).
Then I need to find all those in period_interval that are represented in the variable data (column 2), and take the sum of those in data (column 7).
The below code works as I want it to:
for i = 1 : size(resTbl ,1)
period_interval = resTbl(i,2) : 60000 : resTbl(i,2) + ts_length;
[hd, he] = ismember(period_interval,data(:,2));
resTbl(i,10) = sum(data(he(hd),7));
end
The problem is that it is slow since resTbl has many rows. Does anyone have a suggestion how to make it faster?

  1 Comment

Sign in to comment.

1 Answer

Answer by darova
on 5 Oct 2019
Edited by darova
on 5 Oct 2019

Maybe ismember function can be replaced:
dt = 60000;
period_interval = 0 : dt : ts_length;
n = length(period_interval);
for i = 1 : size(resTbl ,1)
cond = ~mod( data(:,2)-resTbl(i,2),dt ); % multiple by 60 000
mult = (data(:,2)-resTbl(i,2))/dt;
ind = (0 <= mult & mult <= n) & cond; % (0 <= multiplier <= n) and multiple by 60 000
% [hd, he] = ismember(period_interval,data(:,2));
resTbl(i,10) = sum(data(ind,7));
end

  4 Comments

Show 1 older comment
What about this?
dt = 60000;
period_interval = 0 : dt : ts_length;
n = length(period_interval);
[D,R] = ndgrid(data(:,2),resTbl(:,2)); % 2D matrix
cond = ~mod( D-R,dt ); % multiple by 60 000
mult = (D-R)/dt; % multiplier
ind = (0 <= mult & mult <= n) & cond; % (0 <= multiplier <= n) and multiple by 60 000
ndata = ind .* repmat( data(:,7), [1 size(resTbl,1)] );
resTbl(:,10) = sum(ndata)';
Pretty brilliant solution I really have to admit!
It is faster but unfortunately it is however also pretty time consuming and VERY heavy on the memory due to the matrices!
With my current memory, 32 GB, I can not run the full resTbl set (3400 rows) on the dataset which is like 2 million rows and 7 columns.
I tried with smaller resTbl and data set, and for some reason I had to alter n to this: n = length(period_interval)-1; to match my own results.
I am not even sure if there exist a better solution to this problem than yours. I gladly hear from you again, but otherwise I say thank you very much!
2 million rows and 7 columns
Maybe time is a price in this case

Sign in to comment.