## Can this for-loop code get faster in some way?

### Martin (view profile)

on 5 Oct 2019
Latest activity Commented on by darova

### darova (view profile)

on 6 Oct 2019
I got a big result table named resTbl.
There I need for each row to grab a timestamp (posix time) and construct a period interval vector (I use a constant, ts_length, to construct this).
Then I need to find all those in period_interval that are represented in the variable data (column 2), and take the sum of those in data (column 7).
The below code works as I want it to:
for i = 1 : size(resTbl ,1)
period_interval = resTbl(i,2) : 60000 : resTbl(i,2) + ts_length;
[hd, he] = ismember(period_interval,data(:,2));
resTbl(i,10) = sum(data(he(hd),7));
end
The problem is that it is slow since resTbl has many rows. Does anyone have a suggestion how to make it faster?

per isakson

on 5 Oct 2019

on 5 Oct 2019
Edited by darova

### darova (view profile)

on 5 Oct 2019

Maybe ismember function can be replaced:
dt = 60000;
period_interval = 0 : dt : ts_length;
n = length(period_interval);
for i = 1 : size(resTbl ,1)
cond = ~mod( data(:,2)-resTbl(i,2),dt ); % multiple by 60 000
mult = (data(:,2)-resTbl(i,2))/dt;
ind = (0 <= mult & mult <= n) & cond; % (0 <= multiplier <= n) and multiple by 60 000
% [hd, he] = ismember(period_interval,data(:,2));
resTbl(i,10) = sum(data(ind,7));
end

Show 1 older comment
darova

### darova (view profile)

on 5 Oct 2019
dt = 60000;
period_interval = 0 : dt : ts_length;
n = length(period_interval);
[D,R] = ndgrid(data(:,2),resTbl(:,2)); % 2D matrix
cond = ~mod( D-R,dt ); % multiple by 60 000
mult = (D-R)/dt; % multiplier
ind = (0 <= mult & mult <= n) & cond; % (0 <= multiplier <= n) and multiple by 60 000
ndata = ind .* repmat( data(:,7), [1 size(resTbl,1)] );
resTbl(:,10) = sum(ndata)';
Martin

### Martin (view profile)

on 6 Oct 2019
Pretty brilliant solution I really have to admit!
It is faster but unfortunately it is however also pretty time consuming and VERY heavy on the memory due to the matrices!
With my current memory, 32 GB, I can not run the full resTbl set (3400 rows) on the dataset which is like 2 million rows and 7 columns.
I tried with smaller resTbl and data set, and for some reason I had to alter n to this: n = length(period_interval)-1; to match my own results.
I am not even sure if there exist a better solution to this problem than yours. I gladly hear from you again, but otherwise I say thank you very much!
darova

### darova (view profile)

on 6 Oct 2019
2 million rows and 7 columns
Maybe time is a price in this case