MATLAB Answers

vectorize a for loop

5 views (last 30 days)
Sam
Sam on 26 Oct 2012
I am finding a optimal way of vectorizing the following piece of code (of course the real data is much greater than this):
start_peak_hour = [3 10 20 50];
end_peak_hour = [5 14 27 70]; % always same size as start_peak_hour
n_peak = numel(start_peak_hour);
isPeak = zeros(1,80);
for j=1:n_peak,
isPeak(start_peak_hour(j):end_peak_hour(j)) = 1;
end
Your two cents are appreciated
-Sam
  1 Comment
José-Luis
José-Luis on 26 Oct 2012
Loops are not always evil, and in this case probably even faster than the alternatives.

Sign in to comment.

Accepted Answer

Sean de Wolski
Sean de Wolski on 26 Oct 2012
Edited: Sean de Wolski on 26 Oct 2012
I highly doubt you'll find anything faster than that well-written for-loop. The JIT will pick that up and have wonders with it.
eval is evil and will be orders of magnitude slower and hideous/hard to follow.
start_peak_hour = [3 10 20 50];
end_peak_hour = [5 14 27 70]; % always same size as start_peak_hour
n_peak = numel(start_peak_hour);
t1 = 0;
t2 = 0;
for tt = 1:50 %50x
tic
isPeak = zeros(1,80);
for j=1:n_peak,
isPeak(start_peak_hour(j):end_peak_hour(j)) = 1;
end
t1 = t1+toc;
tic
eval(sprintf('isPeak(start_peak_hour(%d):end_peak_hour(%d)) = 1,',[1:n_peak;1:n_peak]))
t2=t2+toc;
end
[t1 t2] % 0.0010 0.0681
t1/t2
On my system the FOR-loop is 70x faster (for this small problem), it'll get better with bigger ones. And it doesn't use eval or any other impossible to understand syntax.
My $0.05 - a full nickel :) .
  6 Comments
Jan
Jan on 27 Oct 2012
@Azzi: I prefer to write fast code for all cases. A useful piece of code is used in larger programs also and then the total speed is limited by the most time consuming parts. A not useful piece of code is deleted soon.
A severe problem is the dependency to the problem size: while implementing a quicksort seems to be efficient, it is a bad idea for lists of less than 10 elements. Calculating a SUM using multiple threads and all cores is reasonable for large arrays, it is a waste of time for tiny arrays.
So I respond: We always need "speed", but the "speed" of an algorithm is not well defined.

Sign in to comment.

More Answers (5)

Matt J
Matt J on 26 Oct 2012
Edited: Matt J on 26 Oct 2012
Here's a way that avoid both EVAL and for-loops
isPeak = sparse(80,1,sum(end_peak_hour-start_peak_hour+1));
ispeak(start_peak_hour)=1;
ispeak(end_peak_hour+1)=-1;
ispeak=cumsum(ispeak);
  8 Comments
Azzi Abdelmalek
Azzi Abdelmalek on 26 Oct 2012
Yes, my tes is'nt adequate, like you said, I must make a test in a loop.

Sign in to comment.


Azzi Abdelmalek
Azzi Abdelmalek on 26 Oct 2012
eval(sprintf('isPeak(start_peak_hour(%d):end_peak_hour(%d)) = 1,',[1:n_peak;1:n_peak]))
  3 Comments
Jan
Jan on 26 Oct 2012
And you should avoid EVAL...

Sign in to comment.


Jan
Jan on 26 Oct 2012
Edited: Jan on 27 Oct 2012
[EDITED] See Matt J's answer, which I had overseen during typing.
start_peak_hour = [3 10 20 50];
end_peak_hour = [5 14 27 70]; % always same size as start_peak_hour
isPeak = zeros(1,80);
isPeak(start_peak_hour) = 1;
if end_peak_hour(end) == 80
end_peak_hour(end) = [];
end
isPeak(end_peak_hour + 1) = -1;
isPeak = cumsum(isPeak);
Please measure the timings with your original data set.
[EDITED] A new approach:
[EDITED 2]: Bugs fixed, now it compiles:
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[],
int nrhs, const mxArray *prhs[]) {
mwSize i, n, len;
double *ini, *fin, *r, *q, *qf;
ini = mxGetPr(prhs[0]);
fin = mxGetPr(prhs[1]);
n = mxGetNumberOfElements(prhs[0]);
len = (mwSize) mxGetScalar(prhs[2]);
plhs[0] = mxCreateDoubleMatrix(1, len, mxREAL);
r = mxGetPr(plhs[0]) - 1;
for (i = 0; i < n; i++) {
q = r + (mwSize) ini[i];
qf = r + (mwSize) fin[i];
while(q <= qf) {
*q++ = 1.0;
}
}
return;
}
In a real-world program checks of the inputs are OBLIGATORY.
Compile it, call it as:
myPeakFiller(start_peak_hour,end_peak_hour,80);
  3 Comments
Matt J
Matt J on 26 Oct 2012
At least I've considered the last element...
True!

Sign in to comment.


Chris A
Chris A on 26 Oct 2012
start_peak_hour = [3 10 20 50]';
end_peak_hour = [5 14 27 70]'; % always same size as start_peak_hour
peak_size = end_peak_hour - start_peak_hour + 1;
ncols=max(peak_size);
nrows=numel(peak_size);
tmpmat=kron(start_peak_hour,ones(1,ncols))+(cumsum(ones(nrows, ncols), 2)-1),
lbmat=kron(start_peak_hour-1, ones(1,ncols));
ubmat=kron(end_peak_hour+1, ones(1,ncols));
u=tmpmat((tmpmat>lbmat)&(tmpmat<ubmat));
isPeak = zeros(1,80);
isPeak(u) = 1;
horzcat((1:80)',isPeak'),

Jan
Jan on 27 Oct 2012
Edited: Jan on 27 Oct 2012
List of solutions:
function isPeak = Loop_(a, b, len)
isPeak = zeros(1, len);
for j = 1:length(a)
isPeak(start_peak_hour(j):end_peak_hour(j)) = 1;
end
%
function isPeak = CumSum_(a, b, len)
isPeak = zeros(1,80);
isPeak(a) = 1;
if b(end) == 80
b(end) = [];
end
isPeak(b + 1) = -1;
isPeak = cumsum(isPeak);
%
function isPeak = Array_(a, b, len)
isPeak(cell2mat(arrayfun(@(x,y) x:y, a, b, 'un', 0))) = 1;
%
function isPeak = Eval_(a, b, len)
eval(sprintf('isPeak(a(%d):b(%d)) = 1;' , [1:n_peak;1:n_peak]));
%
function isPeak = CumSum2_(a, b, len)
e = ones(1, n_peak);
nzmax = sum(b - a + 1);
isPeak = cumsum(sparse([a, b+1],1, [e, -e], len, 1, nzmax));
%
% And the C-Mex I've posted before
A larger test data set:
start_peak_hour = [3 10 20 50];
end_peak_hour = [5 14 27 70];
a = bsxfun(@plus, start_peak_hour', 0:100:100000);
a = a(:)';
b = bsxfun(@plus, end_peak_hour', 0:100:100000);
b = b(:)';
len = b(end) + 10;
Timings:
tic; for k = 1:1000, isPeak = FUNC(a, b, len); end; toc
Loop_: 4.75 sec
CumSum_: 1.19 sec
Array_: 37.08 sec
Eval_: 1224.83 sec
CumSum2_: 4.09 sec
Mex: 0.42 sec

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!