# How can I speed up these for loops?

25 views (last 30 days)
Timothy Plummer on 1 Jun 2017
Edited: Kelly Kearney on 14 Jun 2017
I have a group of 3 nested for loops that will calculate how much light is hitting a rectangular array of points if I know where the fixtures are located, in relation to the array of points. The results are used to determine how to best arrange a group of the same fixture in a space so there is a uniform amount of light at 5 different light levels, and at 8 different mounting heights, and it generally takes 10 to 20 calculations to find the best arrangement. At worst case scenario this is running 5*8*20 (800) times.
In the following code 'rows' and 'columns' are vectors of the same size that contains x and y coordinates of the of calculation points I need. 'xFixtureLocations' and 'yFixtureLocations' are vectors of the same size that contain the x and y coordinates of the fixtures. Finally, there is a struct 'IESdata' that contains 3 variables 'HorizAngles' (a vector of size 37), 'VertAngles' (a vector of size 37), and, 'photoTable' (an array of size m x n) that describes the intensity of the light in the steradian centered on the 'HorizAngles(m)' and 'VertAngles(n)' for the fixture in the calculation.
As of right now, I know that the calculation that is generated by this system is correct because I have compared it to programs that have been verified by the industry, namely AGI32. I have tried to parallelize the code by making the first loop a 'parfor' but I did not see a decent of improvement in how long it takes to run.
Irr = zeros(length(rows),length(columns),length(xFixtureLocations));
for i1 = 1:length(rows)
for i2 = 1:length(columns)
for i3 = 1:length(xFixtureLocations)
x = rows(i1)-xFixtureLocations(i3);
y = columns(i2)-yFixtureLocations(i3);
r = sqrt(x^2 + y^2);
thetaPt = atan(r/MountHeight);
if x==0
phiPt = 0;
else
phiPt = atan2(y,x);
end
phiPt = phiPt+pi + orientation(i3);
phiPt = mod(phiPt,2*pi)-pi;
dsq = r^2+(MountHeight)^2;
Ipt = interp2(IESdata.HorizAngles-180, IESdata.VertAngles, IESdata.photoTable, ...
phiPt*180/pi, thetaPt*180/pi,'*nearest', 0.);
Irr(i1,i2,i3) = round((Ipt*cos(thetaPt)/dsq)*(Multiplier/1000),3);
end
end
end
Irr = sum(Irr,3);
Avg = mean2(Irr);
Max = max(max(Irr));
Min = min(min(Irr));

#### 1 Comment

dpb on 1 Jun 2017
This would likely be easier to visualize with a (small) sample dataset to accompany it and I don't have time at the moment to pore over it in enough detail to really work out the details, but...
It strikes me you could generate the [X,Y] location arrays for the two sets of points (calculational and fixtures, respectively) and then use pdist2 to return the distance and then just do the calculation on that result. Seems as though could basically remove all the looping...
As said, a small sample case and code that could run in its entirety would surely help folks to do something without having to develop a baseline from which to start as well...

Kelly Kearney on 1 Jun 2017
You're calling interp2 800 times with a single point each. Just moving that outside the loop should be able to speed things up considerably:
[phiPtall, thetaPtall] = deal(zeros(length(rows), length(columns), length(xFixtureLocations)));
for i1 = 1:length(rows)
for i2 = 1:length(columns)
for i3 = 1:length(xFixtureLocations)
x = rows(i1)-xFixtureLocations(i3);
y = columns(i2)-yFixtureLocations(i3);
r = sqrt(x^2 + y^2);
thetaPt = atan(r/MountHeight);
if x==0
phiPt = 0;
else
phiPt = atan2(y,x);
end
phiPt = phiPt+pi + orientation(i3);
phiPt = mod(phiPt,2*pi)-pi;
dsq = r^2+(MountHeight)^2;
phiPtall(i1,i2,i3) = phiPt*180/pi;
thetaPtall(i1,i2,i3) = thetaPt*180/pi;
end
end
end
Ipt = interp2(IESdata.HorizAngles-180, IESdata.VertAngles, IESdata.photoTable, ...
phiPtall, thetaPtall,'*nearest', 0.);
Irr = round((Ipt.*cos(thetaPtall)./dsq).*(Multiplier./1000),3);
Irr = sum(Irr,3);
Avg = mean2(Irr);
Max = max(max(Irr));
Min = min(min(Irr));
As dpb mentioned, the calculations of the angles also look like they can be vectorized pretty well, but that would be more easily tested with some sample data.

Steven Lord on 5 Jun 2017
To compute the cosine of an angle in degrees, consider using cosd instead of converting using deg2rad yourself.
Timothy Plummer on 12 Jun 2017
So after iterating through this solution a lot, and expanding out the size of the x and y locations by almost eight-fold, I am back at the program taking more than 2 hours to full program to compute. If there is anything that can be done to make this loop even better that would be great.
I have attached some data that is generated before one attempted calculation and the results of that calculation.
Kelly Kearney on 14 Jun 2017
I think your best bet would be to do away with the loop altogether. That may depend on the size of your data, though... the one-loop solution may make more sense if you're dealing with large arrays. I tested your 3-loop option against a fully vectorized option and a 1-loop option:
rows = A.rows;
columns = A.columns;
xFixtureLocations = A.xFixtureLocations;
yFixtureLocations = A.yFixtureLocations;
IESdata = A.IESdata;
% Some counters
nr = length(rows);
nc = length(columns);
nfix = length(xFixtureLocations);
[m,n] = size(IESdata.photoTable);
orientation = zeros(1, nfix);
Multiplier = 1;
MountHeight = 2;
% 3 loops
tic;
for ii = 1:500
phiPt = zeros(nr,nc,nfix);
thetaPt = zeros(nr,nc,nfix);
dsq = zeros(nr,nc,nfix);
for i1 = 1:length(rows)
for i2 = 1:length(columns)
for i3 = 1:length(xFixtureLocations)
x = rows(i1)-xFixtureLocations(i3);
y = columns(i2)-yFixtureLocations(i3);
r = sqrt(x^2 + y^2);
thetaPt(i1,i2,i3) = atan(r/MountHeight);
if x==0
phiPt(i1,i2,i3) = 0;
else
phiPt(i1,i2,i3) = atan2(y,x);
end
phiPt(i1,i2,i3) = phiPt(i1,i2,i3) + pi + orientation(i3);
phiPt(i1,i2,i3) = mod(phiPt(i1,i2,i3),2*pi)-pi;
dsq(i1,i2,i3) = r^2+(MountHeight)^2;
end
end
end
Ipt = interp2(IESdata.HorizAngles-180, IESdata.VertAngles, IESdata.photoTable, ...
Irr{1} = round((Ipt.*cos(thetaPt)./dsq).*(Multiplier./1000),3);
Out(1) = struct('phiPt', phiPt, 'thetaPt', thetaPt, 'dsq', dsq, 'Irr', Irr);
end
t(1) = toc;
% No loops
tic
for ii = 1:500
[xg, yg] = ndgrid(rows, columns);
x = bsxfun(@minus, xg, permute(xFixtureLocations, [1 3 2]));
y = bsxfun(@minus, yg, permute(yFixtureLocations, [1 3 2]));
r = sqrt(x.^2 + y.^2);
thetaPt = atan(r./MountHeight);
phiPt = zeros(size(y));
phiPt(x~=0) = atan2(y(x~=0), x(x~=0));
phiPt = bsxfun(@plus, phiPt + pi, permute(orientation, [1 3 2]));
phiPt = mod(phiPt, 2*pi) - pi;
dsq = r.^2 + MountHeight.^2;
Ipt = interp2(IESdata.HorizAngles-180, IESdata.VertAngles, IESdata.photoTable, ...
Irr = round((Ipt.*cos(thetaPt)./dsq).*(Multiplier./1000),3);
Out(2) = struct('phiPt', phiPt, 'thetaPt', thetaPt, 'dsq', dsq, 'Irr', Irr);
end
t(2) = toc;
% One loop
for ii = 1:500
[xg, yg] = ndgrid(rows, columns);
phiPt = zeros(nr,nc,nfix);
thetaPt = zeros(nr,nc,nfix);
dsq = zeros(nr,nc,nfix);
for ifix = 1:nfix
x = xg - xFixtureLocations(ifix);
y = yg - yFixtureLocations(ifix);
r = sqrt(x.^2 + y.^2);
thetaPt(:,:,ifix) = atan(r./MountHeight);
tmp = zeros(size(y));
tmp(x~=0) = atan2(y(x~=0),x(x~=0));
phiPt(:,:,ifix) = tmp;
phiPt(:,:,ifix) = phiPt(:,:,ifix) + pi + orientation(ifix);
dsq(:,:,ifix) = r.^2 + MountHeight.^2;
end
phiPt = mod(phiPt, 2*pi) - pi;
Ipt = interp2(IESdata.HorizAngles-180, IESdata.VertAngles, IESdata.photoTable, ...
Irr = round((Ipt.*cos(thetaPt)./dsq).*(Multiplier./1000),3);
Out(3) = struct('phiPt', phiPt, 'thetaPt', thetaPt, 'dsq', dsq, 'Irr', Irr);
end
t(3) = toc;
fprintf('3 loops: %f\n0 loops: %f\n1 loop : %f\n', t);
(The 500-times loop is just for timing purposes).
These were the results on my computer:
3 loops: 0.514277
0 loops: 0.351616
1 loop : 0.852183