From: <HIDDEN>
Newsgroups: comp.soft-sys.matlab
Subject: Re: Vectorizing code
Date: Mon, 25 Oct 2010 14:14:04 +0000 (UTC)
Organization: The MathWorks, Inc.
Lines: 61
Message-ID: <ia43bc$9d5$>
References: <ia3f8a$od2$>
Reply-To: <HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Trace: 1288016044 9637 (25 Oct 2010 14:14:04 GMT)
NNTP-Posting-Date: Mon, 25 Oct 2010 14:14:04 +0000 (UTC)
X-Newsreader: MATLAB Central Newsreader 1187260
Xref: comp.soft-sys.matlab:681114

"Sven " <> wrote in message <ia3f8a$od2$>...
> Dear all,
> I am struggling a bit with a piece of my code. This actual piece seems to be the bottleneck in my code (according to profiler). This is mostly due to the for/if loops.
> I was wondering if it is possible to vectorize some part (or completely) of it.
> The code:
> for i=1:ngrids-1
>         sumk = zeros(ngrids,1);
>         for k = 1:i
>             Vdif = Vhalf(i+1)-V(k);
>             if Vdif < Vhalf(2)
>                 Vdif = Vhalf(2);
>             end
>             for j=2:length(Vhalf)
>                 if Vdif <= Vhalf(j) && Vdif > Vhalf(j-1)
>                 alpha=j;
>                 end
>             end
>             int1 = zeros(ngrids,1);
>             for j = alpha:ngrids
>                 int1(j)=f(j)/V(i)*(Vhalf(j+1)-Vhalf(j))*agglom((i*2),(k*2));
>             end
>                 int2=f(alpha-1)/(Vhalf(alpha)+Vhalf(i+1)-V(k))*(Vhalf(alpha)-Vhalf(i+1)-V(k))*agglom((((2*alpha)-1)+((2*i)+1)-(k*2))/2);
>                 sumk (k) = (sum(int1)+int2)*(Vhalf(k+1)-Vhalf(k))*f(k);
>         end
>         Fphalf (i) = sum(sumk);
> end
> If anyone of you could give me some pointers it would be much appreciated.
- - - - - - - - - - -
    The two innermost for-loops involving j are presumably crucial to the efficiency of your algorithm.  I see a few ways these could be improved.

  As for the first of these, the way it is written you are seeking the last index j for which the given inequalities hold.  You should therefore be going backward with j starting from j = length(Vhalf) and moving downwards towards j = 2 and seeking the first j for which the inequalities hold and then doing a 'break'.  You can economize further by seeking (while still moving backwards) the first j for which Vdif <= Vhalf(j) and breaking.  Then continuing with that j, seek the first j for which Vdif > Vhalf(j-1) and again breaking.  You will have arrived at alpha.  This should cut down the expected length of array you have to cover and the number of inequalities that need calculating while doing so.

  The second j-for-loop can be greatly improved.  In the subsequent code you never use int1 for any other purpose than taking its sum.  To avoid adding up the same quantities in this for-loop over and over again, you can establish a cumulative sum of the quantities f(j)*(Vhalf(j+1)-Vhalf(j)) but working backwards from ngrids.  That is,

 c = -cumsum(f(ngrids:-1:2).*diff(Vhalf(ngrids+1:-1:2));

can be computed just once before ever entering into any of the for-loops.  Then instead of this second for-loop you would just write

 sumint1 = c(ngrids+1-alpha)*agglom((i*2),(k*2))/V(i);
which you would use in place of the sum(int1) term in computing sumk(k).  There would be no need to allocate space for an int1 array.

  A third improvement might be to avoid the repeated sumk allocation.  Just keep a running sum of sumk which you can place in Fphalf(i) when you exit from the "for k = 1:i" loop.

  Your calculation of Vdif can be simplified to

 Vdif = max(Vhalf(i+1)-V(k),Vhalf2);

though this may be only a saving in the number of lines of code.

  Anyway this is a starter for you.  Perhaps you will be inspired to find many other ways of improving this code.  Note that it isn't necessarily the for-loops that are the source of the excessive computation time but what you do within these loops.  Vectorization per se does not necessarily save computation time.  It is the algorithm that counts.

Roger Stafford