Path: news.mathworks.com!not-for-mail
From: "helper " <spamless@nospam.com>
Newsgroups: comp.soft-sys.matlab
Subject: Re: newbie seeks vectorization help
Date: Fri, 9 May 2008 20:09:03 +0000 (UTC)
Organization: Timothy S. Farajian, Inc.
Lines: 176
Message-ID: <g02b0v$o9v$1@fred.mathworks.com>
References: <g000e3$eoi$1@fred.mathworks.com> <g01as9$pf6$1@fred.mathworks.com> <g01rcb$28h$1@fred.mathworks.com>
Reply-To: "helper " <spamless@nospam.com>
NNTP-Posting-Host: webapp-05-blr.mathworks.com
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 8bit
X-Trace: fred.mathworks.com 1210363743 24895 172.30.248.35 (9 May 2008 20:09:03 GMT)
X-Complaints-To: news@mathworks.com
NNTP-Posting-Date: Fri, 9 May 2008 20:09:03 +0000 (UTC)
X-Newsreader: MATLAB Central Newsreader 1272923
Xref: news.mathworks.com comp.soft-sys.matlab:467637


"Sky Pelletier" <skytoddk@remove14chars.vet.upenn.edu> 
wrote in message <g01rcb$28h$1@fred.mathworks.com>...
> Thank you for your quick feedback!  I greatly appreciate 
it.
> 
> It seems like I should be able to find a clever way to 
index
> directly at the cost of using more memory, but I just 
can't
> wrap my head around it.  I don't mind too much trading off
> memory for speed; this loop occurs inside an application
> that ultimately will be run several billion times using
> different input parameters, so speed is a priority--but 
you
> think this might actually be the fastest way?  
> 
> Is there any way of a priori identifying when 
vectorization
> is NOT a good idea?
> 
> Thanks again for your help...
> 
> ..Sky
> 

There is no definite a priori method of identifying when 
vectorization will be better than FOR-loops.  However, 
because of the speed of modern day processors, memory 
allocation can often be the primary source of performance 
problems (not arithmetic operations).  Therefore, if you 
have to allocate a lot of new memory just to vectorize your 
code, then it is generally not worth it.

Ok, you asked for it:

totalFactor = reshape(sum(bsxfun(@(X,Y)K(X,Y),kron(I,ones
(1,N)),S(:).'),1),N,M);

Much slower and MUCH less elegant.

To help you wrap your head around what you are doing and 
how I vectorized it:

The portion of your code:

 K( I(:, simnumber), S(:,simnumber))

which, if we define:

I = [
 2 5
 4 6
 6 7];

S = [
 1 7
 3 8
 5 9];

For simnumber=1, 

 K([2;4;6], [1;3;5])

returns a submatrix of the 2,4,6 rows and the 1,3,5 
columns.  Then, the command:

 sum(K([2;4;6], [1;3;5]), 1) % simnumber=1

gives you the sum of of the 2,4,6th elements in each of the 
1,3,5 columns.  This syntax works only because for each of 
the 1,3,5 columns, you want the same rows.

However, if we consider another another submatrix of data:

  sum(K([7;8;9], [5;6;7]),1) % simnumber=2

and we want to perform this operation for both simnumber=1 
and sumnumber=2 with a single subscripting operation, we 
are out of luck.  Subscripting only returns submatrices of 
data.  The simnumber=1 command requests a submatrix, and 
the simnumber=2 command requests a submatrix, but the two 
combined are no longer a submatrix. 

So now what you want are the following sums:

row(I2)  column(S2)  
 2 4 6      1
 2 4 6      3
 2 4 6      5
 7 8 9      5
 7 8 9      6
 7 8 9      7

These matrices are created using:

I2 = kron(I,ones(1,3)).'
S2 = S(:)

One method for obtaining this scattering (non submatrix) of 
data would be to use indexing.  First, repeat the set of 
column data:

   subscripts
row(I2)    column(S2)
 2 4 6      1 1 1
 2 4 6      3 3 3
 2 4 6      5 5 5
 7 8 9      5 5 5
 7 8 9      6 6 6
 7 8 9      7 7 7

S2 = repmat(S2,1,3)

Then, convert each row,column pair to an index using 
SUB2IND:

  index = sub2ind(size(K), I2, S2);

Assuming K is 10-by-10, we get

        index
     2     4     6
    22    24    26
    42    44    46
    65    66    67
    75    76    77
    85    86    87


we can now index into K using:

 vals = K(index)

and the equivalent summing command would be

 sums = sum(K(index), 2)

To now RESHAPE these sums back to a matrix where they 
correspond to what your original command:

  totalfactor(:,simnumber) = sum( K( I(:, simnumber), S
(:,simnumber)), 1);

gave, we use:

  totalfactor = reshape(sums,3, 2);

Summarizing the code and substituting N=3, M=2:

I2 = kron(I,ones(1,N)).';
S2 = S(:);
S2 = repmat(S2,1,N);
index = sub2ind(size(K), I2, S2);
sums = sum(K(index), 2)
totalfactor = reshape(sums,N,M);

we get a equivalent vectorized operation.  We can even 
combine all of this into one behemoth of a line of code.  
Note, however, that this is less elegant and if you test it 
with tic/toc, you will see it is MUCH slower than your FOR-
loop.

This is not the same method I gave at the start.  The first 
set of code I gave is an attempt to avoid having to using 
REPMAT, and instead using BSXFUN, however we still do not 
compare in performance to your FOR-loop.

Something useful to keep in mind:

Subscripting "K(row,col)" only returns submatrices of K.  
If you want a nonsubmatrix (scattered elements), you must 
use indexing "K(index)".

I hope this helps.