I am brand-spanking new to vectorizing code. Forgive my
ignorance.
N indicates number of nodes.
M indicates number of simulations.
K = an N x N matrix of pre-calculated between-node
interaction values for this application.
I have two N x M logical matrices, S and I, where a 1 in
position (n,m) indicates that in simulation m, node n is
active in set S or I as appropriate. The goal is to
generate an N x M matrix in which element (n,m) is the sum
of effect of all active nodes in I on the n'th node in S,
for simulation m.
My code is like:
totalfactor = zeros(N,M);
for simnumber = 1:M
    totalfactor(:,simnumber) = sum( K( I(:, simnumber), S(:, simnumber)), 1);
end
I KNOW there is a more elegant (and faster!) way to do this,
but I am just learning this vectorization and my brain is
having a hard time switching from the looping I'm used to.
Thank you for your help! I'm not used to feeling so ignorant!
I suspect that using your FOR-loop may in fact be the
fastest and most elegant (and most memory efficient) way of
doing this.
This can often be the case with MATLAB.
Thank you for your quick feedback! I greatly appreciate it.
It seems like I should be able to find a clever way to index
directly at the cost of using more memory, but I just can't
wrap my head around it. I don't mind too much trading off
memory for speed; this loop occurs inside an application
that ultimately will be run several billion times using
different input parameters, so speed is a priority--but you
think this might actually be the fastest way?
Is there any way of a priori identifying when vectorization
is NOT a good idea?
Thanks again for your help...
..Sky
There is no definite a priori method of identifying when
vectorization will be better than FOR-loops. However,
because of the speed of modern day processors, memory
allocation can often be the primary source of performance
problems (not arithmetic operations). Therefore, if you
have to allocate a lot of new memory just to vectorize your
code, then it is generally not worth it.
To help you wrap your head around what you are doing and
how I vectorized it:
Take this portion of your code:
K( I(:, simnumber), S(:,simnumber))
and suppose we define:
I = [
2 5
4 6
6 7];
S = [
1 7
3 8
5 9];
For simnumber=1,
K([2;4;6], [1;3;5])
returns the submatrix formed by rows 2,4,6 and columns
1,3,5. Then, the command:
sum(K([2;4;6], [1;3;5]), 1) % simnumber=1
gives you the sum of the 2nd, 4th and 6th elements in each
of columns 1,3,5. This syntax works only because you want
the same rows for each of the columns 1,3,5.
However, if we consider another submatrix of data:
sum(K([5;6;7], [7;8;9]), 1) % simnumber=2
and we want to perform this operation for both simnumber=1
and simnumber=2 with a single subscripting operation, we
are out of luck. Subscripting only returns submatrices of
data. The simnumber=1 command requests a submatrix, and
the simnumber=2 command requests a submatrix, but the two
combined are no longer a submatrix.
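To make the submatrix point concrete, here is a small sketch. The 9-by-9 matrix from magic(9) is only a stand-in for K (an assumption, not data from the thread); I and S are the index matrices from the example above.

```matlab
% Stand-in interaction matrix (assumption for illustration only).
K = magic(9);

% Index matrices from the example: one column per simulation.
I = [2 5; 4 6; 6 7];
S = [1 7; 3 8; 5 9];

% One simulation at a time: each call returns a 3-by-3 submatrix.
A1 = K(I(:,1), S(:,1));   % rows 2,4,6 and columns 1,3,5
A2 = K(I(:,2), S(:,2));   % rows 5,6,7 and columns 7,8,9

% Both simulations in a single subscripting call: every listed row is
% paired with every listed column, so the result is 6-by-6.
B = K(I(:), S(:));

disp(size(A1));
disp(size(B));

% The two wanted blocks sit on the block diagonal of B; the off-diagonal
% blocks are extra elements that neither simulation asked for.
disp(isequal(B(1:3,1:3), A1));
disp(isequal(B(4:6,4:6), A2));
```

The 6-by-6 result holds 36 elements where only 18 were wanted, which is why a single K(row,col) call cannot replace the loop here.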
If we instead address the elements of K by linear index,
we get an equivalent vectorized operation. We can even
combine all of this into one behemoth of a line of code.
Note, however, that this is less elegant, and if you test it
with tic/toc, you will see it is MUCH slower than your
FOR-loop.
This is not the same method I gave at the start. The first
set of code I gave is an attempt to avoid having to use
REPMAT by using BSXFUN instead; however, it still does not
match the performance of your FOR-loop.
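The code referred to in this post is not preserved in the thread, so the following is only a sketch of what a BSXFUN-based linear-index version might look like, not the original code. It assumes, as in the example above, that I and S are P-by-M matrices of subscripts, and it uses magic(9) as a stand-in for K.

```matlab
% Stand-in data (assumptions for illustration, not from the thread).
K = magic(9);
N = size(K, 1);
I = [2 5; 4 6; 6 7];      % P-by-M row subscripts, one column per simulation
S = [1 7; 3 8; 5 9];      % P-by-M column subscripts
[P, M] = size(I);

% Reference: the FOR-loop, one submatrix per simulation.
Tloop = zeros(P, M);
for m = 1:M
    Tloop(:, m) = sum(K(I(:, m), S(:, m)), 1).';
end

% The linear index of element (r,c) in an N-by-N matrix is r + (c-1)*N.
rows = reshape(I, [P 1 M]);                   % P-by-1-by-M
cols = reshape(S, [1 P M]);                   % 1-by-P-by-M
idx  = bsxfun(@plus, rows, (cols - 1) * N);   % P-by-P-by-M

% K(idx) has the same shape as idx; summing over dim 1 adds up the rows.
Tvec = reshape(sum(K(idx), 1), [P M]);

disp(isequal(Tloop, Tvec));
```

Note that idx holds P*P*M entries, which is the extra memory the vectorized form has to allocate before any summing happens.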
Something useful to keep in mind:
Subscripting "K(row,col)" only returns submatrices of K.
If you want a nonsubmatrix (scattered elements), you must
use indexing "K(index)".
-----------
In my version of matlab I couldn't get your for-loop to work properly. I had
to write it as:
T = zeros(N,M); % (totalfactor)
for m = 1:M
T(S(:,m),m) = sum(K(I(:,m),S(:,m)),1)';
end
and even then there were cases that still gave it trouble. The problem is that
the right side vector in general has fewer elements than the left side in the
form you defined, and matlab, at least mine, doesn't know which elements go
where.
In any event, when you say, "The goal is to generate an N x M matrix in
which element (n,m) is the sum of effect of all active nodes in I on the n'th
node in S, for simulation m", I assume that you meant that if S(n,m) is false
(zero), then zero goes into the matrix at (n,m), regardless of what sum from K
is obtained. If so, the following is a possible vectorization of this, though I
am not sure it is any faster than your for-loop method.
totalfactor = S.*(K'*I);
The summation is done by the matrix product. This uses your logical arrays
as numerical arrays with 1's and 0's, so it is possible you would need to
convert them explicitly to numerical type to work properly, say with '+'. I
don't know. My own system doesn't know about logical types and so has no
trouble with that.
Roger Stafford
I must admit, I didn't quite follow her paragraph
explanation. So I chose to ignore the paragraph and just
vectorize the code.
Now that I reread, and see that S and I are logicals and
not matrices of indices (as I gave in my example) I
recognize the same issues as Roger.
His method is better, and there will be no need to convert
to double since the multiplication between logical and
double matrices will automatically convert to double.
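A quick way to check both claims (that the masked matrix product matches the loop, and that no explicit conversion is needed) is a small test like the one below. The matrices are stand-ins built from magic(5), chosen only so the example is deterministic; they are assumptions, not data from the thread.

```matlab
% Stand-in data (assumptions for illustration only).
N = 5; M = 4;
A = magic(N);
K = A;                          % N-by-N interaction values (double)
I = mod(A(:, 1:M), 2) == 1;     % N-by-M logical mask
S = mod(A(:, 2:M+1), 3) == 0;   % N-by-M logical mask

% Loop form with explicit indexing on the left-hand side.
Tloop = zeros(N, M);
for m = 1:M
    Tloop(S(:, m), m) = sum(K(I(:, m), S(:, m)), 1).';
end

% Masked matrix product: K' * I sums K(j,n) over the nodes j active in I,
% and S zeroes the entries where node n is not active in S.
Tvec = S .* (K' * I);

disp(class(I));
disp(class(Tvec));
disp(isequal(Tloop, Tvec));

% Explicit conversion gives the same answer, so it is optional here.
disp(isequal(Tvec, double(S) .* (K' * double(I))));
```

In this stand-in data the third column of S is all false, so the example also covers a simulation with no active S nodes.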
Roger,
totalfactor = S.*(K'*I);
is exactly what I was looking for! Thank you! I was
suffering from tunnel vision; having indexed into the matrix
K in an earlier version, I was focusing on how to use the
logical matrices S and I to index K, when in fact my
original idea was to use them as multiplicative masks--I
even call them S_mask and I_mask in my code! An
intermediate version had S and I as index vectors, and I
was trying so hard to figure out how to adjust that version
while still using them as indices that I missed the obvious.
And by the way, I think my loop actually doesn't work
because of how I misunderstood the matrix versus linear
indexing issue. In any event, it wasn't giving the correct
answers when I left work Friday :) I believe with a few
appropriate tweaks I should be able to get it working so I
can test which is actually faster, though I strongly suspect
the method you proposed will prove to be the winner.
Thank you so much. I owe you a coke.
..Sky
------------
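For anyone wanting to run the speed comparison Sky mentions, a minimal tic/toc harness might look like the following. The sizes N and M and the random test data are arbitrary assumptions; timings depend heavily on the MATLAB version, the machine, and the real problem sizes, so no result is claimed here.

```matlab
% Arbitrary sizes and random stand-in data (assumptions for illustration).
N = 200; M = 500;
K = rand(N);
I = rand(N, M) > 0.5;
S = rand(N, M) > 0.5;

% Loop version, with explicit indexing on the left-hand side.
tic;
Tloop = zeros(N, M);
for m = 1:M
    Tloop(S(:, m), m) = sum(K(I(:, m), S(:, m)), 1).';
end
tLoop = toc;

% Masked matrix product version.
tic;
Tvec = S .* (K' * I);
tVec = toc;

fprintf('loop: %.4f s, matrix product: %.4f s\n', tLoop, tVec);

% The two versions add the same terms in a different order, so compare
% with a tolerance rather than with isequal.
fprintf('largest difference: %g\n', max(abs(Tloop(:) - Tvec(:))));
```

A single tic/toc pair is noisy; repeating each measurement several times gives a steadier picture.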
Hello Sky Pelletier,
You are entirely welcome. To tell the truth I didn't find that solution right
away. I was trying for a for-loop along the n=1:N direction, just in case your
N was appreciably smaller than your M, but when I laid it out that way, the
total vectorization method just fell into my lap, so to speak. It was begging
to be used, and I don't know why I hadn't seen it earlier.
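The step from a loop over n to the full matrix product can be sketched as follows. This is one reading of the description above, not code from the thread, and the small matrices are stand-ins chosen only for illustration.

```matlab
% Stand-in data (assumptions for illustration only).
N = 4; M = 6;
K = magic(N);
A = magic(M);
I = mod(A(1:N, :), 2) == 1;     % N-by-M logical mask
S = mod(A(1:N, :), 3) ~= 0;     % N-by-M logical mask

% Loop over nodes instead of simulations: row n of the result is the
% n'th column of K (as a row vector) times I, masked by row n of S.
T = zeros(N, M);
for n = 1:N
    T(n, :) = S(n, :) .* (K(:, n)' * I);
end

% Stacking the row vectors K(:,n)' for n = 1..N gives K', so the whole
% loop collapses into one matrix product.
Tall = S .* (K' * I);

disp(isequal(T, Tall));
```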