Discover MakerZone

MATLAB and Simulink resources for Arduino, LEGO, and Raspberry Pi

Learn more

Discover what MATLAB® can do for your career.

Opportunities for recent engineering grads.

Apply Today

Thread Subject:
Optimizing an algorithm

Subject: Optimizing an algorithm

From: lesodk Zokla

Date: 4 Feb, 2009 20:50:18

Message: 1 of 16

Sorry guys, this is my first day on the forum and by accident i made multiple posts. Now here is the real problem:

I have an algorithm which handles really big matrices. One of the things in the loop is this:

...
tmp = D*C' ;
H = C*tmp;
...

The reason why i'm using the tmp variable, is that it is used several other places in the algorithm. C is a matrix, without any specific structure beacuse it is an input to the algorithm. D is always diagonal, and furthermore i have a vector "d" which is holding all the diagonal elements, in fact i am creating D by using diag(d). Does anyone know how to make these lines more effecient. Maybe i don't even need to define D and can make the calcualtion from the vector d, with the diagonal elements? i don't know.

Thanks alot!

I tried calculating tmp by tmp = bsxfun(@times,D,C'), but that slows down the algorithm.

Subject: Optimizing an algorithm

From: John D'Errico

Date: 4 Feb, 2009 21:21:01

Message: 2 of 16

"lesodk Zokla" <lesodk@gmail.com> wrote in message <gmcv2a$m3u$1@fred.mathworks.com>...
> Sorry guys, this is my first day on the forum and by accident i made multiple posts. Now here is the real problem:
>
> I have an algorithm which handles really big matrices. One of the things in the loop is this:
>
> ...
> tmp = D*C' ;
> H = C*tmp;
> ...
>
> The reason why i'm using the tmp variable, is that it is used several other places in the algorithm. C is a matrix, without any specific structure beacuse it is an input to the algorithm. D is always diagonal, and furthermore i have a vector "d" which is holding all the diagonal elements, in fact i am creating D by using diag(d). Does anyone know how to make these lines more effecient. Maybe i don't even need to define D and can make the calcualtion from the vector d, with the diagonal elements? i don't know.
>
> Thanks alot!
>
> I tried calculating tmp by tmp = bsxfun(@times,D,C'), but that slows down the algorithm.

Really? Why do I doubt your statement?

d = rand(1000,1);
D = diag(d);
C = rand(1000);

timeit(@() bsxfun(@times,d,C'))
ans =
     0.092805

timeit(@() D*C')
ans =
       1.0038

Perhaps you misused bsxfun, misunderstanding
what it does and why it works. But I know that
you used it improperly if you found it to slow
down your code.

John

Subject: Optimizing an algorithm

From: Matt Fig

Date: 4 Feb, 2009 21:40:19

Message: 3 of 16

I have to agree with John here. Even a one-off timing reveals the bsxfun as nearly 22 times faster.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
clear functions
clear all

d = rand(1000,1);
D = diag(d);
C = rand(1000);

tic
FF = bsxfun(@times,d,C');
toc

tic
F = D*C';
toc

all(FF(:)==F(:))
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Elapsed time is 0.016388 seconds.
Elapsed time is 0.360479 seconds.

ans =

     1



Have you tried profiling your code? Where is it slow? How large are these matrices?




LCOEOAGEYV_m M_BNEfPNK9AEA_P_II-fTO_H_CLTOU_HMHOOyAU_AA%ABT

Subject: Optimizing an algorithm

From: lesodk Zokla

Date: 4 Feb, 2009 21:46:02

Message: 4 of 16

"John D'Errico" <woodchips@rochester.rr.com> wrote in message <gmd0rt$mv9$1@fred.mathworks.com>...
> "lesodk Zokla" <lesodk@gmail.com> wrote in message <gmcv2a$m3u$1@fred.mathworks.com>...
> > Sorry guys, this is my first day on the forum and by accident i made multiple posts. Now here is the real problem:
> >
> > I have an algorithm which handles really big matrices. One of the things in the loop is this:
> >
> > ...
> > tmp = D*C' ;
> > H = C*tmp;
> > ...
> >
> > The reason why i'm using the tmp variable, is that it is used several other places in the algorithm. C is a matrix, without any specific structure beacuse it is an input to the algorithm. D is always diagonal, and furthermore i have a vector "d" which is holding all the diagonal elements, in fact i am creating D by using diag(d). Does anyone know how to make these lines more effecient. Maybe i don't even need to define D and can make the calcualtion from the vector d, with the diagonal elements? i don't know.
> >
> > Thanks alot!
> >
> > I tried calculating tmp by tmp = bsxfun(@times,D,C'), but that slows down the algorithm.
>
> Really? Why do I doubt your statement?
>
> d = rand(1000,1);
> D = diag(d);
> C = rand(1000);
>
> timeit(@() bsxfun(@times,d,C'))
> ans =
> 0.092805
>
> timeit(@() D*C')
> ans =
> 1.0038
>
> Perhaps you misused bsxfun, misunderstanding
> what it does and why it works. But I know that
> you used it improperly if you found it to slow
> down your code.
>
> John

Could i be beacuse i'm using sparse matrices?

All my matrices is really sparse. Sorry for beign stupid and not mention that.

Subject: Optimizing an algorithm

From: Matt

Date: 4 Feb, 2009 21:49:02

Message: 5 of 16


> "lesodk Zokla" <lesodk@gmail.com> wrote in message <gmcv2a$m3u$1@fred.mathworks.com>...

> > I tried calculating tmp by tmp = bsxfun(@times,D,C'), but that slows down the algorithm.

You have a typo. The expression you want is

tmp = bsxfun(@times,d,C'); %d instead of D

Subject: Optimizing an algorithm

From: Matt Fig

Date: 4 Feb, 2009 21:53:01

Message: 6 of 16

At the command line:

>>profile on
>>myfunc % <-run your code, whatever it is called.
>>profile viewer





0u[k[cWejue6e|beXeO^W[k_W]$uuac[oWfjdY;^|dbuluY^C_ueuWWWjfX

Subject: Optimizing an algorithm

From: Matt

Date: 4 Feb, 2009 21:56:02

Message: 7 of 16

"lesodk Zokla" <lesodk@gmail.com> wrote in message <gmcv2a$m3u$1@fred.mathworks.com>...
> Sorry guys, this is my first day on the forum and by accident i made multiple posts. Now here is the real problem:
>
> I have an algorithm which handles really big matrices. One of the things in the loop is this:
>
> ...
> tmp = D*C' ;
> H = C*tmp;
> ...
>
> The reason why i'm using the tmp variable, is that it is used several other places in the algorithm. C is a matrix, without any specific structure beacuse it is an input to the algorithm.

If C is one of the really big matrix and if it is a fixed input to the function, you should also be pre-computing its conj. tranpose outside the loop, i.e.

Ct=C';

and then in your loop

tmp=bsxfun(@times,d,Ct); %instead of tmp=D*C';

Subject: Optimizing an algorithm

From: Matt

Date: 4 Feb, 2009 22:08:02

Message: 8 of 16

"lesodk Zokla" <lesodk@gmail.com> wrote in message <gmd2aq$9uh$1@fred.mathworks.com>...
>
> Could i be beacuse i'm using sparse matrices?

Yes, that could definitely be the reason. Based on my tests below, your original solution would be fastest even with moderately sparse matrices. However, it would still be wise to pre-compute C' as I suggested in an earlier post.

 d = rand(1000,1);
 D = sparse(diag(d));
 C = sprand(1000,1000,0.5);
 Ct=C;
 
 tic
  tmp=D*Ct;
 toc
 %Elapsed time is 0.014170 seconds.
 
 tic
  tmp=(C*D)';
 toc
 %Elapsed time is 0.049360 seconds.
 
 tic
  tmp=bsxfun(@times,d,Ct);
 toc
 %Elapsed time is 0.082769 seconds.
 
  tic
  tmp=bsxfun(@times,C,d)';
  toc
  %Elapsed time is 0.091029 seconds.

Subject: Optimizing an algorithm

From: John D'Errico

Date: 4 Feb, 2009 22:15:04

Message: 9 of 16

"lesodk Zokla" <lesodk@gmail.com> wrote in message <gmd2aq$9uh$1@fred.mathworks.com>...
>
> Could i be beacuse i'm using sparse matrices?
>
> All my matrices is really sparse. Sorry for beign stupid and not mention that.

Geez. You think after all this time, you might have
mentioned that sooner?

Is your diagonal matrix a sparse one?

If so, then you are already probably getting the
most speed possible in matlab. If it is not sparse,
created explicitly as such, then make it so.

John

Subject: Optimizing an algorithm

From: Matt Fig

Date: 4 Feb, 2009 22:30:18

Message: 10 of 16

"Matt " <xys@whatever.com> wrote in message


After fixing your typo "Ct=C;" I tried your code. I got an error that says bsxfun must have full arguments, which make sense since it runs through loops. So I put in the appropriate line to make the args full. These are my timings.


Elapsed time is 0.028574 seconds.
Elapsed time is 0.046560 seconds.
Elapsed time is 0.041958 seconds.
Elapsed time is 0.053943 seconds.


So in your version, I wonder if the fulling is done inside bsxfun? Good call Matt.




MBEv\Lc"L\QM\|KD>>IKj\H>R6?EFLEJB*?>>\Q>L\BVIFcL\@BJ@RL>SQ\

Subject: Optimizing an algorithm

From: Matt

Date: 5 Feb, 2009 08:23:02

Message: 11 of 16

"Matt Fig" <spamanon@yahoo.com> wrote in message <gmd4tq$jg1$1@fred.mathworks.com>...
> "Matt " <xys@whatever.com> wrote in message
>
>
> After fixing your typo "Ct=C;" I tried your code. I got an error that says bsxfun must have full arguments, which make sense since it runs through loops. So I put in the appropriate line to make the args full. These are my timings.

Sorry, I don't get that error, even after fixing that typo. Perhaps bsxfun added support for sparse type in later versions?

In any case, I had another typo in the 4th test case and fixing that did make an interesting change in the timings. My new code and results are as below (I increased the matrix sizes to 3000 as well)

 n=3000;
 d = rand(n,1);
 D = sparse(diag(d));
 C = sprand(n,n,0.5);
 Ct=C';
 
 tic
  tmp=D*Ct;
 toc
 %Elapsed time is 0.147426 seconds.
 
 tic
  tmp=(C*D)';
 toc
%Elapsed time is 0.356921 seconds.
 
 tic
  tmp=bsxfun(@times,d,Ct);
 toc
 %Elapsed time is 0.504734 seconds.
 
  tic
  tmp=bsxfun(@times,C,d')'; %Formerly d' was mistyped as d
  toc
  %Elapsed time is 0.362502 seconds.


So, if the matrices were not sparse, then the second usage of bsxfun would be optimal.

I suppose the reason is obvious. When the scalar expension in bsxfun() is done row-wise instead of column-wise, the code needs to make less frequent memory accesses to the d' array.

 

Subject: Optimizing an algorithm

From: Matt

Date: 5 Feb, 2009 08:40:05

Message: 12 of 16

"Matt " <xys@whatever.com> wrote in message <gme7l6$me7$1@fred.mathworks.com>...

> I suppose the reason is obvious. When the scalar expension in bsxfun() is done row-wise instead of column-wise, the code needs to make less frequent memory accesses to the d' array.
>


Hmmm. Scratch that. I forgot to make the data non-sparse in my tests. The 3rd test case is optimal under fulls...

Subject: Optimizing an algorithm

From: Stefan

Date: 5 Feb, 2009 12:29:02

Message: 13 of 16

"lesodk Zokla" <lesodk@gmail.com> wrote in message <gmd2aq$9uh$1@fred.mathworks.com>...
> "John D'Errico" <woodchips@rochester.rr.com> wrote in message <gmd0rt$mv9$1@fred.mathworks.com>...
> > "lesodk Zokla" <lesodk@gmail.com> wrote in message <gmcv2a$m3u$1@fred.mathworks.com>...
> > > Sorry guys, this is my first day on the forum and by accident i made multiple posts. Now here is the real problem:
> > >
> > > I have an algorithm which handles really big matrices. One of the things in the loop is this:
> > >
> > > ...
> > > tmp = D*C' ;
> > > H = C*tmp;
> > > ...
> > >
> > > The reason why i'm using the tmp variable, is that it is used several other places in the algorithm. C is a matrix, without any specific structure beacuse it is an input to the algorithm. D is always diagonal, and furthermore i have a vector "d" which is holding all the diagonal elements, in fact i am creating D by using diag(d). Does anyone know how to make these lines more effecient. Maybe i don't even need to define D and can make the calcualtion from the vector d, with the diagonal elements? i don't know.
> > >
> > > Thanks alot!
> > >
> > > I tried calculating tmp by tmp = bsxfun(@times,D,C'), but that slows down the algorithm.
> >
> > Really? Why do I doubt your statement?
> >
> > d = rand(1000,1);
> > D = diag(d);
> > C = rand(1000);
> >
> > timeit(@() bsxfun(@times,d,C'))
> > ans =
> > 0.092805
> >
> > timeit(@() D*C')
> > ans =
> > 1.0038
> >
> > Perhaps you misused bsxfun, misunderstanding
> > what it does and why it works. But I know that
> > you used it improperly if you found it to slow
> > down your code.
> >
> > John
>
> Could i be beacuse i'm using sparse matrices?
>
> All my matrices is really sparse. Sorry for beign stupid and not mention that.

Definitely!

Subject: Optimizing an algorithm

From: lesodk Zokla

Date: 5 Feb, 2009 14:22:02

Message: 14 of 16

"Stefan" <nospam@yahoo.com> wrote in message <gmem2e$9ut$1@fred.mathworks.com>...
> "lesodk Zokla" <lesodk@gmail.com> wrote in message <gmd2aq$9uh$1@fred.mathworks.com>...
> > "John D'Errico" <woodchips@rochester.rr.com> wrote in message <gmd0rt$mv9$1@fred.mathworks.com>...
> > > "lesodk Zokla" <lesodk@gmail.com> wrote in message <gmcv2a$m3u$1@fred.mathworks.com>...
> > > > Sorry guys, this is my first day on the forum and by accident i made multiple posts. Now here is the real problem:
> > > >
> > > > I have an algorithm which handles really big matrices. One of the things in the loop is this:
> > > >
> > > > ...
> > > > tmp = D*C' ;
> > > > H = C*tmp;
> > > > ...
> > > >
> > > > The reason why i'm using the tmp variable, is that it is used several other places in the algorithm. C is a matrix, without any specific structure beacuse it is an input to the algorithm. D is always diagonal, and furthermore i have a vector "d" which is holding all the diagonal elements, in fact i am creating D by using diag(d). Does anyone know how to make these lines more effecient. Maybe i don't even need to define D and can make the calcualtion from the vector d, with the diagonal elements? i don't know.
> > > >
> > > > Thanks alot!
> > > >
> > > > I tried calculating tmp by tmp = bsxfun(@times,D,C'), but that slows down the algorithm.
> > >
> > > Really? Why do I doubt your statement?
> > >
> > > d = rand(1000,1);
> > > D = diag(d);
> > > C = rand(1000);
> > >
> > > timeit(@() bsxfun(@times,d,C'))
> > > ans =
> > > 0.092805
> > >
> > > timeit(@() D*C')
> > > ans =
> > > 1.0038
> > >
> > > Perhaps you misused bsxfun, misunderstanding
> > > what it does and why it works. But I know that
> > > you used it improperly if you found it to slow
> > > down your code.
> > >
> > > John
> >
> > Could i be beacuse i'm using sparse matrices?
> >
> > All my matrices is really sparse. Sorry for beign stupid and not mention that.
>
> Definitely!

Both D and A are really sparse. So should i keep the calculation D*A' ?

Subject: Optimizing an algorithm

From: Matt

Date: 5 Feb, 2009 15:17:01

Message: 15 of 16

"lesodk Zokla" <lesodk@gmail.com> wrote in message <gmesma$l9b$1@fred.mathworks.com>...

> Both D and A are really sparse. So should i keep the calculation D*A' ?

NO!! Precompute A' outside the loop, as I've said already several times.

Subject: Optimizing an algorithm

From: lesodk Zokla

Date: 5 Feb, 2009 18:48:01

Message: 16 of 16

"Matt " <xys@whatever.com> wrote in message <gmevtd$96d$1@fred.mathworks.com>...
> "lesodk Zokla" <lesodk@gmail.com> wrote in message <gmesma$l9b$1@fred.mathworks.com>...
>
> > Both D and A are really sparse. So should i keep the calculation D*A' ?
>
> NO!! Precompute A' outside the loop, as I've said already several times.

This still slows down my algorithm.
But thanks, anyway.

Tags for this Thread

What are tags?

A tag is like a keyword or category label associated with each thread. Tags make it easier for you to find threads of interest.

Anyone can tag a thread. Tags are public and visible to everyone.

Contact us