Thread Subject: Incredibly slowed down code when calling a function. How to efficientely pass part of a matrix?

Subject: Incredibly slowed down code when calling a function. How to efficientely pass part of a matrix?

From: Alberto

Date: 23 Jan, 2012 21:37:52

Message: 1 of 12

Dear all,
my code slows down when I call a function inside a for loop. Clearly it would be better to call a single function once without using a loop. I tried and it gets even slower. I am a fortran programmer but I have also been using basic matlab since 5 years. I created minimal examples of what I am doing (so you can copy and paste them in order to test the speed). I have a main program with 2 for loops. The external one is a time loop so it cannot be vectorized. The internal one can be vectorized. If I use 2-loops like in the following example

Niter =1000
Nlayers = 1000
C=rand(Nlayers,Niter);
F=zeros(Nlayers,Niter);
tic
for k=1:Niter %time iterations
    for m=1:Nlayers %space itarations
        F(m,k)=fluxesXmathworksSCALAR(C(m,k));
    end
    %TIME UPDATE
end
toc

function F = fluxesXmathworksSCALAR(C)
   F = (1-C)^3*(1-2*C)/(1+2.5*C)*C;
end

the Elapsed time is 6.143000 seconds. Instead, if I use directly the function inside the 2 loops in this way:

Niter =1000
Nlayers = 1000
C=rand(Nlayers,Niter);
F=zeros(Nlayers,Niter);
tic
for k=1:Niter %time iterations
        F(m,k)=(1-C(m,k))^3*(1-2*C(m,k))/(1+2.5*C(m,k))*C(m,k);
    %TIME UPDATE
end
toc

the elapsed time is 0.130000 seconds. 50 times faster. Up to here it is pretty obvious. If I vectorize the inner loop the code

Niter =1000
Nlayers = 1000
C=rand(Nlayers,Niter);
F=zeros(Nlayers,Niter);
tic
for k=1:Niter %time iterations
       F(:,k)=(1-C(:,k)).^3.*(1-2*C(:,k))./(1+2.5*C(:,k)).*C(:,k);
end
toc

the lapsed time is 0.072000 seconds, it is a bit faster but i would say the speed is of the same order (note that I previously allocated the matrices).
Finally, if i call the function just once and I pass the matrices I have

Niter =1000
Nlayers = 1000
C=rand(Nlayers,Niter);
F=zeros(Nlayers,Niter);
tic
for k=1:Niter %time iterations
        F=fluxesXmathworksVECT(C,F);
end
toc

function F = fluxesXmathworksVECT(C,F)
    F=(1-C).^3.*(1-2*C)./(1+2.5*C).*C;
end

and the elapsed time is 73.122000 seconds. Slower than ever!!!! I guess this is due to the fact that I pass F as input, then matlab modifies it and then writes it as output. But I would like to pass only that particular k column of F (and maybe C?) that the function needs. I checked chapter 4 of the manual http://www.mathworks.com/help/pdf_doc/matlab/matlab_prog.pdf but I did not find a part where it explains how to pass a particular column of a matrix to a function and to have it as an output. In fortran i would have simply created a subroutine fluxesXmathworksVECT(C(:,k),F(:,k)) in which I change only that column. Please any suggestion is welcome.
thanks
Alberto

Subject: Incredibly slowed down code when calling a function. How to efficientely pass part of a matrix?

From: James Tursa

Date: 23 Jan, 2012 22:05:10

Message: 2 of 12

"Alberto" wrote in message <jfkjvg$8bh$1@newscl01ah.mathworks.com>...
>
> But I would like to pass only that particular k column of F (and maybe C?) that the function needs. I checked chapter 4 of the manual http://www.mathworks.com/help/pdf_doc/matlab/matlab_prog.pdf but I did not find a part where it explains how to pass a particular column of a matrix to a function and to have it as an output. In fortran i would have simply created a subroutine fluxesXmathworksVECT(C(:,k),F(:,k)) in which I change only that column. Please any suggestion is welcome.

I haven't looked at your code in any detail but I will answer your basic question about passing columns. In short, MATLAB cannot do this without making a data copy of the column. In Fortran the syntax you show will typically result in just a base address being passed (or a descriptor with a base address) to the subroutine and then stuff gets modified in-place inside the subroutine. In MATLAB the same syntax will result in a copy of the column being made, and that *copy* is passed to the function. Upon return the result is copied back into your original array. So there will be some unwanted (at least by you) data copying going on. The only way around this is to resort to mex routines where you can get at the original variable data and modify it in-place. This work-around is only recommended for advanced users because of the potential shared data copy side effects. Bruno Luong has produced
a mex routine that gives you column access ... you can find it here:

http://www.mathworks.com/matlabcentral/fileexchange/24576-inplacearray-a-semi-pointer-package-for-matlab

But heed the cautions in this package before using it.

James Tursa

Subject: Incredibly slowed down code when calling a function. How to

From: TideMan

Date: 23 Jan, 2012 22:05:46

Message: 3 of 12

On Jan 24, 10:37 am, "Alberto " <canestre...@idra.unipd.it> wrote:
> Dear all,
> my code slows down when I call a function inside a for loop. Clearly it would be better to call a single function once without using a loop. I tried and it gets even slower. I am a fortran programmer but  I have also been using basic matlab since 5 years. I created  minimal examples of what I am doing (so you can copy and paste them in order to test the speed). I have a main program with 2 for loops. The external one is a time loop so it cannot be vectorized. The internal one can be vectorized. If I use 2-loops like in the following example
>
> Niter =1000
> Nlayers = 1000
> C=rand(Nlayers,Niter);
> F=zeros(Nlayers,Niter);
> tic
> for k=1:Niter %time iterations
>     for m=1:Nlayers %space itarations
>         F(m,k)=fluxesXmathworksSCALAR(C(m,k));
>     end
>     %TIME UPDATE
> end
> toc
>
> function F = fluxesXmathworksSCALAR(C)
>    F  = (1-C)^3*(1-2*C)/(1+2.5*C)*C;
> end
>
> the Elapsed time is 6.143000 seconds.  Instead, if I use directly the function inside the 2 loops in this way:
>
> Niter =1000
> Nlayers = 1000
> C=rand(Nlayers,Niter);
> F=zeros(Nlayers,Niter);
> tic
> for k=1:Niter %time iterations
>         F(m,k)=(1-C(m,k))^3*(1-2*C(m,k))/(1+2.5*C(m,k))*C(m,k);
>     %TIME UPDATE
> end
> toc
>
> the elapsed time is 0.130000 seconds. 50 times faster. Up to here it is pretty obvious. If I vectorize the inner loop the code
>
> Niter =1000
> Nlayers = 1000
> C=rand(Nlayers,Niter);
> F=zeros(Nlayers,Niter);
> tic
> for k=1:Niter %time iterations
>        F(:,k)=(1-C(:,k)).^3.*(1-2*C(:,k))./(1+2.5*C(:,k)).*C(:,k);
> end
> toc
>
> the lapsed time is 0.072000 seconds, it is  a bit faster but i would say the speed is of the same order (note that I previously allocated the matrices).
> Finally, if i call the function just once and I pass the matrices I have
>
> Niter =1000
> Nlayers = 1000
> C=rand(Nlayers,Niter);
> F=zeros(Nlayers,Niter);
> tic
> for k=1:Niter %time iterations
>         F=fluxesXmathworksVECT(C,F);
> end
> toc
>
> function F = fluxesXmathworksVECT(C,F)
>     F=(1-C).^3.*(1-2*C)./(1+2.5*C).*C;
> end
>
> and the elapsed time is 73.122000 seconds. Slower than ever!!!! I guess this is due to the fact that I pass F as input, then matlab modifies it and then writes it as output. But I would like to pass only that particular k column of F (and maybe C?) that the function needs. I checked chapter 4 of the manualhttp://www.mathworks.com/help/pdf_doc/matlab/matlab_prog.pdf but I did not find a part where it explains how to pass a particular column of a matrix to a function and to have it as an output. In fortran i would have simply created a subroutine fluxesXmathworksVECT(C(:,k),F(:,k)) in which I change only that column. Please any suggestion is welcome.
> thanks
> Alberto

You're making things much too complicated.
You can do it all in two lines:
C=rand(Nlayers,Niter);
F=(1-C).^3.*(1-2*C)./(1+2.5*C).*C;

Indeed, in your last example, you simply repeat the above line 1000
times!!!
Why would you want to do that?

You also have some fixation that Matlab syntax is different from
Matlab.
It is not. Indeed, I regularly convert algorithms I've developed in
Matlab into Fortran by simply copying line by line, removing the
semicolons.

Subject: Incredibly slowed down code when calling a function. How to

From: James Tursa

Date: 23 Jan, 2012 22:19:12

Message: 4 of 12

TideMan <mulgor@gmail.com> wrote in message <fd1790f7-01e1-4127-9968-7ad54ea37325@b23g2000yqn.googlegroups.com>...
>
> Indeed, I regularly convert algorithms I've developed in
> Matlab into Fortran by simply copying line by line, removing the
> semicolons.

You don't need to remove the semi-colons, since they are valid Fortran syntax.

James Tursa

Subject: Incredibly slowed down code when calling a function. How to

From: Alberto

Date: 23 Jan, 2012 22:34:11

Message: 5 of 12

thanks James Tursa. Are u saying that Matlab is really such a bad programming language? I thank you a lot for the link you provided me but it seems really too complicated.
For tideman: something i do silly things, but i am not so stupid to repeat the same line 1000 times. clearly as i said this was only a minimal example easy for you to read, and for me to check the CPU time. In my case where i wrote %TIME UPDATE there is the update of C!! so C is different each time step (each k iteration). And i have to call a function because in my code my function is something like

function F = fluxesVEC(C,w0,Csoil,k,F)
global Cgel mWINT settlFORM rhos CgelDIVrhos
if settlFORM==1
   F(:,k) = -w0*(1-C(:,k)/Csoil)^5.*C;
elseif settlFORM==2
    Phi = C(:,k)/Cgel;
   F(:,k) = -w0*(1-Phi).^mWINT.*(1-CgelDIVrhos*Phi)./(1+2.5*Phi).*C(:,k);
elseif settlFORM==3
    Phi = C(:,k)/Cgel;
   F(:,k) = -w0*(1-2*Phi).^mWINT.*(1-CgelDIVrhos*Phi)./(1+5.5*Phi).*C(:,k);
end


so i compute a different flux for each different settlFORM. I sent you guys only a MINIMAL EXAMPLE. Doing this was helping me to figure out that having the if inside the function does not slow down the computation noticeably. Any suggestion how to pass the matrix in a efficient way?100 times slower is very slow!!!

Subject: Incredibly slowed down code when calling a function. How to

From: dpb

Date: 23 Jan, 2012 22:47:17

Message: 6 of 12

On 1/23/2012 4:34 PM, Alberto wrote:
> thanks James Tursa. Are u saying that Matlab is really such a bad
> programming language? ...
>
> so i compute a different flux for each different settlFORM. I sent you
> guys only a MINIMAL EXAMPLE. Doing this was helping me to figure out
> that having the if inside the function does not slow down the
> computation noticeably. Any suggestion how to pass the matrix in a
> efficient way?100 times slower is very slow!!!

I wouldn't say Matlab is such a bad language, only that it is a rapid
_development_ tool and therefore gives up runtime performance in lieu of
saving prototyping development time vis a vis some other languages such
as Fortran.

The case of arguments not being modified in place in Matlab is one of
those design tradeoffs. If you really want the code to run quickly in
Matlab w/o mex for the choke points, the only thing that comes to mind
here w/ the constraints given would be to make the array itself GLOBAL
so to avoid the copy. Not an ideal solution from a code purist
standpoint, but might give it a go and see about the performance.

--

Subject: Incredibly slowed down code when calling a function. How to

From: Alberto

Date: 23 Jan, 2012 22:53:10

Message: 7 of 12

Sorry i the last minimal example in my first post was actually wrong. I thing that the correct sintax is:

Niter =1000
Nlayers = 1000
C=rand(Nlayers,Niter);
F=zeros(Nlayers,Niter);
tic
for k=1:Niter %time iterations

        F=fluxesXmathworksVECT(C,F,k);
    %TIME UPDATE
end
toc

function F = fluxesXmathworksVECT(C,F,k)
    F(:,k)=(1-C(:,k)).^3.*(1-2*C(:,k))./(1+2.5*C(:,k)).*C(:,k);
end

i clearly had to call a function that computes fluxes F from concentrations C at each time step k. In this case the elapsed time is 4.032 seconds. Still a bad time since it is 56 times slower than the vectorized case without function (t=0.072 s, see above). Any suggestions to improve the performance with what a very expensive product has built in, without recurring to extra tools?
thank you very much for your answers.
alberto

Subject: Incredibly slowed down code when calling a function. How to

From: TideMan

Date: 23 Jan, 2012 23:53:38

Message: 8 of 12

On Jan 24, 11:19 am, "James Tursa"
<aclassyguy_with_a_k_not_...@hotmail.com> wrote:
> TideMan <mul...@gmail.com> wrote in message <fd1790f7-01e1-4127-9968-7ad54ea37...@b23g2000yqn.googlegroups.com>...
>
> >  Indeed, I regularly convert algorithms I've developed in
> > Matlab into Fortran by simply copying line by line, removing the
> > semicolons.
>
> You don't need to remove the semi-colons, since they are valid Fortran syntax.
>
> James Tursa

How about that!
I've been using Fortran since 1967 and I never knew that.

Subject: Incredibly slowed down code when calling a function. How to

From: Alberto

Date: 24 Jan, 2012 00:18:09

Message: 9 of 12

Thanks dpb. i understand matlab favours to reduce prototyping development time, but having a code that is 50-100 time slower is a big gap. i can understand 2-3 times slower. Anyways i tried different options. I tried to avoid to store all the time steps and therefore i now declared F and C as vector,not array. I was expecting that with this trick i was improving something, but actually things are even worst. With this code

Niter =1000
Nlayers = 1000
C=rand(Nlayers);
F=zeros(Nlayers);
tic
for k=1:Niter %time iterations
        F=fluxesXmathworksVECTnoMATRIX(C,F);
    %TIME UPDATE
end
toc

function F = fluxesXmathworksVECTnoMATRIX(C,F)
    F=(1-C).^3.*(1-2*C)./(1+2.5*C).*C;
end

the elapsed time is now 74.232 seconds!! So in the case one person does not need to store the time steps, the code is 1000 times slower than the case without function. i then tried to follow your suggestion to declare F as global. If i do it in without storing time steps k i have that the code looks like

Niter =1000
Nlayers = 1000
global F
C=rand(Nlayers);%,Niter
F=zeros(Nlayers);%,Niter
for k=1:Niter %time iterations
       fluxesXmathworksVECTnoMATRIX(C);
    %TIME UPDATE
end
toc

function fluxesXmathworksVECTnoMATRIX(C)
     global F
     F=(1-C).^3.*(1-2*C)./(1+2.5*C).*C;
end

and the elapsed time is now 75.928 seconds. If I store the time steps as:

Niter =1000
Nlayers = 1000
global F
tic
C=rand(Nlayers,Niter);
F=zeros(Nlayers,Niter);
for k=1:Niter %time iterations
        fluxesXmathworksVECTnoMATRIX(C,k);
    %TIME UPDATE
end
toc

function fluxesXmathworksVECTnoMATRIX(C,k)
     global F
     F(:,k)=(1-C(:,k)).^3.*(1-2*C(:,k))./(1+2.5*C(:,k)).*C(:,k);
end

the elapsed time is now 0.108000 seconds!! So it looks that global does the trick, but only if you store all the results. Why is this happening? For very long application you could run out of memory for storing all time steps (but it is useful for me now that i am in a debugging phase). I was not able to find a trick for speed the code up without storing the time steps. If I do not store them and i use a scratch variable Fprov like:


Niter =1000
Nlayers = 1000
tic
C=rand(Nlayers);%,Niter
F=zeros(Nlayers);%,Niter
Fprov = zeros(Nlayers);%
for k=1:Niter %time iterations
        Fprov = fluxesXmathworksVECTnoMATRIX(C);
        F=Fprov
    %TIME UPDATE
end
toc

function Fprov = fluxesXmathworksVECTnoMATRIX(C)
     Fprov=(1-C).^3.*(1-2*C)./(1+2.5*C).*C;
end



 it still takes 72 seconds. Anybody knows what is going on here? Why is it so slow?any way to have it fast without storing the k time steps?
thanks
Alberto

Subject: Incredibly slowed down code when calling a function. How to

From: dpb

Date: 24 Jan, 2012 01:00:58

Message: 10 of 12

On 1/23/2012 6:18 PM, Alberto wrote:
> Thanks dpb. i understand matlab favours to reduce prototyping
> development time, but having a code that is 50-100 time slower is a big
> gap. i can understand 2-3 times slower.

Well, perhaps but it all depends on what you choose to use as the
comparison. You've picked a very particular case of writing code that
works in a way that is against Matlab's choice of how to operate--that
is, having a function alter an argument array.

> ...Anyways i tried different
> options. I tried to avoid to store all the time steps and therefore i
> now declared F and C as vector,not array. I was expecting that with this
> trick i was improving something, but actually things are even worst.
> With this code
>
> Niter =1000
> Nlayers = 1000
> C=rand(Nlayers); F=zeros(Nlayers);

You _DO_ realize that after the above that C and F are 1000x1000 each
don't you?

doc rand
doc zeros

If you want vectors, you need to say so...

C=rand(Nlayers,1);
F=zeros(Nlayers,1);

  tic
> for k=1:Niter %time iterations
> F=fluxesXmathworksVECTnoMATRIX(C,F);
> %TIME UPDATE
> end
> toc
>
> function F = fluxesXmathworksVECTnoMATRIX(C,F)
> F=(1-C).^3.*(1-2*C)./(1+2.5*C).*C; end

Has the same problem of copy on write in the function arguments as well
as the dimensions are bigger than you think which has to be part of the
timing problem as well...

...

> ... i then tried to follow your suggestion to declare
> F as global. If i do it in without storing time steps k i have that the
> code looks like
>
   C=rand(Nlayers);%,Niter
   F=zeros(Nlayers);%,Niter

Same problem on dimensions above...
> for k=1:Niter %time iterations
> fluxesXmathworksVECTnoMATRIX(C);
> %TIME UPDATE
> end
> toc
>
> function fluxesXmathworksVECTnoMATRIX(C)
> global F
     F=(1-C).^3.*(1-2*C)./(1+2.5*C).*C;

  end
>
> and the elapsed time is now 75.928 seconds. If I store the time steps as:
>
> Niter =1000
> Nlayers = 1000
> global F
> tic
> C=rand(Nlayers,Niter);
> F=zeros(Nlayers,Niter);
> for k=1:Niter %time iterations
> fluxesXmathworksVECTnoMATRIX(C,k);
> %TIME UPDATE
> end
> toc
>
> function fluxesXmathworksVECTnoMATRIX(C,k)
> global F F(:,k)=(1-C(:,k)).^3.*(1-2*C(:,k))./(1+2.5*C(:,k)).*C(:,k); end
>
> the elapsed time is now 0.108000 seconds!! So it looks that global does
> the trick, but only if you store all the results. Why is this happening?

I think there's something else happening...fix your dimensions, restart
and try again. If you don't futz around w/ the argument array it
shouldn't make as much difference I wouldn't think.

There is some overhead in the loop calling the function but shouldn't be
excessive unless you're causing a copy to occur.

--

Subject: Incredibly slowed down code when calling a function. How to

From: Alberto

Date: 24 Jan, 2012 02:43:09

Message: 11 of 12

Thanks dpb. I forgot that. So if i now initialize correctly with rand and zeros and i do not save time steps

Niter =1000
Nlayers = 1000
C=rand(Nlayers,1);
F=zeros(Nlayers,1);
tic
for k=1:Niter %time iterations
        F=fluxesXmathworksVECTnoMATRIX(C,F);
    %TIME UPDATE
end
toc

function F = fluxesXmathworksVECTnoMATRIX(C,F)
    F=(1-C).^3.*(1-2*C)./(1+2.5*C).*C;
end

the elapsed time is 0.115 seconds. The same if I declare F as global ( I launched them 10 times each and the average is always about 0.115 secs). Declaring F as global helps a bit if I store the time steps , infact if i do not declare F as global i have that the code reads:

Niter =1000
Nlayers = 1000
tic
C=rand(Nlayers,Niter);
F=zeros(Nlayers,Niter);
for k=1:Niter %time iterations
       F(:,k) = fluxesXmathworksVECT(C,k);
    %TIME UPDATE
end
toc

function F = fluxesXmathworksVECT(C,k)
     F = (1-C(:,k)).^3.*(1-2*C(:,k))./(1+2.5*C(:,k)).*C(:,k);
end

and the elapsed time is 0.144 seconds (in the previous message i erroneously called the function writing F=fluxesXmathworksVECT(C,k) instead of F(:,k)=fluxesXmathworksVECT(C,k) that is way it took 4 seconds). If I declare F global:

Niter =1000
Nlayers = 1000
global F
tic
C=rand(Nlayers,Niter);
F=zeros(Nlayers,Niter);
for k=1:Niter %time iterations
       fluxesXmathworksVECT(C,k);
    %TIME UPDATE
end
toc

function fluxesXmathworksVECT(C,k)
global F
     F(:,k) = (1-C(:,k)).^3.*(1-2*C(:,k))./(1+2.5*C(:,k)).*C(:,k);
end

the elapsed time is 0.108 seconds. So it looks like the code is just a bit faster with F declared as global (0.144/0.108=1.33 times faster). So maybe its better no to use global variables (and so have more general functions) and lose a 30% of speed. Do you guys agree? What do you guys think that is the cause of this loss of performance?the fact that if F is not global, the matlab function creates a copy of F? Copy that it is not created if global? I think that there is nothing to do to improve this, except resorting to mex routines as james suggested. What is the best documentation of mex routines to start with?do i have basically to create a fortran subroutine and call it from matlab?thanks

Subject: Incredibly slowed down code when calling a function. How to

From: Bruno Luong

Date: 24 Jan, 2012 07:29:10

Message: 12 of 12

"Alberto" wrote in message <jfl5rt$1ra$1@newscl01ah.mathworks.com>...

> the elapsed time is 0.108 seconds. So it looks like the code is just a bit faster with F declared as global (0.144/0.108=1.33 times faster). So maybe its better no to use global variables (and so have more general functions) and lose a 30% of speed. Do you guys agree? What do you guys think that is the cause of this loss of performance?the fact that if F is not global, the matlab function creates a copy of F? Copy that it is not created if global?

No, the speed gain is not due to F is global, but rather F is no longer pass as OUTPUT: otherwise F is created during function is performed, then copy into the right column.

That overhead cost time. That is the reason one avoid to call function inside a loop.

>I think that there is nothing to do to improve this,

As C(:,k)) appears 4 times in RHS, Matlab _might_ actually create 4 copies of C(:,k). Matlab might be smart enough to optimize it, but I have no idae. You could try to create the column once, and use later:

Ck = C(:,k)
 F(:,k) = (1-Ck).^3.*(1-2*Ck)./(1+2.5*Ck).*Ck;

> except resorting to mex routines as james suggested. What is the best documentation of mex routines to start with?do i have basically to create a fortran subroutine and call it from matlab?thanks

Start to read the MEX tutorial and examples.

Bruno (author of this inplacevector package)

Tags for this Thread

Add a New Tag:

Separated by commas
Ex.: root locus, bode

What are tags?

A tag is like a keyword or category label associated with each thread. Tags make it easier for you to find threads of interest.

Anyone can tag a thread. Tags are public and visible to everyone.

rssFeed for this Thread

Contact us at files@mathworks.com