Discover MakerZone

MATLAB and Simulink resources for Arduino, LEGO, and Raspberry Pi

Learn more

Discover what MATLAB® can do for your career.

Opportunities for recent engineering grads.

Apply Today

Thread Subject:
split one line computation without duplicating data

Subject: split one line computation without duplicating data

From: Juliette Salexa

Date: 9 Aug, 2010 19:58:04

Message: 1 of 16

Hello,

The profiler says that this line of my code is taking up 50% of the time, (it's called many times in a forloop):

A=reshape(bsxfun(@times,reshape(sum(reshape(A,[],N^2),2),1,[]),I),[],1);

where size(I)=size(A) = [N^12 , 1]

Someone (Walter Roberson) once said on this Newsgroup that executing many functions on one line like this is slower than doing them on separate lines, because the JIT compiler finds it more difficult to optimize the code if it's all done on one line.

My concern is that splitting this one-line computation into separate lines would involve making new variables, and I don't have enough memory for this.

Is there a way to speed up a line like this without increasing the amount of memory used ?

I remember reading a blog post by Loren Shure which taught us how to call functions with input variables without duplicating the memory used, but I search and can't find it anymore.

Any discussion about this would be very appreciated,
Juliette

Subject: split one line computation without duplicating data

From: us

Date: 9 Aug, 2010 20:15:27

Message: 2 of 16

"Juliette Salexa" <juliette.physicist@gmail.com> wrote in message <i3pmkc$oj5$1@fred.mathworks.com>...
> Hello,
>
> The profiler says that this line of my code is taking up 50% of the time, (it's called many times in a forloop):
>
> A=reshape(bsxfun(@times,reshape(sum(reshape(A,[],N^2),2),1,[]),I),[],1);
>
> where size(I)=size(A) = [N^12 , 1]
>
> Someone (Walter Roberson) once said on this Newsgroup that executing many functions on one line like this is slower than doing them on separate lines, because the JIT compiler finds it more difficult to optimize the code if it's all done on one line.
>
> My concern is that splitting this one-line computation into separate lines would involve making new variables, and I don't have enough memory for this.
>
> Is there a way to speed up a line like this without increasing the amount of memory used ?
>
> I remember reading a blog post by Loren Shure which taught us how to call functions with input variables without duplicating the memory used, but I search and can't find it anymore.
>
> Any discussion about this would be very appreciated,
> Juliette

a hint:
- we simply look at things with the help of...

     help profile;

us

Subject: split one line computation without duplicating data

From: Walter Roberson

Date: 9 Aug, 2010 20:15:15

Message: 3 of 16

Juliette Salexa wrote:

> The profiler says that this line of my code is taking up 50% of the
> time, (it's called many times in a forloop):
>
> A=reshape(bsxfun(@times,reshape(sum(reshape(A,[],N^2),2),1,[]),I),[],1);
> where size(I)=size(A) = [N^12 , 1]

Could you confirm that that size() you gave is N^12 and not N^2 ? That you
are, in other words, expecting that innermost reshape to result in a matrix
which is N^10 by N^2 and that summing along the second dimension would thus
result in a column vector of N^10 elements? Which you then flip over to be a
row vector 1 x N^10 and use bsxfun to produce the product of all pairs of that
row vector together with the column vector I which is N^12 x 1, thus producing
a matrix which is N^12 by N^10 ? and then you convert the result into a single
column vector? If so, then I'm not surprised that it is taking a long time for
any non-trivial N.

In your "for" loop, what is varying? I'm wondering about the possibility of
pre-calculating some of the results. Also, is that exact output order
important? If not, then it might be possible to eliminate one of two of the
transposes.

Subject: split one line computation without duplicating data

From: Sean

Date: 9 Aug, 2010 20:17:05

Message: 4 of 16

"Juliette Salexa" <juliette.physicist@gmail.com> wrote in message <i3pmkc$oj5$1@fred.mathworks.com>...
> Hello,
>
> The profiler says that this line of my code is taking up 50% of the time, (it's called many times in a forloop):
>
> A=reshape(bsxfun(@times,reshape(sum(reshape(A,[],N^2),2),1,[]),I),[],1);
>
> where size(I)=size(A) = [N^12 , 1]
>
> Someone (Walter Roberson) once said on this Newsgroup that executing many functions on one line like this is slower than doing them on separate lines, because the JIT compiler finds it more difficult to optimize the code if it's all done on one line.
>
> My concern is that splitting this one-line computation into separate lines would involve making new variables, and I don't have enough memory for this.
>
> Is there a way to speed up a line like this without increasing the amount of memory used ?
>
> I remember reading a blog post by Loren Shure which taught us how to call functions with input variables without duplicating the memory used, but I search and can't find it anymore.
>
> Any discussion about this would be very appreciated,
> Juliette

What size does this line result in?
>>reshape(sum(reshape(A,[],N^2),2),1,[])
I've been trying it and ending up with scalars which removes the need for bsxfun entirely. I may be (am probably ;-) ) creating bad data though. Can you produce a small amount of sample data?

Subject: split one line computation without duplicating data

From: Walter Roberson

Date: 9 Aug, 2010 20:25:14

Message: 5 of 16

Juliette Salexa wrote:

> My concern is that splitting this one-line computation into separate
> lines would involve making new variables, and I don't have enough memory
> for this.
>
> Is there a way to speed up a line like this without increasing the
> amount of memory used ?

Each intermediate step in a computation creates an (unnamed) variable that is
used for the computation purposes. If that variable is the result of the last
step in the computation and it is being assigned to a plain variable, then any
previous plain variable by that name is discarded and the name is associated
with the header for that unnamed variable _without copying the data_. If the
unnamed variable is not the last step in the computation, then at some point
(probably before the next line) the unnamed variable is released.

Thus, there is virtually no cost to creating a temporary variable of
intermediate results and clear'ing that variable once it is no longer needed.
Trivial extra memory is used: all that happens is that a name slot gets
associated with the data block that is already there anyhow.

The one disadvantage of this approach is that sometimes Matlab has fast
routines for some more complex operations -- for example A*B+C can be done
more quickly than T1 = A * B; T2 = T1 + C; clear T1

Subject: split one line computation without duplicating data

From: Andy

Date: 9 Aug, 2010 20:43:06

Message: 6 of 16

Some food for thought:


    a=rand(10000,1);

    tic
    b=reshape(a,100,100);
    toc

    clear b;
    tic
    b=reshape(a,100,[]);
    toc

    clear b;
    tic
    b=reshape(a,[],100);
    toc

    %{
    displays:
    Elapsed time is 0.000015 seconds.
    Elapsed time is 0.000328 seconds.
    Elapsed time is 0.000313 seconds.
    %}

Implicit arguments to reshape take more than 20 times longer. Also:

    a=rand(10000,1);

    clear b;
    tic
    b=reshape(a,1,[]); % this is what you did
    toc
    
    tic
    b=reshape(a,1,10000); % without implicit arguments
    toc

    clear b;
    tic
    b=a.'; % fastest and clearest
    toc

    %{
    displays:
    Elapsed time is 0.001034 seconds.
    Elapsed time is 0.000019 seconds.
    Elapsed time is 0.000012 seconds.
    %}

Using the transpose operator is about 86 times faster.

Subject: split one line computation without duplicating data

From: Matt Fig

Date: 9 Aug, 2010 20:50:21

Message: 7 of 16

Walter Roberson <roberson@hushmail.com> wrote in message
> Each intermediate step in a computation creates an (unnamed) variable that is
> used for the computation purposes. If that variable is the result of the last
> step in the computation and it is being assigned to a plain variable, then any
> previous plain variable by that name is discarded and the name is associated
> with the header for that unnamed variable _without copying the data_. If the
> unnamed variable is not the last step in the computation, then at some point
> (probably before the next line) the unnamed variable is released.
>
> Thus, there is virtually no cost to creating a temporary variable of
> intermediate results and clear'ing that variable once it is no longer needed.
> Trivial extra memory is used: all that happens is that a name slot gets
> associated with the data block that is already there anyhow.
>
> The one disadvantage of this approach is that sometimes Matlab has fast
> routines for some more complex operations -- for example A*B+C can be done
> more quickly than T1 = A * B; T2 = T1 + C; clear T1

What about using the same variable name?

E = sum(reshape(A,size(A,1)/N^2,N^2),2);
E = bsxfun(@times,E.',I);
E = E(:);

Subject: split one line computation without duplicating data

From: Andy

Date: 9 Aug, 2010 21:01:20

Message: 8 of 16

"Matt Fig" <spamanon@yahoo.com> wrote in message <i3ppmd$art$1@fred.mathworks.com>...
> Walter Roberson <roberson@hushmail.com> wrote in message
> > Each intermediate step in a computation creates an (unnamed) variable that is
> > used for the computation purposes. If that variable is the result of the last
> > step in the computation and it is being assigned to a plain variable, then any
> > previous plain variable by that name is discarded and the name is associated
> > with the header for that unnamed variable _without copying the data_. If the
> > unnamed variable is not the last step in the computation, then at some point
> > (probably before the next line) the unnamed variable is released.
> >
> > Thus, there is virtually no cost to creating a temporary variable of
> > intermediate results and clear'ing that variable once it is no longer needed.
> > Trivial extra memory is used: all that happens is that a name slot gets
> > associated with the data block that is already there anyhow.
> >
> > The one disadvantage of this approach is that sometimes Matlab has fast
> > routines for some more complex operations -- for example A*B+C can be done
> > more quickly than T1 = A * B; T2 = T1 + C; clear T1
>
> What about using the same variable name?
>
> E = sum(reshape(A,size(A,1)/N^2,N^2),2);
> E = bsxfun(@times,E.',I);
> E = E(:);

Note: it was mentioned that this line is being called many times in a loop. Assuming A is not changing size on each iteration, call size(A,1) once outside the loop to save a little more time.

Subject: split one line computation without duplicating data

From: Matt Fig

Date: 9 Aug, 2010 21:08:19

Message: 9 of 16

"Andy " <myfakeemailaddress@gmail.com> wrote in message
> Note: it was mentioned that this line is being called many times in a loop. Assuming A is not changing size on each iteration, call size(A,1) once outside the loop to save a little more time.

Ah yes, I meant that as a place holder only. If A is always the same size, these values should be put in directly instead of either calculating them explicitly as I did, or using [] as the OP did.

Subject: split one line computation without duplicating data

From: Juliette Salexa

Date: 10 Aug, 2010 00:25:22

Message: 10 of 16

Walter Roberson <roberson@hushmail.com> wrote in message
> Could you confirm that that size() you gave is N^12 and not N^2 ?

Yes. That's not a typo.

> That you
> are, in other words, expecting that innermost reshape to result in a matrix
> which is N^10 by N^2 and that summing along the second dimension would thus
> result in a column vector of N^10 elements? Which you then flip over to be a
> row vector 1 x N^10 and use bsxfun to produce the product of all pairs of that
> row vector together with the column vector I which is N^12 x 1, thus producing
> a matrix which is N^12 by N^10 ?

I'm VERY sorry, I made a crucial mistake in my description.
size(I) = [N^2, N^10]

so the size of A never changes. The summation makes it a factor of N^2 smaller, then the bsxfun returns it to its original size. This happens repeatedly.

> In your "for" loop, what is varying? I'm wondering about the possibility of
> pre-calculating some of the results. Also, is that exact output order
> important? If not, then it might be possible to eliminate one of two of the
> transposes.

The forloop looks like:

for i=1:100
 A=reshape(bsxfun(@times,reshape(sum(reshape(A,[],N^2),2),1,[]),I),[],1);
 B(i)=summation of certain elements of A
end

I'm not exactly sure what can be precalculated.
Perhaps powers of I ?
I'm not sure if that helps, since each power of I has to be independently multiplied by the summed values of A anyway.

As for the transposes, Doug Schwarz once told me on this newsgroup:

"Reshape uses negligible time. All it does is change the array header that stores how many rows and columns there are. It doesn't move any data. It is always very fast no
matter how large the argument. Always. (Did I emphasize that enough?)"

so I'm not sure how much time could be saved by eliminating transposes.

==================

"Sean " <sean.dewolski@nospamplease.umit.maine.edu> wrote in message

> What size does this line result in?
>>reshape(sum(reshape(A,[],N^2),2),1,[])
> I've been trying it and ending up with scalars which removes the need for bsxfun > >entirely. I may be (am probably ;-) ) creating bad data though. Can you produce a >small amount of sample data?

It should just be the original size of A, divided by N^2 (and transposed)

If you try: reshape(sum(reshape(ones(256,1),[],4),2),1,[])
you'll get a 1,64 vector (not a scalar)

==================
Walter Roberson <roberson@hushmail.com> wrote in message

> for example A*B+C can be done
> more quickly than T1 = A * B; T2 = T1 + C; clear T1

if I have 32GB of memory, and
If A,B, and C are 10GB matrices (30GB is being used and I don't have space for a 4th 10GB array)

I would hope that doing
A=A*B+C

would not create any new arrays taking up memory. The 10GB worth of data in A should just get modified, twice.

Doing T1 = A * B; T2 = T1 + C; clear T1 would be impossible because although T1 is meant to be cleared very soon, the computer ran out of memory before it had a chance to be cleared.

Are you saying that at each step of:

A=reshape(bsxfun(@times,reshape(sum(reshape(A,[],N^2),2),1,[]),I),[],1);

a temporary variable is created ?

I think you're right.
Although I'm convinced that I remember seeing a blog post by Loren Shure which described how to avoid this, but for some reason I can't find it anymore =(

I'm hoping there's a way to avoid that temporary variable because theoretically it's not necessary (if the data in A is just modified rather than keeping both A and the result), and due to the size of A and I, my computer will run out of memory if an unnecessary temporary variable is created.
=====================
"Andy " <myfakeemailaddress@gmail.com> wrote in message

>Using the transpose operator is about 86 times faster.

But the bsxfun and sum take in total about 5000 seconds, so I don't think using [] verses precalculating will make a difference.

And I was recommended reshape vs transpose because reshape doesn't move data, while I've been told that the transpose is a nightmare for large matrices since matlab likes to store things column-wise (but I don't know this stuff as well as I wish I did!)



====
thanks again everyone that replied,
I'll keep looking for that blog post which I'm convinced I've seen before.

Subject: split one line computation without duplicating data

From: Walter Roberson

Date: 10 Aug, 2010 04:03:36

Message: 11 of 16

Juliette Salexa wrote:

> The forloop looks like:
>
> for i=1:100
> A=reshape(bsxfun(@times,reshape(sum(reshape(A,[],N^2),2),1,[]),I),[],1);
> B(i)=summation of certain elements of A
> end

The calculation of A is not dependent upon the value of i, and thus the
entire calculation of A can be taken out of the loop.

Subject: split one line computation without duplicating data

From: Juliette Salexa

Date: 10 Aug, 2010 11:47:05

Message: 12 of 16

Walter Roberson <roberson@hushmail.com> wrote in message <sc48o.3830$1v3.2484@newsfe20.iad>...
> Juliette Salexa wrote:
>
> > The forloop looks like:
> >
> > for i=1:100
> > A=reshape(bsxfun(@times,reshape(sum(reshape(A,[],N^2),2),1,[]),I),[],1);
> > B(i)=summation of certain elements of A
> > end
>
> The calculation of A is not dependent upon the value of i, and thus the
> entire calculation of A can be taken out of the loop.

Hello, I'm not sure what you mean here.

My forloop is like:

i=1:100
A=A+A^3
B(i)=A
end

The calculation doesn't explicitly depend on i, but if I take it out of the forloop, I won't be able to get the next value of B (i think..)

Subject: split one line computation without duplicating data

From: Matt Fig

Date: 10 Aug, 2010 13:41:04

Message: 13 of 16

Walter Roberson <roberson@hushmail.com> wrote in message
> The calculation of A is not dependent upon the value of i, and thus the
> entire calculation of A can be taken out of the loop.

Walter,
Though the value of A is not dependent directly on the loop index, the value of A may change with each index since the value of A at each iteration is dependent on the value of A in the previous iteration.

Juliette,
I think we may need to know more about what goes on with B in order to help optimize your code. Do the values in A tend asymptotically towards a value, or oscillate about some value?

Subject: split one line computation without duplicating data

From: Andy

Date: 10 Aug, 2010 14:03:03

Message: 14 of 16

> Although I'm convinced that I remember seeing a blog post by Loren Shure which described how to avoid this, but for some reason I can't find it anymore =(

Could it have been one of these?:

http://blogs.mathworks.com/loren/2007/03/22/in-place-operations-on-data/
http://blogs.mathworks.com/loren/2006/05/10/memory-management-for-functions-and-variables/
http://blogs.mathworks.com/loren/2006/03/01/more-on-expansion-arrayfun/

I also agree with Matt Fig that we need more information about B. If you only need certain elements of A in each iteration, it's possible there's a way to calculate only those elements, which could save lots of time and space.

Subject: split one line computation without duplicating data

From: Juliette Salexa

Date: 15 Aug, 2010 22:13:03

Message: 15 of 16

"Andy " <myfakeemailaddress@gmail.com> wrote in message <i3rm6n$ea9$1@fred.mathworks.com>...
>
> Could it have been one of these?:
>
> http://blogs.mathworks.com/loren/2007/03/22/in-place-operations-on-data/


YES! Thank you Andy, the first one was exactly the one I was looking for! Excellent!

So according to that, my line was okay in terms of not duplicating any memory,
unless bsxfun makes an unnecessarily big temporary array within the function (I know some functions like KRON are designed to work for the most general case and consequently can be very suboptimal for simpler cases).

As for my original question,
I was worried that doing this in one line:

A=reshape(bsxfun(@times,reshape(sum(reshape(A,[],N^2),2),1,[]),I),[],1);

would be slow since the JIT compiler finds it hard to optimize nested functions like this.
And I had asked:

"My concern is that splitting this one-line computation into separate lines would involve making new variables, and I don't have enough memory for this.

Is there a way to speed up a line like this without increasing the amount of memory used ?"

So i think Matt Fig's suggestion was the closest, if E was replaced by A:

A = sum(reshape(A,size(A,1)/N^2,N^2),2);
A = bsxfun(@times,A.',I);
A = A(:);

I replaced the E by the A so that the largest amount of memory ever used is the size of A plus the size of I when A is largest. Creating a new variable E would take up more memory.

Although I hope that by setting the first line equal to A, matlab isn't making more memory available because of the reduced size of A, and then reallocating that memory to A in the next line, since that seems it would take a lot of time.

==========

"Matt Fig" <spamanon@yahoo.com> wrote in message >

> Juliette,
> I think we may need to know more about what goes on with B in order to help
> optimize your code.

for J=1:1000
A=reshape(bsxfun(@times,reshape(sum(reshape(A,[],N^2),2),1,[]),I),N^2,[]);
B(diagonals,J+1)=sum(A(diagonals,:),2);
end

where,
diagonals=diag(reshape(1:N^2,N,N));


> Do the values in A tend asymptotically towards a value, or
> oscillate about some value?

Hmm...
In some cases they oscillate around a value.
In other cases they appear to have periodic behaviour, but not oscillating around a value (you can imagine a bunch of local maxima and minima occurring periodically, but where the maxima and minima both decrease after each period), and then eventually converge to an asymptote.

But I usually don't let the forloop go long enough to see it converge to the asymptote.
Does knowing the long term behavior of A allow for its calculation to be sped up ??



Thanks again to everyone,
Juliette

Subject: split one line computation without duplicating data

From: Andy

Date: 16 Aug, 2010 12:59:11

Message: 16 of 16

Well, an obvious area for (small) improvement:

% re: diagonals
N=1e4;
tic;
A=reshape(1:N^2,N,N); % your method
A=diag(A);
toc

tic
B=1:N+1:N^2; % much faster
toc

isequal(A',B) % note the transpose

% displays:
Elapsed time is 0.749404 seconds.
Elapsed time is 0.000084 seconds.
ans =
     1


Could you give a small sample of input A, B, and I?

Tags for this Thread

What are tags?

A tag is like a keyword or category label associated with each thread. Tags make it easier for you to find threads of interest.

Anyone can tag a thread. Tags are public and visible to everyone.

Contact us