Thread Subject: Can we really pre-allocate a cell array?

From: Joaquim Luis

Date: 10 Jan, 2008 19:19:02

Message: 1 of 11

Hello,

The following code reproduces a situation where, by
mistake, it was run with relatively high m and n, and it
took a lot of time. The profiler didn't indicate any obvious
bottleneck.

celular = cell(m,n);
for i=1:m
    for j=1:n
        celular{i,j} = construct_a_string_var(i,j);
    end
end

I think the problem here is one of preallocation. The
instruction
celular = cell(m,n);
is good to shut up MLint, but it cannot be of much use
because the bulk of the memory needed is the memory that
will be used by the cell contents, and that was not declared.
So my question is: is there an efficient way of
preallocating a cell array?

Thanks

Joaquim Luis

From: Steve Eddins

Date: 10 Jan, 2008 19:41:36

Message: 2 of 11

I don't see anything in your code that suggests a preallocation issue.
Is the total execution time proportional to m*n? Does the time required
to execute construct_a_string_var() depend on the specific values of i
and j?

Steve Eddins
http://blogs.mathworks.com/steve/

From: roberson@ibd.nrc-cnrc.gc.ca (Walter Roberson)

Date: 10 Jan, 2008 19:48:04

Message: 3 of 11

In article <fm5r35$4be$1@fred.mathworks.com>,
Joaquim Luis <jluis@--ualg--.pt> wrote:
>I think the problem here is of preallocation. The
>instruction
>celular = cell(m,n);
>is good to shut up MLint but it cannot be of much use
>because the bulk memory needed is the one that will be used
>by the array cells, and that was not declared.

Preallocating the cell array itself saves up to m*n-1 reallocations
and the copying of up to m*n*(m*n-1)/2 storage units (the worst case,
where the array grows one element at a time). These operations can
really add up.
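
A minimal sketch of the container cost described above, assuming a cell vector that grows one element at a time. The size N and the variable names are illustrative and not from the original post, and the timings depend on the machine and the MATLAB release.

```matlab
% Worst-case counts for the 600-by-600 case discussed in this thread.
K = 600 * 600;
reallocs = K - 1;               % one reallocation per appended element
copies   = K * (K - 1) / 2;     % 1 + 2 + ... + (K-1) copied slots

N = 20000;                      % illustrative size for the timing below

% Grown one element at a time: the slot array may be reallocated
% and copied on each append.
tic
grown = {};
for k = 1:N
    grown{end+1} = sprintf('%d', k);
end
tGrown = toc;

% Preallocated container: only the slots are reserved up front;
% each string is still allocated when it is created.
tic
prealloc = cell(1, N);
for k = 1:N
    prealloc{k} = sprintf('%d', k);
end
tPrealloc = toc;

fprintf('worst case: %d reallocations, %.0f slot copies\n', reallocs, copies);
fprintf('grown: %.3f s, preallocated: %.3f s\n', tGrown, tPrealloc);
```

Both loops produce the same contents; only the handling of the container differs. Newer MATLAB releases may narrow the gap between the two timings.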


>So my question is, is there an efficient way of pre-
>allocating a cell array?

I take it that you mean by that the memory that will be used to
store the cell contents.

Your example had

>for i=1:m
> for j=1:n
> celular{i,j} = construct_a_string_var(i,j);
> end
>end

which implied variable length strings constructed at runtime; in
such a case, Matlab would not be able to predict how much storage to
allocate.

If you knew how big each cell entry was going to be... the naive
answer that springs to mind is to repmat() a representative cell.
However, because of the way that Matlab shares storage, I am not
sure that would help. (But there must be a limit to the way that
Matlab shares storage, as otherwise pre-allocations with
zeros() or ones() would end up sharing the columns of 0's or 1's
and that wouldn't lead to efficient preallocation.)
--
   "No one has the right to destroy another person's belief by
   demanding empirical evidence." -- Ann Landers

From: Doug Schwarz

Date: 10 Jan, 2008 19:52:55

Message: 4 of 11

It seems to me that no matter what you do in your double loop, your
construct_a_string_var function has to allocate the memory for its
output anyway. Putting that into a cell doesn't require any additional
memory allocation. MATLAB should just copy the pointer to the data into
the cell.

Try this and see if it's faster:

  for i=1:m
      for j=1:n
          construct_a_string_var(i,j);
      end
  end

I'm guessing it won't be.
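
A rough way to run that comparison with tic/toc, as a sketch only. The handle make_str is a hypothetical stand-in for construct_a_string_var (whose code was not posted), and the sizes are illustrative.

```matlab
m = 200; n = 200;                        % illustrative sizes

% Hypothetical stand-in for construct_a_string_var (an assumption).
make_str = @(i, j) dec2bin(i + j, 16);

% Loop 1: build each string and discard it.
tic
for i = 1:m
    for j = 1:n
        make_str(i, j);
    end
end
tNoStore = toc;

% Loop 2: build each string and store it in a preallocated cell.
tic
celular = cell(m, n);
for i = 1:m
    for j = 1:n
        celular{i, j} = make_str(i, j);
    end
end
tStore = toc;

fprintf('discarded: %.3f s, stored in cell: %.3f s\n', tNoStore, tStore);
```

If the two times are close, the cost lies in building the strings and not in storing them.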

--
Doug Schwarz
dmschwarz&ieee,org
Make obvious changes to get real email address.

From: Joaquim Luis

Date: 10 Jan, 2008 21:56:03

Message: 5 of 11

Steve Eddins <Steve.Eddins@mathworks.com> wrote in message
<fm5sde$le6$1@fred.mathworks.com>...

> I don't see anything in your code that suggests a
> preallocation issue.
> Is the total execution time proportional to m*n?

Yes

> Does the time required to execute construct_a_string_var()
> depend on the specific values of i and j?

It shouldn't matter. Code inside the function is all
vectorized and it's basically made of calls to dec2bin and
bin2dec.
 
I suspected a preallocation problem because the Windows Task
Manager shows a steady increase in the memory used by Matlab
for the whole time the function takes to run.

However, I tried without success to isolate a code extract
that reproduces the problem. I guess I need to investigate
this further.

Nevertheless, it still puzzles me how the preallocation of
a cell array can be done.
When we do this
c=cell(m,n);
we are allocating for the container, not for its contents.

----
Hmm, I just saw Doug's response.
Yes you are correct, it makes no difference.

Thanks

From: Joaquim Luis

Date: 10 Jan, 2008 23:13:02

Message: 6 of 11

OK, I managed to put together an example. The try/catch is
there only because this example sometimes breaks the
original algorithm.

If you monitor the memory consumption you'll notice that
towards the end (it takes a little while) the memory used by
the Matlab process starts to rise.

Joaquim Luis


function testa_aloca(m,n)
if (nargin < 2)
    m = 600; n = 600;
end
quadkey = {'0' '1'; '2' '3'};
celular = cell(m,n);
for i=1:m
    for j=1:n
        celular{i,j} = getNext('00001', quadkey, i, j);
    end
end

% ---------------------------------------------------
function new_quad = getNext(quad, quadkey, v, h)
N = numel(quad); quad_num = zeros(1, N);
try
    H = repmat(' ',1,N);
    new_quad = repmat(' ',1,N); % pre-allocations
    tmp = dec2bin(quad_num,2);
    V = tmp(:,1); H = tmp(:,2);

    if (h), H = dec2bin(bin2dec(H')+h, N); end
    if (v), V = dec2bin(bin2dec(V')+v, N); end
    % Test if we are getting out of +180. If yes, wrap the excess to -180
    if (numel(H) > N), H = dec2bin(h-1, N); end
    % Test if we are getting out of -180. If yes, wrap the excess to +180
    if ( H(1) == '/' )
        H = dec2bin(bin2dec(tmp(:,2)')+(2^N - 1), N);
    end

    new_tile_bin = [V(:) H(:)];
catch
    new_quad = '00001';
end

From: Peter Boettcher

Date: 10 Jan, 2008 23:16:37

Message: 7 of 11

Pre-allocation is important only in order to avoid dynamically resizing
an array many times. No such dynamic resizing occurs during the storage of your
function output into the cell.

-Peter

From: Peter Boettcher

Date: 10 Jan, 2008 23:21:53

Message: 8 of 11

Well, each result that is stored into celular does indeed take more
memory. So it is not a surprise that memory consumption increases. The
size for each element will be the size of the data, plus the size of an
mxArray header. This used to be 100 bytes, but is probably now somewhat
bigger. So 600^2 * 100 means at least 36 MB in the data structure
itself, not counting the actual data.

But preallocation won't help you here. Each chunk of memory is
allocated exactly once, inside your getNext function. Doing it all at
the beginning would not make any difference.
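
A small sketch for checking these numbers on a given installation, using whos to compare an empty container with a filled one. The variable names are illustrative, and the per-element overhead that whos reports varies by MATLAB version and platform, so no particular figure is assumed here.

```matlab
m = 600; n = 600;

empty_cells = cell(m, n);            % container only, no contents yet

filled = cell(m, n);
for k = 1:numel(filled)
    filled{k} = '00001';             % same 5-character string as in the thread
end

s_empty  = whos('empty_cells');
s_filled = whos('filled');

% Extra bytes per element once contents are stored (header plus data).
per_elem = (s_filled.bytes - s_empty.bytes) / numel(filled);

% The estimate from the post: 100 bytes of header per element.
header_estimate = m * n * 100;       % 36,000,000 bytes, about 36 MB

fprintf('container: %d bytes, filled: %d bytes\n', s_empty.bytes, s_filled.bytes);
fprintf('about %.0f extra bytes per element\n', per_elem);
fprintf('100-byte header estimate: %d bytes\n', header_estimate);
```

Note that whos counts each element as if its data were not shared, so the reported total is an upper bound on what the process actually holds.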

-Peter

From: Steve Eddins

Date: 11 Jan, 2008 13:47:47

Message: 9 of 11

Joaquim Luis wrote:
> Steve Eddins <Steve.Eddins@mathworks.com> wrote in message
> <fm5sde$le6$1@fred.mathworks.com>...
>
>> I don't see anything in your code that suggests a
>> preallocation issue.
>> Is the total execution time proportional to m*n?
>
> Yes

Then what's the problem? You have a loop which executes m*n times, and
you say the total execution time is proportional to m*n. Isn't that
what you'd expect?

Steve Eddins
http://blogs.mathworks.com/steve/

From: Joaquim Luis

Date: 11 Jan, 2008 15:21:03

Message: 10 of 11

Steve Eddins <Steve.Eddins@mathworks.com> wrote in message
<fm7s21$o8v$1@fred.mathworks.com>...
> Joaquim Luis wrote:
> > Steve Eddins <Steve.Eddins@mathworks.com> wrote in message
> > <fm5sde$le6$1@fred.mathworks.com>...
> >
> >> I don't see anything in your code that suggests a
> >> preallocation issue.
> >> Is the total execution time proportional to m*n?
> >
> > Yes
>
> Then what's the problem? You have a loop which executes m*n times, and
> you say the total execution time is proportional to m*n. Isn't that
> what you'd expect?

Well, maybe that "Yes" was given without much thinking. I mean
I didn't investigate whether there was a linear relation
between m*n and time. My main concern came from the fact
that, although I had preallocated, memory consumption still
increased steadily during the whole execution time. Previous
experience had taught me that this is normally due to
preallocation issues. Hence my raising the issue of
preallocating cell arrays.

J. Luis

From: Steve Eddins

Date: 11 Jan, 2008 16:33:54

Message: 11 of 11

I see. Your memory use is increasing each time through the loop because
you are creating a new array each time through the loop. That's
natural. There's no preallocation issue in the code you showed us.

Steve Eddins
http://blogs.mathworks.com/steve/
