Path: news.mathworks.com!newsfeed-00.mathworks.com!newscon06.news.prodigy.com!prodigy.net!news.maxwell.syr.edu!news-rtr.nyroc.rr.com!news-out.nyroc.rr.com!twister.nyroc.rr.com.POSTED!53ab2750!not-for-mail
From: John D'Errico <woodchips@rochester.rr.com>
Newsgroups: comp.soft-sys.matlab
Mail-Copies-To: nobody
Subject: Re: growing array in Matlab?
Organization: I'm not really very organized
References: <ddvkd9$7ar$1@ruby.cit.cornell.edu>
User-Agent: MT-NewsWatcher/3.4 (PPC Mac OS X)
Message-ID: <woodchips-EBB412.12532017082005@syrcnyrdrs-02-ge0.nyroc.rr.com>
Lines: 128
Date: Wed, 17 Aug 2005 16:53:20 GMT
NNTP-Posting-Host: 66.66.154.144
X-Complaints-To: abuse@rr.com
X-Trace: twister.nyroc.rr.com 1124297600 66.66.154.144 (Wed, 17 Aug 2005 12:53:20 EDT)
NNTP-Posting-Date: Wed, 17 Aug 2005 12:53:20 EDT
Xref: news.mathworks.com comp.soft-sys.matlab:296450


In article <ddvkd9$7ar$1@ruby.cit.cornell.edu>,
 "Stephen Vavasis" <vavasis@cs.cornell.edu> wrote:

> A well-known issue with Matlab is that it performs badly if you repeatedly 
> add elements to the end of an array like this:
> 
> a = [];
> while <condition>
>   a = [a; sin(x)];
> end
> 
>  (run time is quadratic -- see the timings at the end of this email).  If 
> you know in advance how large a should be, then you can simply declare it 
> before the loop: a = zeros(total_a_size,1).  But if you don't know the size, 
> you can use the following technique to implement arrays that grow 
> exponentially:

(snip)


> function a = testappend2(n)
> numtrial = 4;
> avetime = 0;
> for j = 1 : numtrial
>   tic
>   a = [];
>   a_size = 0;
>   for j = 1 : n
>     a_size = j;
>     if length(a) < a_size
>       a = [a;zeros(max(a_size-length(a),floor(length(a)*.4))+4,1)];
>       a(j) = sin(j);
>     end
>   end
>   a = a(1:a_size);
>   avetime = avetime + toc;
> end
> avetime = avetime / numtrial

I will note that testappend2 has a couple of serious bugs
in it. You should never use the same loop variable in
nested loops! You would see the problem by observing that
the two functions (testappend and testappend2) have
remarkably different outputs.

This might be a better test code:

function a = testappend2(n)
numtrial = 4;
avetime = 0;
for i = 1 : numtrial
  tic
  a = [];
  a_size = 0;
  for j = 1 : n
    a_size = j;
    if length(a) < a_size
      a = [a;zeros(max(a_size-length(a),floor(length(a)*.4))+4,1)];
    end
    a(j) = sin(j);
  end
  a = a(1:a_size);
  avetime = avetime + toc;
end
avetime = avetime / numtrial



The timings that you provide are not valid due to
those bugs. You might look at the function grow_array.
I wrote it some time ago in an attempt to speed up just
this process, allowing the user to append new chunks
more efficiently. I had also used a block allocation
scheme much like yours for a while, but since some of
my codes had to append many hundreds of thousands of
times I found it also inefficient.

http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=
7490&objectType=file

I used cell arrays in an attempt to speed up the
concatenation process. Note the reviewer's comment
however. Simple use of a cell array followed by
concatenation works better by far. Clearly there was
too much overhead in my code compared to the utter
simplicity of a cell array. Note that the code below
also works quite well, as long as n is not too large.


function a = testappend3(n)
numtrial = 4;
avetime = 0;
for j = 1 : numtrial
  tic
  a = {};
  for j = 1 : n
    a{j} = j;
  end
  a = cat(1,a{:});
  avetime = avetime + toc;
end
avetime = avetime / numtrial

 
Try testappend3(4000) - it is fairly efficient. Now try
testappend3(40000). I assume that concatenation of 40K
cells is the problem here.

As far as the block allocation scheme, the problem here
is handling millions of append operations. You need to
watch out for your array growing so large enough that
it begins to eat up a serious amount of ram. Once you
start to move into virtual memory, all bets are off.
Every time that you append a slew of zeros, the block
append approach needs to re-allocate the entire array.
In all seriousness, this can happen. It did to me, which
is why, for a while, I dropped the block append of zeros
in favor of the approach that grow_array uses.) Grow_array
also has its problems, I'd like to revisit it in order
to find a code that is optimal for any problem size.

John


-- 
The best material model of a cat is another, or
preferably the same, cat.
A. Rosenblueth, Philosophy of Science, 1945