Path: news.mathworks.com!newsfeed-00.mathworks.com!newscon06.news.prodigy.com!prodigy.net!news.maxwell.syr.edu!news-rtr.nyroc.rr.com!news-out.nyroc.rr.com!twister.nyroc.rr.com.POSTED!53ab2750!not-for-mail From: John D'Errico <woodchips@rochester.rr.com> Newsgroups: comp.soft-sys.matlab Mail-Copies-To: nobody Subject: Re: growing array in Matlab? Organization: I'm not really very organized References: <ddvkd9$7ar$1@ruby.cit.cornell.edu> User-Agent: MT-NewsWatcher/3.4 (PPC Mac OS X) Message-ID: <woodchips-EBB412.12532017082005@syrcnyrdrs-02-ge0.nyroc.rr.com> Lines: 128 Date: Wed, 17 Aug 2005 16:53:20 GMT NNTP-Posting-Host: 66.66.154.144 X-Complaints-To: abuse@rr.com X-Trace: twister.nyroc.rr.com 1124297600 66.66.154.144 (Wed, 17 Aug 2005 12:53:20 EDT) NNTP-Posting-Date: Wed, 17 Aug 2005 12:53:20 EDT Xref: news.mathworks.com comp.soft-sys.matlab:296450 In article <ddvkd9$7ar$1@ruby.cit.cornell.edu>, "Stephen Vavasis" <vavasis@cs.cornell.edu> wrote: > A well-known issue with Matlab is that it performs badly if you repeatedly > add elements to the end of an array like this: > > a = []; > while <condition> > a = [a; sin(x)]; > end > > (run time is quadratic -- see the timings at the end of this email). If > you know in advance how large a should be, then you can simply declare it > before the loop: a = zeros(total_a_size,1). But if you don't know the size, > you can use the following technique to implement arrays that grow > exponentially: (snip) > function a = testappend2(n) > numtrial = 4; > avetime = 0; > for j = 1 : numtrial > tic > a = []; > a_size = 0; > for j = 1 : n > a_size = j; > if length(a) < a_size > a = [a;zeros(max(a_size-length(a),floor(length(a)*.4))+4,1)]; > a(j) = sin(j); > end > end > a = a(1:a_size); > avetime = avetime + toc; > end > avetime = avetime / numtrial I will note that testappend2 has a couple of serious bugs in it. You should never use the same loop variable in nested loops! You would see the problem by observing that the two functions (testappend and testappend2) have remarkably different outputs. This might be a better test code: function a = testappend2(n) numtrial = 4; avetime = 0; for i = 1 : numtrial tic a = []; a_size = 0; for j = 1 : n a_size = j; if length(a) < a_size a = [a;zeros(max(a_size-length(a),floor(length(a)*.4))+4,1)]; end a(j) = sin(j); end a = a(1:a_size); avetime = avetime + toc; end avetime = avetime / numtrial The timings that you provide are not valid due to those bugs. You might look at the function grow_array. I wrote it some time ago in an attempt to speed up just this process, allowing the user to append new chunks more efficiently. I had also used a block allocation scheme much like yours for a while, but since some of my codes had to append many hundreds of thousands of times I found it also inefficient. http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId= 7490&objectType=file I used cell arrays in an attempt to speed up the concatenation process. Note the reviewer's comment however. Simple use of a cell array followed by concatenation works better by far. Clearly there was too much overhead in my code compared to the utter simplicity of a cell array. Note that the code below also works quite well, as long as n is not too large. function a = testappend3(n) numtrial = 4; avetime = 0; for j = 1 : numtrial tic a = {}; for j = 1 : n a{j} = j; end a = cat(1,a{:}); avetime = avetime + toc; end avetime = avetime / numtrial Try testappend3(4000) - it is fairly efficient. Now try testappend3(40000). I assume that concatenation of 40K cells is the problem here. As far as the block allocation scheme, the problem here is handling millions of append operations. You need to watch out for your array growing so large enough that it begins to eat up a serious amount of ram. Once you start to move into virtual memory, all bets are off. Every time that you append a slew of zeros, the block append approach needs to re-allocate the entire array. In all seriousness, this can happen. It did to me, which is why, for a while, I dropped the block append of zeros in favor of the approach that grow_array uses.) Grow_array also has its problems, I'd like to revisit it in order to find a code that is optimal for any problem size. John -- The best material model of a cat is another, or preferably the same, cat. A. Rosenblueth, Philosophy of Science, 1945