"John D'Errico" <woodchips@rochester.rr.com> wrote in message
news:fui16l$2cr$1@fred.mathworks.com...
> Szabolcs <szhorvat@gmail.com> wrote in message <b2447fe0eb204ecd
> bf76fbc6024fdf7c@f24g2000prh.googlegroups.com>...
>
>> Hi John,
>>
>> Thanks for the reply, but unfortunately I am not sure I understand it
>> completely.
>>
>> This is my first encounter with cell arrays. If I understand it
>> correctly, the only difference between normal arrays and cell arrays
>> is that the latter can hold inhomogeneous data types. So using
>> something like
>>
>> a = {}
>> for k=1:10
>> a = [a 1];
>> end
>>
>> is still slow (and the editor still warns about the need for
>> reallocation).
>>
>> Using
>>
>> a = {}
>> for k=1:10
>> a = {a 1};
>> end
>>
>> appears to build a linked list (so this should be fast), but I could
>> not figure out how to flatten it in the end (cell2mat does not work on
>> this data structure). Is there a builtin way to flatten this data
>> structure, or did I misunderstand this completely?
>>
>> I wrote a function to flatten lists (attached at the end of the
>> message), and benchmarked the following:
>>
>> tic
>> a = {};
>> for k=1:10000, a = {a, k}; end
>> toc
>> flatten_cell(a);
>> toc
>>
>> tic
>> a = [];
>> for k=1:10000, a = [a k]; end
>> toc
>>
>> It appears that the time taken by the second approach grows roughly as
>> n^2 where n is the length of the list, while the first one is linear
>> (as expected). But the difference in absolute timings is small (i.e. n
>> has to be large before the first approach is really faster), so this
>> does not seem like a good solution.
>>
>> Also, as I said I am a newbie, so any comments on my code are most
>> welcome (what is it that I'm doing right, where do I take an awkward
>> approach, and what is completely wrong, etc.)
>>
>> Szabolcs
>>
>> function a = flatten_cell(c)
>>
>> cc = c; % I'm not sure whether this triggers a deep copy ...
>> count = 0;
>> while ~isempty(cc)
>> % I'm not sure about the following either,
>> % but if it really did a deep copy, then the complexity
>> % would not be linear ... So most probably it doesn't do it.
>> cc = cc{1};
>> count = count+1;
>> end
>>
>> cc = c;
>> a = zeros(1,count);
>> for k = count:-1:1 % step must be -1 to walk from count down to 1
>> a(k) = cc{2};
>> cc = cc{1};
>> end
>>
>> end
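As an aside on the flatten question above: there doesn't appear to be a single builtin that flattens this nested {previous, value} structure directly, but you can collect the values into a flat cell array and then let cell2mat do the conversion. A rough sketch (variable names here are illustrative, not from the posts above):

```matlab
% Build the same nested {previous, value} list, then flatten it by
% walking the pairs and collecting each value into a flat cell array.
% The values come out newest-first, so flip at the end.
nested = {};
for k = 1:5
    nested = {nested, k};
end
vals = {};
while ~isempty(nested)
    vals{end+1} = nested{2};  % grows by one cell pointer per step
    nested = nested{1};
end
a = fliplr(cell2mat(vals));   % a is now [1 2 3 4 5]
```

Note that growing vals with end+1 still reallocates the cell's pointer array, so this is tidier rather than fundamentally faster than the flatten_cell approach.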
>
> The basic cell trick is simple. I'll set it up so that I
> don't know in advance how many elements I'll end up
> with; it should be roughly 15000 elements.
>
> % simple concatenated growth (quadratic)
> tic
> C = [];
> for i = 1:10000
> C = [C;rand(round(rand(1))+1,2)];
> end
> toc
> Elapsed time is 9.992871 seconds.
Another solution to this type of problem, where you don't know in advance how
long your array will be, is to preallocate to an upper bound on the number of
elements and trim the excess at the end. If you don't overestimate by too
much, this can be pretty quick.
clear all
% simple concatenated growth (quadratic)
rand('twister', 5409);
tic
C1 = [];
for i = 1:10000
a = rand(round(rand(1))+1,2);
C1 = [C1; a];
end
toc
% "Overestimate" preallocation
rand('twister', 5409);
tic
% Since each a below can have at most 2 rows, and there are 10000 of them,
% we know that the combination of all of them can have no more than 20000 rows.
C2 = zeros(20000, 2);
counter = 0;
for k = 1:10000
a = rand(round(rand)+1, 2);
% Assign a to certain rows of C2
C2(counter + (1:size(a, 1)), :) = a;
counter = counter + size(a, 1);
end
C2 = C2(1:counter, :);
toc
isequal(C1, C2)
> % cell growth
> tic
> C = {};
> for i = 1:10000
> C{i,1} = rand(round(rand(1))+1,2);
> end
> C = cell2mat(C);
> toc
> Elapsed time is 2.167827 seconds.
The "overestimate" preallocation was quicker than both simple concatenated
growth and cell growth in R2008a on my Windows XP machine. Of course, to use
this approach you need some knowledge of how your function behaves in order
to know how much to preallocate ... but ideally you'll know how you want your
function to behave, since you're writing it.
*snip*

Steve Lord
slord@mathworks.com
