MATLAB Answers


How to preallocate tables using the table-function???

Asked by Zwithouta on 17 Apr 2018
Latest activity Commented on by Philip Borghesani on 8 Nov 2018
Hi guys,
According to the MATLAB documentation for the table function, it should be possible to preallocate tables using the following syntax:
T = table('Size',sz,'VariableTypes',varTypes)
Does anyone know which version of MATLAB is required to use this syntax?
When I try to use it, in either MATLAB R2015a or R2016b, I get the following error message:
t = table('Size', [2 2], 'VariableTypes', {'double', 'string'})
Error using table (line 281)
Invalid parameter name: Size.
Caused by:
You may have intended to create a table with one row from one or more variables that are character strings. Consider using cell arrays of strings rather
than character arrays. Alternatively, create a cell array with one row, and convert that to a table using CELL2TABLE.


1 Answer

Answer by Philip Borghesani on 17 Apr 2018
 Accepted Answer

That syntax requires MATLAB R2018a. In most situations there is no need to pre-allocate a table. Just construct it from efficiently created columns of data. If we knew more about where your data was coming from we might suggest alternative allocation/creation methods.
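
For releases before R2018a, a rough equivalent of the 'Size'/'VariableTypes' call can be put together from placeholder columns. The following is only a sketch under assumptions: it uses one double column and one text column stored as a cell array of character vectors (the string type itself only arrived in R2016b), and the variable names x and label are made up for illustration.

```matlab
nRows = 2;  % illustrative size, matching the [2 2] request in the question

if verLessThan('matlab', '9.4')   % 9.4 corresponds to R2018a
    % Older releases: build placeholder columns, then wrap them in a table.
    x     = zeros(nRows, 1);          % numeric column filled with zeros
    label = repmat({''}, nRows, 1);   % text column: cell array of empty char vectors
    t = table(x, label);
else
    % R2018a and later: the documented preallocation syntax.
    t = table('Size', [nRows 2], ...
              'VariableTypes', {'double', 'cellstr'}, ...
              'VariableNames', {'x', 'label'});
end

disp(size(t))   % rows and columns of the resulting table
```

Either branch should give a table with the same shape and column types, so code further down does not need to know which release it is running on.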

Pre-allocation of complex MATLAB structures and objects causes much confusion and is often not needed. Pre-allocate simple things (like numeric matrices or the top level of a cell array structure). Compound objects contain any number of individual contiguous simple data structures and do not need pre-allocation in general.
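
As a sketch of what pre-allocating only the "simple things" might look like: a numeric matrix and the top level of a cell array are sized once, filled in a loop, and wrapped into a table at the very end. The names, sizes and values here are invented for illustration.

```matlab
n = 1000;                    % illustrative row count

vals  = zeros(n, 2);         % numeric matrix: preallocate the full block
names = cell(n, 1);          % cell array: preallocate only the top level

for k = 1:n
    vals(k, 1) = k;                      % fill numeric data in place
    vals(k, 2) = k^2;
    names{k}   = sprintf('item%d', k);   % each cell gets its own char vector
end

% Build the table once, after the columns are complete.
t = table(names, vals(:, 1), vals(:, 2), ...
          'VariableNames', {'name', 'k', 'kSquared'});
```

The contents of each cell are not preallocated; only the container that holds them is.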

Read this answer discussing object pre-allocation for way too much (most of it good and correct) information on pre-allocation.

Although you may think your third option is more readable, it is quite slow. I fixed the code (and made the size a bit smaller). Here are my suggestions and times:
% Readable but needs three separate variables
sz = 10000;
tic
[a, b, c] = deal(zeros(sz,1));
for i = 1:sz
    a(i,1) = randi(10);
    b(i,1) = randi(10);
    c(i,1) = randi(10);
end
t = table(a, b, c);
toc

% Less readable but needs only one variable
tic
z = zeros(sz,3);
for i = 1:sz
    z(i,1) = randi(10);
    z(i,2) = randi(10);
    z(i,3) = randi(10);
end
t = table(z(:,1), z(:,2), z(:,3), 'VariableNames', {'a', 'b', 'c'});
toc

% Readable and needs only one variable (preallocated table, R2018a or later)
tic
t = table('Size', [sz,3], 'VariableTypes', ...
    {'double', 'double', 'double'}, 'VariableNames', {'a', 'b', 'c'});
for i = 1:sz
    t.a(i,1) = randi(10);
    t.b(i,1) = randi(10);
    t.c(i,1) = randi(10);
end
toc

% Best code: build the table directly from vectorized columns
tic
t = table(randi(10,sz,1), randi(10,sz,1), randi(10,sz,1), 'VariableNames', {'a', 'b', 'c'});
toc

% Or: create the columns first, then the table
tic
a = randi(10,sz,1);
b = randi(10,sz,1);
c = randi(10,sz,1);
t = table(a, b, c);
toc

% Or even: start from an empty table and add columns
tic
t = table;
t.a = randi(10,sz,1);
t.b = randi(10,sz,1);
t.c = randi(10,sz,1);
toc
Note that the 'preallocated' version is much slower than all the other options. I prefer the last coding style.
>> ttable
Trial>> ttable
Elapsed time is 0.023312 seconds.
Elapsed time is 0.018329 seconds.
Elapsed time is 4.152630 seconds.
Elapsed time is 0.001452 seconds.
Elapsed time is 0.002160 seconds.
Elapsed time is 0.002522 seconds.
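
Single tic/toc readings like the ones above can be noisy, especially on a first run. One way to get steadier numbers is timeit, which calls a function handle several times and returns a typical run time. The sketch below compares two vectorized constructions; the array2table variant is an extra that does not appear in the code above, and the handle names are made up.

```matlab
sz = 10000;
varNames = {'a', 'b', 'c'};

% Wrap each construction in a function handle so timeit can call it repeatedly.
fColumns = @() table(randi(10, sz, 1), randi(10, sz, 1), randi(10, sz, 1), ...
                     'VariableNames', varNames);
fMatrix  = @() array2table(randi(10, sz, 3), 'VariableNames', varNames);

tColumns = timeit(fColumns);   % seconds, typical over several runs
tMatrix  = timeit(fMatrix);

fprintf('table from three columns: %.6f s\n', tColumns);
fprintf('array2table from matrix:  %.6f s\n', tMatrix);
```

Timing the loop-based versions the same way would need them wrapped in a function, since an anonymous function can hold only a single expression.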
Philip, I don't think the preallocation is the time hog in your example above. Here's what I tried with R2018b:
sz=10000;
fprintf('\n 1. preallocate table: \n')
tic
t1 = table('Size', [sz,3], 'VariableTypes', ...
{'double', 'double', 'double'}, ...
'VariableNames', {'a1', 'b1', 'c1'});
toc
fprintf('then loop through assignments... call randi %d times: \n', 3*sz)
tic
for i = 1:sz
    t1.a1(i,1) = randi(10);
    t1.b1(i,1) = randi(10);
    t1.c1(i,1) = randi(10);
end
toc
fprintf('\n 2. preallocate table: \n')
tic
t2 = table('Size', [sz,3], 'VariableTypes', ...
{'double', 'double', 'double'},...
'VariableNames', {'a2', 'b2', 'c2'});
toc
fprintf('then do assignments without looping, calling randi 3 times: \n')
tic
t2.a2=randi(10,sz,1);
t2.b2=randi(10,sz,1);
t2.c2=randi(10,sz,1);
toc
fprintf('size(t1): %d\n',size(t1))
fprintf('size(t2): %d\n',size(t2))
And my results:
>> clear
>> test
1. preallocate table:
Elapsed time is 0.001445 seconds.
then loop through assignments... call randi 30000 times:
Elapsed time is 2.877081 seconds.
2. preallocate table:
Elapsed time is 0.001075 seconds.
then do assignments without looping, calling randi 3 times:
Elapsed time is 0.000760 seconds.
size(t1): 10000
size(t1): 3
size(t2): 10000
size(t2): 3
>>
I produced different tables to avoid reusing variable names, then showed size to ensure they're equivalent.
You are correct that the preallocation is not particularly slow; however, it is wasted effort in the second example. The assignments completely replace the initially preallocated memory, voiding the preallocation. There is no advantage to preallocating over just assigning the variables.
fprintf('\n 3. No preallocation of table : \n')
tic
t3 = table;
toc
fprintf('then do assignments without looping, calling randi 3 times: \n')
tic
t3.a2=randi(10,sz,1);
t3.b2=randi(10,sz,1);
t3.c2=randi(10,sz,1);
toc
2. preallocate table:
Elapsed time is 0.002449 seconds.
then do assignments without looping, calling randi 3 times:
Elapsed time is 0.002245 seconds.
3. No preallocation of table :
Elapsed time is 0.000350 seconds.
then do assignments without looping, calling randi 3 times:
Elapsed time is 0.002473 seconds.
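
One quick way to check the point that whole-column assignment replaces the preallocated variable rather than filling it: preallocate the columns as double, then assign int8 data in both styles and look at the resulting classes. This is only a sketch; it assumes R2018a or later for the 'Size' syntax, and the variable names whole and indexed are made up.

```matlab
n = 5;
t = table('Size', [n 2], ...
          'VariableTypes', {'double', 'double'}, ...
          'VariableNames', {'whole', 'indexed'});

newData = int8((1:n)');      % deliberately a different class than the preallocation

t.whole = newData;           % whole-column assignment: the old column is discarded
t.indexed(1:n) = newData;    % indexed assignment: values are written into the existing column

fprintf('class of t.whole:   %s\n', class(t.whole));
fprintf('class of t.indexed: %s\n', class(t.indexed));
```

If the first class comes back as int8 while the second stays double, the preallocated double column was thrown away in the first case and reused in the second.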
