An efficient way to create struct arrays
I want to create a struct array in which each struct has the fields "a", "b", and "c", which store different types of data. What efficient methods are there for assigning values from the individual variables "a", "b", and "c" to each struct element? Here are five methods I came up with, listed in order of decreasing efficiency. What do you think?
The goal is an array of 10,000 structs, each holding the corresponding elements of the variables a, b, and c.
num = 10000;
a = (1:num)';
b = string(a);
c = rand(3,3,num);
Here are the methods:
%% method1: preallocate with repmat, then fill in a loop
t1 = tic;
s = struct("a",[], ...
    "b",[], ...
    "c",[]);
s1 = repmat(s,num,1);
for i = 1:num
    s1(i).a = a(i);
    s1(i).b = b(i);
    s1(i).c = c(:,:,i);
end
t1 = toc(t1);
%% method2: loop backwards so the first assignment allocates the whole array
t2 = tic;
for i = num:-1:1
    s2(i).a = a(i);
    s2(i).b = b(i);
    s2(i).c = c(:,:,i);
end
t2 = toc(t2);
%% method3: loop forwards without preallocation (the array grows every iteration)
t3 = tic;
for i = 1:num
    s3(i).a = a(i);
    s3(i).b = b(i);
    s3(i).c = c(:,:,i);
end
t3 = toc(t3);
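%% method4: build a table, then convert it to a struct array
t4 = tic;
ct = permute(c,[3,1,2]); % num-by-3-by-3, so ct(i,:,:) holds c(:,:,i) without transposing it
t = table(a,b,ct,'VariableNames',["a","b","c"]); % name the third field "c" rather than "ct"
s4 = table2struct(t); % each s4(i).c is 1-by-3-by-3; squeeze(s4(i).c) gives the 3-by-3 matrix
t4 = toc(t4);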
%% method5: pass cell arrays to struct (no loop)
t5 = tic;
s5 = struct("a",num2cell(a), ...
    "b",num2cell(b), ...
    "c",squeeze(mat2cell(c,3,3,ones(num,1))));
t5 = toc(t5);
%% plot
bar([t1,t2,t3,t4,t5])
xtickformat('method %g')
ylabel("Time (s)")
yline(mean([t1,t2,t3,t4,t5]))
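Before comparing timings, it can help to confirm that two construction methods really produce the same struct array. The following is a minimal sketch on a small input, comparing a preallocated loop (as in method1) with the loop-free struct call (as in method5). The variable names sameSize and sameData are only illustrative.

```matlab
% Small input so the check runs instantly
num = 5;
a = (1:num)';
b = string(a);
c = rand(3,3,num);

% Preallocated loop, as in method1
s1 = repmat(struct("a",[],"b",[],"c",[]), num, 1);
for i = 1:num
    s1(i).a = a(i);
    s1(i).b = b(i);
    s1(i).c = c(:,:,i);
end

% Loop-free construction, as in method5
s5 = struct("a", num2cell(a), ...
    "b", num2cell(b), ...
    "c", squeeze(mat2cell(c,3,3,ones(num,1))));

% Compare shape and contents
sameSize = isequal(size(s1), size(s5));
sameData = isequal(s1, s5);
```

If either flag is false, the timings are comparing different results, so the field names and the size of each field are worth inspecting first.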
Note that it's usually a bad idea to do this. If your data start off as separate arrays, accessing them is usually more efficient if they stay in that form, though they can be bundled for convenience into a scalar struct:
num = 10000;
a = (1:num)';
b = string(a);
c = rand(3,3,num);
s.a=a;
s.b=b;
s.c=c;
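To illustrate the difference in access patterns, here is a rough sketch that reads one "record" from the scalar struct and runs a vectorized operation on a field, next to the equivalent on a struct array. The names k, sArr, total, and totalArr are only for illustration.

```matlab
num = 10000;
a = (1:num)';
b = string(a);
c = rand(3,3,num);

% Scalar struct: a single struct whose fields hold the whole arrays
s = struct();
s.a = a;
s.b = b;
s.c = c;

% Record k is read by indexing into each field
k = 42;
ak = s.a(k);
bk = s.b(k);
ck = s.c(:,:,k);

% Vectorized operations act directly on the field
total = sum(s.a);

% A struct array has to gather the values into one array first
sArr = struct("a", num2cell(a));
totalArr = sum([sArr.a]);
```

Both sums agree, but the struct array version has to concatenate 10,000 separate values before it can operate on them.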
num = 10000;
a = (1:num)';
b = string(a);
c = rand(3,3,num);
%% method0 (avoiding REPMAT)
t0 = tic;
d = cell(1,num);
s = struct("a",d, "b",d, "c",d);
for ii = 1:num
s(ii).a = a(ii);
s(ii).b = b(ii);
s(ii).c = c(:,:,ii);
end
t0 = toc(t0)
%% method6 (avoiding REPMAT and CELL)
% createArray requires R2024a or later
t6 = tic;
s6 = createArray(num,1,'struct'); % num-by-1 struct array with no fields yet
for i = 1:num
    s6(i).a = a(i);
    s6(i).b = b(i);
    s6(i).c = c(:,:,i);
end
t6 = toc(t6)
num = 10000;
a = (1:num).';
b = string(a);
c = rand(3,3,num);
%% method0 (avoiding REPMAT)
tic
d = cell(1,num);
s = struct("a",d, "b",d, "c",d);
for ii = 1:num
s(ii).a = a(ii);
s(ii).b = b(ii);
s(ii).c = c(:,:,ii);
end
toc
%% method1
tic
s = struct("a",[], "b",[], "c",[]);
s = repmat(s,num,1);
for ii = 1:num
s(ii).a = a(ii);
s(ii).b = b(ii);
s(ii).c = c(:,:,ii);
end
toc
That's really interesting, thank you - curious that loops should be faster than methods avoiding loops here. You can get more robust timings with the timeit() builtin, BTW.
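As a rough sketch of what that could look like, the script below wraps one loop-based method and one loop-free method in local functions and passes them to timeit as zero-argument function handles. The function names buildWithLoop and buildWithStruct are made up for this example, and the loop-free version uses num2cell(c,[1 2]) in place of mat2cell, which is an assumption of mine rather than one of the posted methods.

```matlab
num = 10000;
a = (1:num)';
b = string(a);
c = rand(3,3,num);

% timeit calls each handle several times and returns a typical run time in seconds
tLoop = timeit(@() buildWithLoop(a,b,c));
tVec = timeit(@() buildWithStruct(a,b,c));
fprintf("loop: %.4f s, struct(): %.4f s\n", tLoop, tVec);

function s = buildWithLoop(a,b,c)
    % Backward loop: the first assignment allocates the full array
    for i = numel(a):-1:1
        s(i).a = a(i);
        s(i).b = b(i);
        s(i).c = c(:,:,i);
    end
end

function s = buildWithStruct(a,b,c)
    % Loop-free: one cell per struct element for every field
    s = struct("a", num2cell(a), ...
        "b", num2cell(b), ...
        "c", squeeze(num2cell(c,[1 2])));
end
```

In releases before R2024a, local functions in a script must sit at the end of the file, as they do here.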
I tested it on my local computer. The results are slightly different and less stable from run to run, but the general trend is the same.
Environment: MATLAB Desktop R2024a, Windows 10
Elapsed times for Method 1 through Method 5:
Elapsed time is 0.056723 seconds.
Elapsed time is 0.044524 seconds.
Elapsed time is 0.056564 seconds.
Elapsed time is 0.119374 seconds.
Elapsed time is 0.171208 seconds.