Parfor loops indexing into table rows
Andrew McCauley
on 20 Jul 2022
Commented: Bruno Luong
on 20 Jul 2022
Typically, the most time-consuming part of my data analysis boils down to "do something to each row of a table", so it seemed ideal for a parfor loop (and it is), but I'm wondering whether there is a better way than the workaround I've been using.
Indexing is the problem. My usual approach to table indexing is table.columnname(row), but inside a parfor loop this leads to an error: "Error: Unable to classify the variable 'tableParfor' in the body of the parfor-loop. For more information, see Parallel for Loops in MATLAB, "Solve Variable Classification Issues in parfor-Loops"."
The same thing happens if I try table{row, columnname}, and as far as I can tell from the docs on tables, that leaves me out of options for normal indexing.
I assumed my usual approach failed because this page says that indexing of the form a.b(c) is not sliced:
Variable A on the left is not sliced; variable A on the right is sliced:
Not sliced: A.q(i,12)
Sliced: A(i,12).q
But the sliced form on the right is not valid indexing for tables, and I'm not really sure why table{row, column} doesn't work. I did find a workaround that works (making a temporary one-row table and always indexing into that), but it seems suboptimal. It still cuts down the run time of a lot of my scripts, but I still think there should be a better way.
If anyone can shed some light on this or improve the code, here is a simplified version of what my actual scripts generally do with parfor loops.
tableParfor = table('Size', [100 4], 'VariableTypes', {'double', 'double', 'double', 'double'}, 'VariableNames', {'first', 'second', 'third', 'final'});
for rows = 1:100
    for columns = 1:3
        tableParfor.(columns)(rows) = rand(1);
    end
end
a = 1.5;
b = 2.6;
c = 6.4;
% random broadcast variables
parfor cT = 1:height(tableParfor)
    % tableParfor.final(cT) = a*tableParfor.first(cT) + b*tableParfor.second(cT) + c*tableParfor.third(cT);
    % my usual syntax, this doesn't work with parfor
    % tableParfor{cT, 'final'} = a*tableParfor{cT, 'first'} + b*tableParfor{cT, 'second'} + c*tableParfor{cT, 'third'};
    % alternative syntax, this doesn't work with parfor
    % tableParfor(cT).final = a*tableParfor(cT).first + b*tableParfor(cT).second + c*tableParfor(cT).third;
    % my attempt to get something like what the docs recommend, but is invalid syntax for tables
    rowTable = tableParfor(cT, :);
    rowTable.final = a*rowTable.first + b*rowTable.second + c*rowTable.third;
    tableParfor(cT, :) = rowTable;
    % this workaround works, but adds two extra lines to the code and I think the extra creation of rowTable for each worker chews up memory
end
Accepted Answer
Edric Ellis
on 20 Jul 2022
There are a few things conspiring against you here. Firstly, parfor analysis doesn't understand how to "slice" table data using variable names, but you can use variable indices, i.e. tableParfor{cT,4} = ... is allowed.
Secondly, you're trying to use tableParfor as a "sliced input/output", which further constrains what you're allowed to do - in particular the "fixed form of indexing" constraint stops you accessing different variables of your sliced row directly.
Your workaround (extract a slice, operate, put it back) would be my first choice, despite its awkwardness. The following is almost certainly a worse option since it duplicates and then broadcasts the input data table, but it does work:
inTable = tableParfor;
parfor cT = 1:height(tableParfor)
    tableParfor{cT, 4} = a*inTable{cT, 'first'} + b*inTable{cT, 'second'} + c*inTable{cT, 'third'};
end
Note that in that example, inTable gets broadcast, and so all indexing restrictions are removed, and I can use the variable-name indexing.
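A possible middle ground, sketched here under assumptions rather than taken from the answer above: copy only the input columns into a plain numeric matrix before the loop, so the loop reads one row of that matrix per iteration instead of broadcasting a copy of the whole table. The name inData and the table construction are illustrative choices; the output still uses the variable-index form tableParfor{cT, 4} that the answer says is allowed.

```matlab
% Build a small example table (illustrative data, not the asker's real data).
n = 100;
tableParfor = table(rand(n,1), rand(n,1), rand(n,1), zeros(n,1), ...
    'VariableNames', {'first', 'second', 'third', 'final'});
a = 1.5;
b = 2.6;
c = 6.4;

% Assumption: the three inputs are numeric, so they fit in one n-by-3 matrix.
% Name-based indexing is fine here because it happens outside the parfor-loop.
inData = tableParfor{:, {'first', 'second', 'third'}};

parfor cT = 1:height(tableParfor)
    row = inData(cT, :);   % one row of the numeric matrix per iteration
    tableParfor{cT, 4} = a*row(1) + b*row(2) + c*row(3);   % variable index, not name
end
```

Because inData is only ever indexed as inData(cT, :), it should be classified as a sliced input, so each worker receives just the rows it needs.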
More Answers (1)
Bruno Luong
on 20 Jul 2022
Edited: Bruno Luong
on 20 Jul 2022
This works, but I'm not sure it is what you want.
IMO a table is not a well-suited data structure for doing calculations. A simple raw numerical array is.
EDIT: corrected typos
tableParfor = table('Size', [100 4], 'VariableTypes', {'double', 'double', 'double', 'double'}, 'VariableNames', {'first', 'second', 'third', 'final'});
for rows = 1:100
    for columns = 1:3
        tableParfor.(columns)(rows) = rand(1);
    end
end
a = 1.5;
b = 2.6;
c = 6.4;
% random broadcast variables
for cT = 1:height(tableParfor)
    rowTable = tableParfor{cT, :};
    rowTable(4) = a*rowTable(1) + b*rowTable(2) + c*rowTable(3);
    tableParfor(cT,:) = num2cell(rowTable);
    % this workaround works, but adds two extra lines to the code and I think the extra creation of rowTable for each worker chews up memory
end
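The point about raw numerical arrays can be taken one step further. The following is only a sketch under assumptions, not code from this thread: it pulls each column out of the table into an ordinary vector, runs the parfor-loop on those vectors alone, and writes the result back to the table after the loop. The names first, second, third and finalCol are illustrative.

```matlab
% Build a small example table (illustrative data).
n = 100;
tableParfor = table(rand(n,1), rand(n,1), rand(n,1), zeros(n,1), ...
    'VariableNames', {'first', 'second', 'third', 'final'});
a = 1.5;
b = 2.6;
c = 6.4;

% Pull the columns out as plain numeric vectors before the loop.
first  = tableParfor.first;
second = tableParfor.second;
third  = tableParfor.third;
finalCol = zeros(n, 1);   % preallocated sliced output

parfor cT = 1:n
    % Every array is indexed only by the loop variable, so each is sliced.
    finalCol(cT) = a*first(cT) + b*second(cT) + c*third(cT);
end

% Write the result back into the table once, outside the loop.
tableParfor.final = finalCol;
```

With this layout no table indexing happens inside the loop at all. For a formula this simple, the vectorized expression a*first + b*second + c*third gives the same result without a loop; the loop body here stands in for more expensive per-row work.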
Bruno Luong
on 20 Jul 2022
The table is a beast of an OOP class with all kinds of overloaded indexing. You are already lucky to be able to extract rows in parallel as sliced data.