Can you have a multilevel table?

I have a large table that has several groups of similar variables. I'd like to create levels of variables so I can group and access them easily- perhaps I need a struct or perhaps I am being lazy.
load patients
T1 = table(LastName,Gender,Age,Height,Weight,Systolic,Diastolic);
head(T1,3)
ans = 3×7 table
LastName Gender Age Height Weight Systolic Diastolic ____________ __________ ___ ______ ______ ________ _________ {'Smith' } {'Male' } 38 71 176 124 93 {'Johnson' } {'Male' } 43 69 163 109 77 {'Williams'} {'Female'} 38 64 131 125 83
I can merge variables, but I seem to lose the nested variable names.
T2 = mergevars(T1,{'Systolic','Diastolic'},'NewVariableName','BloodPressure');
head(T2,3)
ans = 3×6 table
LastName Gender Age Height Weight BloodPressure ____________ __________ ___ ______ ______ _____________ {'Smith' } {'Male' } 38 71 176 124 93 {'Johnson' } {'Male' } 43 69 163 109 77 {'Williams'} {'Female'} 38 64 131 125 83
I'd like to be able to access the Systolic and Diastolic variable under BloodPressure like this:
T2.BloodPressure.Systolic
%but not this
T2.BloodPressure(:,1)
It's a lot harder for me to keep track of the index, and I am hoping I would be able to use Tab Completion.
Also, in the event there is more than one 'Blood Pressure', say I have BloodPressure1 and BloodPressure2 both with sub variables Systolic and DIastolic, I'd like to be able to get all the Systolics at once- something like
T2.{:}.Systolic
Hope that explains what I'm after, and hoping there's a way to get there. Thanks.

 Accepted Answer

You can use 'MergeAsTable',true in mergevars:
load patients
T1 = table(LastName,Gender,Age,Height,Weight,Systolic,Diastolic);
head(T1,3)
ans = 3×7 table
LastName Gender Age Height Weight Systolic Diastolic ____________ __________ ___ ______ ______ ________ _________ {'Smith' } {'Male' } 38 71 176 124 93 {'Johnson' } {'Male' } 43 69 163 109 77 {'Williams'} {'Female'} 38 64 131 125 83
T2 = mergevars(T1,{'Systolic','Diastolic'}, ...
'NewVariableName','BloodPressure', ...
'MergeAsTable',true);
head(T2,3)
ans = 3×6 table
LastName Gender Age Height Weight BloodPressure Systolic Diastolic ____________ __________ ___ ______ ______ _____________________ {'Smith' } {'Male' } 38 71 176 124 93 {'Johnson' } {'Male' } 43 69 163 109 77 {'Williams'} {'Female'} 38 64 131 125 83
T2.BloodPressure
ans = 100×2 table
Systolic Diastolic ________ _________ 124 93 109 77 125 83 117 75 122 80 121 70 130 88 115 82 115 78 118 86 114 77 115 68 127 74 130 95 114 79 130 92
T2.BloodPressure.Systolic
ans = 100×1
124 109 125 117 122 121 130 115 115 118

4 Comments

Thanks! Is there a way to grab all lower level variables that match? Like this:
load patients
T1 = table(LastName,Gender,Age,Height,Weight,Systolic,Diastolic);
T2 = head(mergevars(T1,{'Systolic','Diastolic'}, ...
'NewVariableName','BloodPressure', ...
'MergeAsTable',true),3)
T2 = 3×6 table
LastName Gender Age Height Weight BloodPressure Systolic Diastolic ____________ __________ ___ ______ ______ _____________________ {'Smith' } {'Male' } 38 71 176 124 93 {'Johnson' } {'Male' } 43 69 163 109 77 {'Williams'} {'Female'} 38 64 131 125 83
T3 = renamevars(T2,{'Height','Weight'},{'Systolic','Diastolic'})
T3 = 3×6 table
LastName Gender Age Systolic Diastolic BloodPressure Systolic Diastolic ____________ __________ ___ ________ _________ _____________________ {'Smith' } {'Male' } 38 71 176 124 93 {'Johnson' } {'Male' } 43 69 163 109 77 {'Williams'} {'Female'} 38 64 131 125 83
T3=mergevars(T3,{'Systolic','Diastolic'}, ...
'NewVariableName','BloodPressure2', ...
'MergeAsTable',true)
T3 = 3×5 table
LastName Gender Age BloodPressure2 BloodPressure Systolic Diastolic Systolic Diastolic ____________ __________ ___ _____________________ _____________________ {'Smith' } {'Male' } 38 71 176 124 93 {'Johnson' } {'Male' } 43 69 163 109 77 {'Williams'} {'Female'} 38 64 131 125 83
% I'd like to be able to access all Systolic, but can't figure it out:
T3.{'BloodPressure2','BloodPressure'}.Systolic
T3.{:}.Systolic
load patients
T1 = table(LastName,Gender,Age,Height,Weight,Systolic,Diastolic);
T2 = head(mergevars(T1,{'Systolic','Diastolic'}, ...
'NewVariableName','BloodPressure', ...
'MergeAsTable',true),3);
T3 = renamevars(T2,{'Height','Weight'},{'Systolic','Diastolic'});
T3=mergevars(T3,{'Systolic','Diastolic'}, ...
'NewVariableName','BloodPressure2', ...
'MergeAsTable',true)
T3 = 3×5 table
LastName Gender Age BloodPressure2 BloodPressure Systolic Diastolic Systolic Diastolic ____________ __________ ___ _____________________ _____________________ {'Smith' } {'Male' } 38 71 176 124 93 {'Johnson' } {'Male' } 43 69 163 109 77 {'Williams'} {'Female'} 38 64 131 125 83
% I'd like to be able to access all Systolic:
[T3.BloodPressure2.Systolic T3.BloodPressure.Systolic]
ans = 3×2
71 124 69 109 64 125
Thanks again. Is there any way to wildcard the 'BloodPressure' variable so that any and all Systolic subvariables are found?
Starting in R2022a you can use patterns to index into tables (as well as in a number of table methods that accept table variable indices).
load patients
T1 = table(LastName,Gender,Age,Height,Weight,Systolic,Diastolic);
T2 = head(mergevars(T1,{'Systolic','Diastolic'}, ...
'NewVariableName','BloodPressure', ...
'MergeAsTable',true),3);
T3 = renamevars(T2,{'Height','Weight'},{'Systolic','Diastolic'});
T3 = mergevars(T3,{'Systolic','Diastolic'}, ...
'NewVariableName','BloodPressure2', ...
'MergeAsTable',true)
T3 = 3×5 table
LastName Gender Age BloodPressure2 BloodPressure Systolic Diastolic Systolic Diastolic ____________ __________ ___ _____________________ _____________________ {'Smith' } {'Male' } 38 71 176 124 93 {'Johnson' } {'Male' } 43 69 163 109 77 {'Williams'} {'Female'} 38 64 131 125 83
bp = T3(:, "BloodPressure"+wildcardPattern)
bp = 3×2 table
BloodPressure2 BloodPressure Systolic Diastolic Systolic Diastolic _____________________ _____________________ 71 176 124 93 69 163 109 77 64 131 125 83
bpSystolic = varfun(@(t) t(:, "Systolic"), bp)
bpSystolic = 3×2 table
Fun_BloodPressure2 Fun_BloodPressure Systolic Systolic __________________ _________________ 71 124 69 109 64 125
bpSystolic.Properties.VariableNames = bp.Properties.VariableNames
bpSystolic = 3×2 table
BloodPressure2 BloodPressure Systolic Systolic ______________ _____________ 71 124 69 109 64 125

Sign in to comment.

More Answers (1)

An alternative to @Voss's execellent answer is to create the nested table(s) and then add the nested table(s) as a table variable to the main table (MATLAB R2018b or later).
load patients
BloodPressure = table(Systolic,Diastolic);
T1 = table(LastName,Gender,Age,Height,Weight, BloodPressure)
T1 = 100×6 table
LastName Gender Age Height Weight BloodPressure Systolic Diastolic ____________ __________ ___ ______ ______ _____________________ {'Smith' } {'Male' } 38 71 176 124 93 {'Johnson' } {'Male' } 43 69 163 109 77 {'Williams'} {'Female'} 38 64 131 125 83 {'Jones' } {'Female'} 40 67 133 117 75 {'Brown' } {'Female'} 49 64 119 122 80 {'Davis' } {'Female'} 46 68 142 121 70 {'Miller' } {'Female'} 33 64 142 130 88 {'Wilson' } {'Male' } 40 68 180 115 82 {'Moore' } {'Male' } 28 68 183 115 78 {'Taylor' } {'Female'} 31 66 132 118 86 {'Anderson'} {'Female'} 45 68 128 114 77 {'Thomas' } {'Female'} 42 66 137 115 68 {'Jackson' } {'Male' } 25 71 174 127 74 {'White' } {'Male' } 39 72 202 130 95 {'Harris' } {'Female'} 36 65 129 114 79 {'Martin' } {'Male' } 48 71 181 130 92

3 Comments

I really like nested tables, but how can I run groupsummary on them? Without having the grouping functions supporting nested tables I don't see any advantages..
load patients
BloodPressure = table(Systolic,Diastolic);
T1 = table(LastName,Gender,Age,Height,Weight, BloodPressure)
T1 = 100×6 table
LastName Gender Age Height Weight BloodPressure Systolic Diastolic ____________ __________ ___ ______ ______ _____________________ {'Smith' } {'Male' } 38 71 176 124 93 {'Johnson' } {'Male' } 43 69 163 109 77 {'Williams'} {'Female'} 38 64 131 125 83 {'Jones' } {'Female'} 40 67 133 117 75 {'Brown' } {'Female'} 49 64 119 122 80 {'Davis' } {'Female'} 46 68 142 121 70 {'Miller' } {'Female'} 33 64 142 130 88 {'Wilson' } {'Male' } 40 68 180 115 82 {'Moore' } {'Male' } 28 68 183 115 78 {'Taylor' } {'Female'} 31 66 132 118 86 {'Anderson'} {'Female'} 45 68 128 114 77 {'Thomas' } {'Female'} 42 66 137 115 68 {'Jackson' } {'Male' } 25 71 174 127 74 {'White' } {'Male' } 39 72 202 130 95 {'Harris' } {'Female'} 36 65 129 114 79 {'Martin' } {'Male' } 48 71 181 130 92
groupsummary(T1,"Gender",@(x)mean(x.Systolic),"BloodPressure")
Error using matlab.internal.math.checkDataVariables
Invalid data variables.

Error in groupsummary (line 247)
dataVars = matlab.internal.math.checkDataVariables(T, varargin{indStart}, 'groupsummary', 'Data');
> I really like nested tables, but how can I run groupsummary on them?
@Jan Kappen, you can un-nest them to compute group summaries using splitvars.
Demo:
t3 = table(randi(2,5,1),rand(5,1),'VariableNames',{'a','b'});
t2 = table(rand(5,1),rand(5,1),t3,t3,'VariableNames',{'c','d','t3','t4'})
t2 = 5×4 table
c d t3 t4 a b a b _______ ________ _____________ _____________ 0.64143 0.087442 1 0.1543 1 0.1543 0.10796 0.35111 2 0.86203 2 0.86203 0.14392 0.91985 1 0.99177 1 0.99177 0.15729 0.39593 2 0.085138 2 0.085138 0.41819 0.1901 1 0.8832 1 0.8832
t1 = splitvars(t2)
t1 = 5×6 table
c d t3_a t3_b t4_a t4_b _______ ________ ____ ________ ____ ________ 0.64143 0.087442 1 0.1543 1 0.1543 0.10796 0.35111 2 0.86203 2 0.86203 0.14392 0.91985 1 0.99177 1 0.99177 0.15729 0.39593 2 0.085138 2 0.085138 0.41819 0.1901 1 0.8832 1 0.8832
groupsummary(t1,'t3_a', 'mean', 't3_b')
ans = 2×3 table
t3_a GroupCount mean_t3_b ____ __________ _________ 1 3 0.67642 2 2 0.47358
@Adam Danz that's a very nice information, thx.

Sign in to comment.

Categories

Tags

Asked:

on 25 May 2022

Commented:

on 7 Feb 2023

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!