# Sorting Data into a Table

1 view (last 30 days)
Yasmeen on 6 Oct 2022
Commented: dpb on 7 Oct 2022
Hello,
I have a dataset that consists of a single column with x amount of rows (only containing, 1, 2, 3s). I want to take this column and form a table and sort the 1s, 2s, and 3s into different columns. I know I will likely need if statements but not sure where to start. I want to form a table with 3 columns and the 1s sorted into one columns, 2s in another, and so on. Any help would be greatly appreciated!
dpb on 7 Oct 2022
Edited: dpb on 7 Oct 2022
No! You did NOT understand @Walter Roberson's Q? -- what you show is what you CANNOT do in a MATLAB table or ordinary array; MATLAB is NOT a spreadsheet and does not support "ragged" arrays (except by cell arrays which can hold anything in a given cell) and even there the cell array itself must be rectangular.
In one variable other than a cell array, you can only simulate the result by padding each column to the same length with NaN or some other indicator value.

dpb on 7 Oct 2022
Edited: dpb on 7 Oct 2022
% make up some data
N=randi(5,3,1);
x=[];for i=1:numel(N),x=[x;i*ones(N(i),1)];end
x=x(randperm(numel(x)))
x = 7×1
3 1 1 3 2 1 1
% the engine (cell array solution)
[u,~,ic]=unique(x);
for i=1:numel(u)
C{i}=x(ic==u(i));
end
C{:}
ans = 4×1
1 1 1 1
ans = 2
ans = 2×1
3 3
clear C
N=max(histc(ic,u)); % max number of any value
C=nan(N,numel(u)); % preallocate 2D array prefilled to NaN for missing
for i=1:numel(u)
ix=(ic==u(i));
C(1:sum(ix),i)=x(ix);
end
C
C = 4×3
1 2 3 1 NaN 3 1 NaN NaN 1 NaN NaN
% convert to table
tC=array2table(C,'variablenames',{'1:REM','2:NREM','3:WAKE'})
tC = 4×3 table
1:REM 2:NREM 3:WAKE _____ ______ ______ 1 2 3 1 NaN 3 1 NaN NaN 1 NaN NaN
You don't give enough information to know what is end result here, but one kinda' presumes there is going to be an analysis of these data by the above variable values. If so, probably the best way to store the data is in a table with the conditions as above as variables -- turning the 1:3 vector into a categorical variable with those names as the categories would likely be a good first step.
Then, analyses could follow with groupsummary or the other tools for handling grouping variables in tables. Look up the background information on processing data in tables; particularly the "split apply" workflow; it'll be eye opening of capabilities.
dpb on 7 Oct 2022
Carrying on from the above observation -- given the names associated with the various indices, one presumes the multiple instances of each are associated with different subjects and/or monitored sleep periods and there would be various other observations associated with each.
In that case it would make much more sense for analyses to do something more like
% make up some data
N=randi(5,3,1);
x=[];for i=1:numel(N),x=[x;i*ones(N(i),1)];end
N=numel(x);
x=x(randperm(N));
STAGES=categorical({'REM','NREM','WAKE'});
stage=STAGES(x).';
id=categorical("Subject"+randi(5,numel(x),1));
tStart=timeofday(datetime(0,0,0,randi([20 23],N,1),randi([0 59],N,1),0));
tEnd=tStart+minutes(randi([60 8*60],N,1));
tData=table(id,stage,tStart,tEnd,tEnd-tStart,'variablenames',{'Subject','Stage','Initiate','Terminate','Duration'})
tData = 9×5 table
Subject Stage Initiate Terminate Duration ________ _____ ________ _________ ________ Subject3 NREM 21:23:00 26:23:00 05:00:00 Subject3 WAKE 20:05:00 25:06:00 05:01:00 Subject1 WAKE 23:19:00 27:41:00 04:22:00 Subject3 REM 23:18:00 26:26:00 03:08:00 Subject1 NREM 21:09:00 25:35:00 04:26:00 Subject3 WAKE 23:19:00 29:04:00 05:45:00 Subject5 REM 20:15:00 22:24:00 02:09:00 Subject3 NREM 22:24:00 24:42:00 02:18:00 Subject2 NREM 22:36:00 30:24:00 07:48:00
There could, of course, be all kinds of other associated data as additiona fields...but, the beauty of such an arrangement is in being able to do things like
groupsummary(tData,{'Subject'})
ans = 4×2 table
Subject GroupCount ________ __________ Subject1 2 Subject2 1 Subject3 5 Subject5 1
groupsummary(tData,{'Subject','Stage'},@mean,{'Initiate','Terminate','Duration'})
ans = 7×6 table
Subject Stage GroupCount fun1_Initiate fun1_Terminate fun1_Duration ________ _____ __________ _____________ ______________ _____________ Subject1 NREM 1 21:09:00 25:35:00 04:26:00 Subject1 WAKE 1 23:19:00 27:41:00 04:22:00 Subject2 NREM 1 22:36:00 30:24:00 07:48:00 Subject3 NREM 2 21:53:30 25:32:30 03:39:00 Subject3 REM 1 23:18:00 26:26:00 03:08:00 Subject3 WAKE 2 21:42:00 27:05:00 05:23:00 Subject5 REM 1 20:15:00 22:24:00 02:09:00