Sort table based on number of occurrences

I have a table that I'd like to sort based on groups of a few of the columns. My table is made up of cells.
>> T=table({'A','B','C','B','B'}',{1,2,3,4,5}',{'Car','Van','Car','Car','Car'}')
T =
5×3 table
Var1 Var2 Var3
_____ _____ _______
{'A'} {[1]} {'Car'}
{'B'} {[2]} {'Van'}
{'C'} {[3]} {'Car'}
{'B'} {[4]} {'Car'}
{'B'} {[5]} {'Car'}
I'd like to sort by the number of occurrences of Var1 and Var3, starting with Var1. Since 'B' occurs 3 times, and 'Car' occurs twice with 'B', the result would be:
Var1 Var2 Var3
_____ _____ _______
{'B'} {[4]} {'Car'}
{'B'} {[5]} {'Car'}
{'B'} {[2]} {'Van'}
{'A'} {[1]} {'Car'}
{'C'} {[3]} {'Car'}
Is there something like accumarray for strings? I tried groupsummary, but it combines rows, and Var2 is not always unique.

 Accepted Answer

This has been unanswered for some time. Maybe this is still helpful.
The main problem here is the cell arrays of (repeated) text and numeric values. Don't use those unless you have to, and you probably don't.
>> T = table(categorical({'A','B','C','B','B'}'), ...
[1,2,3,4,5]', ...
categorical({'Car','Van','Car','Car','Car'}'));
>> cats1 = categories(T.Var1);
>> [~,order1] = sort(countcats(T.Var1),'descend');
>> T.Var1 = reordercats(T.Var1,cats1(order1));
>> cats3 = categories(T.Var3);
>> [~,order3] = sort(countcats(T.Var3),'descend');
>> T.Var3 = reordercats(T.Var3,cats3(order3));
>> sortrows(T,["Var1" "Var3"])
ans =
5×3 table
Var1 Var2 Var3
____ ____ ____
B 4 Car
B 5 Car
B 2 Van
A 1 Car
C 3 Car
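
Note that the code above orders Var3 by its overall counts, which happens to give the desired result for this data. If the Var3 order should instead follow the counts within each Var1 group, one possible sketch is to attach per-row counts with findgroups and accumarray and sort on those. The helper variables n1 and n13 are names made up for this illustration, and breaking ties on Var1 and Var3 is an assumption about the desired behavior:

```matlab
T = table(categorical({'A','B','C','B','B'}'), ...
          [1,2,3,4,5]', ...
          categorical({'Car','Van','Car','Car','Car'}'));

% Per-row count of how often each Var1 value occurs
g1   = findgroups(T.Var1);
n1   = accumarray(g1, 1);
T.n1 = n1(g1);                      % hypothetical helper column

% Per-row count of how often each (Var1, Var3) pair occurs
g13   = findgroups(T.Var1, T.Var3);
n13   = accumarray(g13, 1);
T.n13 = n13(g13);                   % hypothetical helper column

% Most frequent Var1 first, then most frequent Var3 within each Var1.
% Var1 and Var3 are included as tie-breakers so that groups with equal
% counts stay together instead of interleaving.
S = sortrows(T, {'n1','Var1','n13','Var3'}, ...
             {'descend','ascend','descend','ascend'});
S = removevars(S, {'n1','n13'});    % drop the helper columns again
```

For the example table this should reproduce the row order shown above.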

3 Comments

Thanks! Always helpful! My data does come in that format and I cannot control it directly (SQL server is the source). Turning the strings into categorical variables does seem like it would solve a lot of my problems with this data and beyond.
Sorry to dig this up. The table created from my SQL pull has either 'cell' or 'double' as the datatypes. Is there a way to batch convert only the 'cell' types to 'categorical'?
Would there be a reason not to do this? Some of the cells do contain numbers- but those are for identification purposes (like a serial number) and will never be used as 'numbers'.
Thanks.
I figured this out...
T=convertvars(T,@iscell,'categorical');
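
One caveat, assuming the columns look like the table in the question: @iscell also selects cell columns that hold numbers (like Var2), while categorical expects text when given a cell array. A slightly more selective sketch uses @iscellstr so that only text columns are converted. The cell2mat step assumes Var2 holds exactly one numeric scalar per cell:

```matlab
T = table({'A','B','C','B','B'}', {1,2,3,4,5}', ...
          {'Car','Van','Car','Car','Car'}');

% Convert only the cell columns that contain text (Var1 and Var3 here)
T = convertvars(T, @iscellstr, 'categorical');

% Assumption: each cell of Var2 holds one numeric scalar,
% so it can be flattened into an ordinary double column
T.Var2 = cell2mat(T.Var2);
```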
