MATLAB Answers

most frequent word in cell array

Hi, I have a cell array "P" of size 2000 by 20. Each cell value is either "Yes" or "No". How can I make a new cell array "vote" of size 2000 by 1 in which each cell contains the most frequent word in the corresponding row of P?

Accepted Answer

Walter Roberson on 25 Oct 2017
tf = ismember(lower(P), 'yes');   % true where the entry is "yes", ignoring case
votes = sum(tf, 2);               % number of "yes" entries in each row
  4 Comments
Walter Roberson on 26 Oct 2017
Right, but I had overlooked that the question asked about the most common entry, which can be found by comparing each row's count against half the number of columns (width/2).
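A minimal sketch of that width/2 test, for readers who want the full path from P to vote. The small P below is made-up sample data, isYes is an illustrative variable name, and sending ties to 'No' is an assumption that the thread does not settle.

```matlab
% Made-up sample data standing in for the 2000-by-20 cell array P
P = {'Yes' 'No' 'yes'; 'No' 'no' 'Yes'; 'No' 'Yes' 'Yes'};

tf = ismember(lower(P), 'yes');     % true where the entry is "yes", ignoring case
votes = sum(tf, 2);                 % number of "yes" entries in each row
isYes = votes > size(P, 2) / 2;     % "yes" wins only with a strict majority

% Assumption: rows without a strict "yes" majority (ties included) become 'No'
vote = repmat({'No'}, size(P, 1), 1);
vote(isYes) = {'Yes'};
```

With an odd number of columns a tie cannot occur; with 20 columns, as in the question, a 10-10 split is possible and would need its own rule.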


More Answers (2)

dpb on 25 Oct 2017
Edited: dpb on 25 Oct 2017
Good place to use categorical variables instead of the cellstr...
Example:
>> yn={'yes' 'no' 'Yes';'no' 'No', 'NO'}; % minimal dataset including capitalization differences
>> ync=categorical(lower(yn)); % convert to categorical and normalize spelling
>> cnts=countcats(ync,2) % count responses on 2nd dimension
cnts =
1 2
3 0
>> vote=cnts(:,2)>cnts(:,1); % see which is greater (Y>N --> True)
>> vote=categorical(vote,[true false],{'Yes','No'}) % convert to categorical to display
vote =
Yes
No
>> yn % original table to compare -- looks like right choice.
yn =
'yes' 'no' 'Yes'
'no' 'No' 'NO'
>>
NB: The above has no logic to check for a tie. If ties are possible, you will also need to test for == and add a third category, TIE, as a possible output.
ADDENDUM
If a tie is possible, compute the difference between the two counts; the SIGN function then generates the tri-state variable needed.
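One way the SIGN idea might look is sketched below. The sample data (including the deliberate tie in the second row) and the variable name d are made up for illustration. Passing an explicit category list to categorical is an added assumption, not part of the example above; it keeps the count columns in [no yes] order even when one answer never appears.

```matlab
% Made-up sample data; the second row is a deliberate 2-2 tie
yn = {'yes' 'no' 'Yes' 'yes'; 'no' 'Yes' 'NO' 'yes'; 'no' 'No' 'NO' 'yes'};

% Fix the category order so countcats always returns columns [no yes]
ync = categorical(lower(yn), {'no', 'yes'});
cnts = countcats(ync, 2);              % per-row counts: column 1 = no, column 2 = yes

d = sign(cnts(:,2) - cnts(:,1));       % +1 if yes leads, 0 if tied, -1 if no leads
vote = categorical(d, [1 0 -1], {'Yes', 'TIE', 'No'});
```

If a cell array is needed, as in the original question, cellstr(vote) converts the result.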

Sarah Palfreyman on 30 Apr 2018
Try tokenizing with Text Analytics Toolbox and you can easily get a histogram count.
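A rough sketch of how that could go, assuming Text Analytics Toolbox is installed. The sample P and the names rowText, docs and bag are illustrative. Each row is treated as one document, and on a tie max simply returns the first vocabulary entry, so ties would need separate handling.

```matlab
% Requires Text Analytics Toolbox; made-up sample data
P = {'Yes' 'No' 'yes'; 'No' 'no' 'Yes'};

rowText = join(string(lower(P)), " ", 2);   % one space-separated string per row
docs = tokenizedDocument(rowText);          % one tokenized document per row
bag = bagOfWords(docs);                     % word counts per document

% Index of the most frequent vocabulary word in each row
[~, idx] = max(full(bag.Counts), [], 2);
vote = cellstr(reshape(bag.Vocabulary(idx), [], 1));
```

The words come back lower-cased here because the input was normalized with lower before tokenizing.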
