Problem 43670. Words Count: A String Array Approach
Given an input character vector consisting of words, punctuation marks, white spaces, and possibly newline characters (\n), arrange the unique words alphabetically in a string array and calculate the histogram count of every unique word.
Assumptions:
- Case is ignored, e.g., WORDS and words are treated as the same word, and the output string array may contain either the uppercase or the lowercase form.
- Punctuation marks are limited to comma (,), period (.), colon (:), semicolon (;), question mark (?), and exclamation mark (!).
For example, given the input txt as a character vector,
txt = 'I love MATLAB and Cody, but I don''t like trivial matlab problems on cody.';
the outputs should be
words = string({'and';'but';'Cody';'don''t';'I';'like';'love';'MATLAB';...
                'on';'problems';'trivial'});
count = [1; 1; 2; 1; 2; 1; 1; 2; 1; 1; 1];
Hint: The R2016b documentation provides a good example of text data analysis via the string array approach. However, some steps illustrated in that example are unnecessary, and we can indeed accomplish the same task in a simpler way.
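One possible way to follow the hint is sketched below. It is not the reference solution, only a minimal illustration under stated assumptions: the function name wordsCountSketch is hypothetical, newlines in the input are assumed to be real newline characters rather than the two characters \ and n, and the output is returned in lowercase, which the assumptions above allow. It relies only on lower, replace, split, strlength, unique, and accumarray.

```matlab
function [words, count] = wordsCountSketch(txt)
% wordsCountSketch  Hypothetical helper name, not part of the problem statement.
% Returns the unique words in txt (lowercased, sorted) and how often each occurs.
% Assumes newlines in txt are real newline characters, e.g. produced by sprintf.

    s = lower(string(txt));                  % fold case so WORDS and words match

    % Turn the six punctuation marks listed in the problem into spaces.
    punct = string({',', '.', ':', ';', '?', '!'});
    s = replace(s, punct, ' ');

    % split with no delimiter argument breaks the text at whitespace,
    % which covers both spaces and newline characters.
    w = split(s);
    w(strlength(w) == 0) = [];               % drop empty pieces, if any

    % unique returns the words sorted; idx maps each word to its unique entry.
    [words, ~, idx] = unique(w);
    count = accumarray(idx, 1);              % occurrences per unique word
end
```

Called on the example txt above, this sketch should return the eleven words in lowercase together with the count vector shown in the expected output. The apostrophe in don't survives because it is not among the listed punctuation marks.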
Related problems in this series:
- Words Count: A Cell Array Approach
- Words Count: A String Array Approach
Problem Comments
The whole set of 2016b string challenges is just wonderful! Even today, it still looks advanced. Thank you for the thoughts and the work.
Thank you for your interest. Cody is already powered by the newest R2018a. It might be a good time to start an R2018a challenge group :)