How can I extract words with separated letters from a text?
Nadjate
on 8 Apr 2014
Commented: Azzi Abdelmalek
on 9 Apr 2014
I have some ill-structured files like this:
file1: o n l i n e U N I V E R S I T Y Obtain Diploma.
file2: Howdy There C E R T I F I E D U N I V E R S I T Y D I P L O M A S and d e g r e sWouldn't it be great to get a Masters degree.
I would like to extract only the words whose letters are separated by spaces. In this case the result should be:
f1: 'o n l i n e' 'U N I V E R S I T Y'
f2: 'C E R T I F I E D U N I V E R S I T Y D I P L O M A S' 'd e g r e s'
I tried regular expressions, but they didn't work for me.
Thanks for any help.
2 Comments
Image Analyst
on 8 Apr 2014
Edited: Image Analyst
on 8 Apr 2014
For f2, why is the final "s" to be extracted as part of 'd e g r e s', and not considered as part of sWouldn't? I would think that it should not be included because it's attached to sWouldn't which is a word without spaces and thus, not to be extracted. Please clarify.
Accepted Answer
Azzi Abdelmalek
on 8 Apr 2014
Edited: Azzi Abdelmalek
on 8 Apr 2014
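% Sample text; the leading space lets the lookbehind match at the first word.
file1=' o n l i n e U N I V E R S I T Y Obtain Diploma.'
% Match runs of "one word character + one whitespace" that start right after whitespace.
f1=regexp(file1,'(?<=\s)(\w\s)+','match')
% Discard 2-character matches, i.e. ordinary one-letter words such as 'a '.
f1(cellfun(@numel,f1)==2)=[]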
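To see why the length-2 filter is needed, here is a small sketch using the same pattern. The string txt and the variable names hits and isSingle are made up for illustration and do not come from the question.

```matlab
% Hypothetical sample: one spaced-out word plus the ordinary one-letter word "a".
txt = ' get a real d e g r e e now';

% (?<=\s) : lookbehind, the match must start right after a whitespace character
% (\w\s)+ : one or more pairs of "one word character followed by one whitespace"
hits = regexp(txt, '(?<=\s)(\w\s)+', 'match');

% The one-letter word "a" is also matched, as the 2-character string 'a '.
% Any match of exactly 2 characters is therefore treated as an ordinary word.
isSingle = cellfun(@numel, hits) == 2;
hits(isSingle) = [];

celldisp(hits)
```

Only the spaced-out word should remain, and it still carries its trailing space.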
2 Comments
Azzi Abdelmalek
on 9 Apr 2014
This also works for file2:
file1=' Howdy There C E R T I F I E D U N I V E R S I T Y D I P L O M A S and d e g r e sWouldn''t it be great to get a Masters degree.'
f1=regexp(file1,'(?<=\s)(\w\s)+','match');
f1(cellfun(@numel,f1)==2)=[];
celldisp(f1)
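Each match produced by this pattern ends with the whitespace that follows its last letter. If that is unwanted, one possible follow-up (an addition, not part of the answer above) is to run strtrim over the cell array. The variable names txt and runs are chosen here for illustration.

```matlab
% Same text and pattern as above, stored under illustrative variable names.
txt = ' Howdy There C E R T I F I E D U N I V E R S I T Y D I P L O M A S and d e g r e sWouldn''t it be great to get a Masters degree.';
runs = regexp(txt, '(?<=\s)(\w\s)+', 'match');

% Drop 2-character matches such as 'a ' (ordinary one-letter words).
runs(cellfun(@numel, runs) == 2) = [];

% strtrim accepts a cell array of character vectors and strips
% leading and trailing whitespace from every element.
runs = strtrim(runs);

celldisp(runs)
```

With the text exactly as written here, the "s" glued to "sWouldn't" is not captured, so the second result is 'd e g r e' rather than 'd e g r e s'. This is the ambiguity raised in the comment under the question.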