How to parse information between two strings using regular expressions?

Hello,
I am trying to parse some information contained between the two strings "<sample>" and "</sample>". I am new to regular expressions and would like to know which expression suits my requirement. The strings I mentioned contain some operators, which makes the job difficult.
Regards, Math
  2 Comments
Guillaume on 1 Dec 2014
If you need more help than Thorsten's answer (which pretty much tells you everything that there is to it), then show us your current regular expression.
Math on 2 Dec 2014
Edited: Math on 2 Dec 2014
str = '<sample>a,b,c</sample>'
I want to extract a,b,c from the str above into another string (say extract = a,b,c).
I wrote the following pattern:
pat = '(<sample>)||(<\/sample>)'
However, extract = regexp(str,pat,'match') yields " '<sample></sample>' " and not a,b,c.
help regexp says
| Match subexpression before or after the |
How do I get a,b,c into the string variable extract? Please help.


Accepted Answer

Andrei Bobrov on 2 Dec 2014
Edited: Andrei Bobrov on 2 Dec 2014
str = '<sample>a,b,c</sample>';
out = regexp(str,'((?<=<sample>).*(?=<\/sample>))','match')
or
t = regexp(str,'<(|\/)sample>','split')
out = t(~cellfun(@isempty,t))
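
A possible variation on the lookaround approach above, in case the string holds more than one tag pair: capture the enclosed text as a token and use the lazy quantifier .*? so that each match stops at the nearest closing tag. This is only a sketch; the two-pair input string and the variable names greedy, tok and extract are assumptions made for illustration and are not part of the original question.

```matlab
% Hypothetical input with two <sample> ... </sample> pairs.
str = '<sample>a,b,c</sample><sample>d,e</sample>';

% Greedy lookaround version: .* runs up to the LAST closing tag,
% so both pairs end up in a single match.
greedy = regexp(str, '(?<=<sample>).*(?=</sample>)', 'match');

% Token version: .*? is lazy, so each match ends at the nearest closing tag.
tok = regexp(str, '<sample>(.*?)</sample>', 'tokens');

% Each token comes back wrapped in its own 1x1 cell; unwrap into a flat cell array.
extract = cellfun(@(c) c{1}, tok, 'UniformOutput', false);
```

Here extract should hold 'a,b,c' and 'd,e' as separate elements, while greedy holds one string that still contains the inner tags. For a single tag pair, as in the question, both versions give the same text.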

More Answers (2)

Thorsten on 1 Dec 2014
Edited: Thorsten on 1 Dec 2014
help regexp
There it says
Characters that are not special metacharacters are all treated literally in
a match. To match a character that is a special metacharacter, escape that
character with a '\'.
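
To illustrate the escaping rule quoted above, here is a small sketch that assumes delimiters which really do contain metacharacters. The tags '[data]' and '[/data]' are hypothetical and were chosen only because square brackets are special in a regular expression. regexptranslate with the 'escape' option inserts the backslashes so they do not have to be typed by hand.

```matlab
% Hypothetical delimiters that contain regex metacharacters ([ and ]).
openTag  = '[data]';
closeTag = '[/data]';
str = '[data]a,b,c[/data]';

% Escape the delimiters so the brackets are matched literally,
% then capture the text between them with a lazy token.
pat = [regexptranslate('escape', openTag) '(.*?)' regexptranslate('escape', closeTag)];

% With 'once', the tokens of the first match come back as a cell array of strings.
tok = regexp(str, pat, 'tokens', 'once');
extract = tok{1};
```

Writing the pattern by hand as '\[data\](.*?)\[/data\]' should behave the same way. Without the backslashes, [data] would be read as a character class matching a single d, a or t.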

Niels on 2 Dec 2014
Alternatively, you may also consider using regexprep instead of regexp.
>> extract = regexprep(str,pat,'')
extract =
a,b,c
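
A self-contained sketch of the same regexprep idea, using a simplified pattern of my own ('</?sample>', where /? makes the slash optional) rather than the pat from the question. The second input string is hypothetical and is there to show a caveat: regexprep only deletes the tags, so any text outside them is kept.

```matlab
str = '<sample>a,b,c</sample>';

% Delete both the opening and the closing tag; /? makes the slash optional.
extract = regexprep(str, '</?sample>', '');

% Caveat: text outside the tags survives, because only the tags are removed.
withExtra = regexprep('x <sample>a,b,c</sample> y', '</?sample>', '');
```

For the string in the question this leaves a,b,c in extract. If the surrounding text has to be discarded as well, a regexp call with 'match' or 'tokens' is probably the better fit.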
