Using regexp in a repetitive manner

Hi,
Given the following string:
[Some text]
type="line" coords="461,461,461,487,461,487,461,489,462,490,462,492,463,493,464,494,465,495,467,495,468,496,470,496,470,496,801,496,801,496,803,496"
[Some text]
type="line" coords="461,487,461,489,462,490,462,492,463,493,496"
How can we extract the numbers using MATLAB's regexp function? Note that the number of values in each coords string varies, so a "universal" expression is needed here.
Thanks in advance,

Accepted Answer

Guillaume on 4 Dec 2015
Edited: Walter Roberson on 4 Dec 2015
If you want the numbers from all the lines as a single vector, and there are no unwanted numbers in any other line, then it's simply:
str = sprintf('%s\n%s\n%s\n%s\n', ...
'[Some text]', ...
'type="line" coords="461,461,461,487,461,487,461,489,462,490,462,492,463,493,464,494,465,495,467,495,468,496,470,496,470,496,801,496,801,496,803,496"', ...
'bla bla bla', ...
'type="line" coords="461,487,461,489,462,490,462,492,463,493,496"');
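% '\d+' matches every run of digits; str2double converts the matched strings to a numeric row vector
numbers = str2double(regexp(str, '\d+', 'match'))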
If you want something more complicated (such as a cell array of vectors, one per line, or just the numbers after coords="), then please clarify.
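To illustrate the assumption about unwanted numbers, here is a minimal sketch on a shortened, made-up input (the strings `demo` and `demo2` are hypothetical, not from the question). Since `\d+` matches digit runs anywhere in the text, digits in the free-text lines are picked up as well.

```matlab
% Shortened, hypothetical input: free text followed by two coords lines.
demo = sprintf('%s\n%s\n%s\n', ...
    '[Some text]', ...
    'type="line" coords="461,487,462"', ...
    'type="line" coords="10,20"');
% 'match' returns each maximal run of digits as a string in a 1xN cell array.
numbers = str2double(regexp(demo, '\d+', 'match'));      % 461 487 462 10 20

% Caveat: digits in any other line are extracted too.
demo2 = sprintf('%s\n%s\n', 'page 3 of 7', 'type="line" coords="461,487"');
withExtras = str2double(regexp(demo2, '\d+', 'match'));  % 3 7 461 487
```

If the surrounding text can contain digits, restricting the match to the coords="..." segments is the safer choice.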
  4 Comments
Guillaume on 4 Dec 2015
Edited: Guillaume on 4 Dec 2015
If only the numbers enclosed within the coords quotes are wanted, then you need at least two regexp calls:
coordsegments = regexp(str, '(?<=coords=")[^"]*', 'match');
coords = regexp(coordsegments', '\d+', 'match')
coordvalues = cellfun(@str2double, coords, 'UniformOutput', false)
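As a sketch of what these calls produce, assuming a shortened, made-up input (`demo` and `allvalues` are illustrative names): the lookbehind `(?<=coords=")` starts each match right after the opening quote, `[^"]*` runs up to the closing quote, and the second regexp splits every segment into digit strings.

```matlab
% Hypothetical input: one line with stray digits, two coords lines.
demo = sprintf('%s\n%s\n%s\n', ...
    'page 3 of 7', ...
    'type="line" coords="461,487,462"', ...
    'type="line" coords="10,20"');
% Step 1: keep only the text between coords=" and the next quote.
coordsegments = regexp(demo, '(?<=coords=")[^"]*', 'match');  % {'461,487,462', '10,20'}
% Step 2: split each segment into digit strings (cell array of cell arrays).
coords = regexp(coordsegments', '\d+', 'match');
% Step 3: convert each inner cell array to a numeric row vector.
coordvalues = cellfun(@str2double, coords, 'UniformOutput', false);
% Optional: flatten everything into one row vector.
allvalues = [coordvalues{:}];
```

Here the 3 and 7 from the first line are ignored because they sit outside a coords="..." segment.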
Ive J on 4 Dec 2015
Yes, that is exactly what I was looking for.
Deeply appreciate your time.


More Answers (1)

Thorsten on 4 Dec 2015
Edited: Thorsten on 4 Dec 2015
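% Strip the fixed prefix, then read comma-separated integers; sscanf stops at the closing quote.
% (textline is used instead of line to avoid shadowing MATLAB's built-in line function.)
textline = 'type="line" coords="461,461,461,487,461,487,461,489,462,490,462,492,463,493,464,494,465,495,467,495,468,496,470,496,470,496,801,496,801,496,803,496"';
data = sscanf(strrep(textline, 'type="line" coords="', ''), '%d,');
textline = 'type="line" coords="461,487,461,489,462,490,462,492,463,493,496"';
data = sscanf(strrep(textline, 'type="line" coords="', ''), '%d,');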
  3 Comments
Thorsten on 4 Dec 2015
Edited: Thorsten on 4 Dec 2015
Dear Ive,
I changed the code. If you can process the text line by line, you can extract the data this way. It works for any number of coords and ignores all lines that do not start with 'type="line" coords="'.
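A minimal sketch of the line-by-line idea, assuming the whole text is available as one newline-separated string (the names `demo`, `prefix`, `textlines` and `keep` are hypothetical, and the explicit prefix test with strncmp is an addition, not part of the code above):

```matlab
prefix = 'type="line" coords="';
% Shortened, hypothetical input mixing free text and coords lines.
demo = sprintf('%s\n%s\n%s\n%s', ...
    '[Some text]', ...
    'type="line" coords="461,487,462"', ...
    'bla bla bla', ...
    'type="line" coords="10,20"');
textlines = regexp(demo, '\n', 'split');          % one cell per line
keep = strncmp(textlines, prefix, numel(prefix)); % true where a line starts with the prefix
data = cell(1, nnz(keep));
k = 0;
for i = find(keep)
    k = k + 1;
    % Drop the prefix; sscanf reads integers until it hits the closing quote.
    data{k} = sscanf(textlines{i}(numel(prefix)+1:end), '%d,')';
end
```

Each cell of `data` then holds the coordinates of one matching line as a row vector.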
Ive J on 4 Dec 2015
Edited: Ive J on 4 Dec 2015
Unfortunately, it is not possible to distinguish among lines. Nevertheless, I deeply appreciate your time and help.
