Extract numbers between two underscores.

Hi;
I am new to MATLAB and I need to extract numbers between two underscores. The only way I think this can be done is using the 'regexp' function. However, I am not sure about the expression that I need to match it against. Can anyone help me?
Thanks;
Examples of my strings are:
Color_84_2014-01-31-16-49-31-702.jpg
Color_85_2014-01-31-16-49-31-732.jpg
Color_86_2014-01-31-16-49-31-762.jpg
Color_87_2014-01-31-16-49-31-792.jpg
So I just need the numbers 84, 85, 86, 87, ...
Thanks Again!

1 Comment

Do you have all these strings in, e.g., a file and want to extract all the numbers in one shot, or are you processing these strings one at a time?

Answers (3)

>> s='Color_84_2014-01-31-16-49-31-702.jpg';
>> sscanf(s,'Color_%d_%d-%d-%d-%d-%d-%d-%d.jpg')
ans =
84
2014
1
31
16
49
31
702
>>
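Since the sscanf call above returns every numeric field in the file name, the number between the underscores is simply the first element of the result. A minimal sketch of picking it out, assuming every name follows the same Color_N_timestamp.jpg layout (the variable names vals, num and stamp are only for illustration):

```matlab
s = 'Color_84_2014-01-31-16-49-31-702.jpg';

% Read all eight integer fields; the result is an 8-by-1 column vector.
vals = sscanf(s, 'Color_%d_%d-%d-%d-%d-%d-%d-%d.jpg');

num   = vals(1);      % the number between the two underscores
stamp = vals(2:end);  % year, month, day, hour, minute, second, millisecond
```

This keeps the timestamp fields available in case they are needed later; if they are not, a shorter format that stops after the first %d is enough.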
You could just use the strfind function to find the underscores, and then extract the data in between those two values:
str = 'Color_84_2014-01-31-16-49-31-702.jpg';
idcs = strfind(str,'_');
num = str2num(str(idcs(1)+1:idcs(2)-1));
A regexp alternative probably exists. Try the above and see what happens!

3 Comments

num = str2double( regexp( str, '(?<=_)\d+(?=_)', 'match' )) ;
or
tokens = regexp( str, '_(\d+)_', 'tokens' ) ;
num = str2double( cat( 1, tokens{:} )) ;
Neat - I wondered about that!
Cedric on 21 Jun 2014
Edited: Cedric on 21 Jun 2014
The pattern in the first approach matches:
  • One or more digits: '\d+', where \d is the character class for digits and + is the quantifier "one or more".
  • Preceded by an underscore: '(?<=_)', where (?<=..) is a positive "lookbehind" for ...
  • Followed by an underscore: '(?=_)', where (?=..) is a positive "lookahead" for ...
Note that the text tested by the "lookaround" operators is not included in the output.
The second approach is based on token extraction. The pattern says: an underscore, followed by one or more digits, followed by an underscore. On its own that would be the pattern '_\d+_', but we group the \d+ in a token (defined by the framing parentheses) and ask REGEXP to return just the token (with the third argument, 'tokens', in the call).
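Both patterns described above can also be applied to a whole list of file names at once, because REGEXP accepts a cell array of strings. The following is a rough sketch, assuming the names are held in a column cell array (called C here purely for illustration) and that each name contains exactly one underscore-delimited number; the 'once' option asks for the first match only.

```matlab
% Hypothetical list of file names, taken from the question.
C = {'Color_84_2014-01-31-16-49-31-702.jpg'
     'Color_85_2014-01-31-16-49-31-732.jpg'
     'Color_86_2014-01-31-16-49-31-762.jpg'
     'Color_87_2014-01-31-16-49-31-792.jpg'};

% Lookaround approach: with 'once', each cell holds the matched digits.
m    = regexp(C, '(?<=_)\d+(?=_)', 'match', 'once');
nums = str2double(m);                    % 4-by-1 numeric array

% Token approach: each cell of tok holds a 1-by-1 cell with the token.
tok   = regexp(C, '_(\d+)_', 'tokens', 'once');
nums2 = str2double(vertcat(tok{:}));     % 4-by-1 numeric array
```

The two results should agree for names of this form. If a name has no match, the lookaround version yields NaN for that entry, which can be a convenient way to spot unexpected file names.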

str = 'Color_84_2014-01-31-16-49-31-702.jpg'
num = sscanf(str,'Color_%d')
If your strings are stored in a cell array of strings C, use cellfun:
num = cellfun(@(str) sscanf(str,'Color_%d') , C)
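For completeness, here is a small self-contained version of the cellfun call, with a sample C built from the file names in the question. It assumes every name starts with 'Color_' followed by digits; the second call (with the illustrative name numCell) is one possible way to tolerate names that do not, since sscanf returns an empty result for those and cellfun would otherwise raise an error.

```matlab
% Hypothetical cell array of file names, taken from the question.
C = {'Color_84_2014-01-31-16-49-31-702.jpg', ...
     'Color_85_2014-01-31-16-49-31-732.jpg', ...
     'Color_86_2014-01-31-16-49-31-762.jpg', ...
     'Color_87_2014-01-31-16-49-31-792.jpg'};

% Each sscanf call returns one scalar, so cellfun can build a numeric row.
num = cellfun(@(str) sscanf(str, 'Color_%d'), C);

% Safer variant: keep results in a cell so empty outputs are allowed.
numCell = cellfun(@(str) sscanf(str, 'Color_%d'), C, 'UniformOutput', false);
```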

1 Comment

>> sscanf(s,'Color_%d')
ans =
84
Gotta have enough formats to convert. Also note that despite the title there are dashes as well as underscores in the string.

Asked: on 20 Jun 2014
Edited: on 21 Jun 2014
