How to get numbers from string using match regular expression?

Hi! I really need your help. I have a list of txt filenames in an xls file. All filenames start with digits, but the rest of each name varies greatly. I basically want to load this xls file into MATLAB, import the filenames, go to another directory and search for *.txt files based on the filenames from my xls.
I import xls file using:
[num,txt,raw] = xlsread(filename);
The 'txt' output contains the *.txt filenames I'm looking for. This is how my xls file looks:
1 A/B
2 A/B
...
10 A/B
...
100 A/B
101 A/B + 103 [01/01/13] + 106 [02/01/13]
110 A/B
111 A/B + 113 [01/01/13]
115 A/B
122 A/B
I need to import all those numbers: 1, 2, 10, 111, 101, 103, 106, 110, 111, 113... and skip all the comments, dates, etc. I need those numbers to import the files 1.txt, 2.txt, 111.txt, 113.txt, etc. into MATLAB in the next step. How can I do this in MATLAB? I've tried to find the answer but cannot find a regular expression that fits this purpose. Filenames start with 1 to 4 digits, some characters can always be skipped ('A/B', '+'), and the date format is always '[...]', so I thought there might be a regular expression for this purpose :) I hope it's not too confusing. Maybe there is another way to do it?
Thanks for your help.

Accepted Answer

Daniel Shub on 26 Feb 2013
I think
regexp(x, '(^|\s)(\d*)(\s)', 'tokens')
will extract the numbers you want. It will return a cell array in which each element is a 1x3 cell array; the numbers you want are the second element ...
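A minimal sketch of how the tokens from this pattern might be unpacked, assuming x holds one line from the spreadsheet. The sample string comes from the question; the variable names tok, digits and nums are illustrative and not part of the original answer. Because \d* can also match zero digits, the sketch drops empty tokens before converting.

```matlab
% Sample line taken from the question; x is an illustrative variable name.
x = '101 A/B + 103 [01/01/13] + 106 [02/01/13]';

% Each match yields a 1x3 cell: {leading space or '', digits, trailing space}.
tok = regexp(x, '(^|\s)(\d*)(\s)', 'tokens');

% Keep only the second token (the digits) of every match.
digits = cellfun(@(t) t{2}, tok, 'UniformOutput', false);

% \d* may match zero digits, so drop empty strings before converting.
digits = digits(~cellfun(@isempty, digits));

% Convert the digit strings to a numeric row vector.
nums = str2double(digits);
```

Note that the pattern requires whitespace after each number, so a number at the very end of a line would not be captured by it.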

More Answers (1)

Azzi Abdelmalek on 26 Feb 2013
Edited: Azzi Abdelmalek on 26 Feb 2013
a=cellfun(@(x) regexp(x,'\d\d?\d?\d?','match','once'),txt,'un',0);
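For context, the 'once' option makes regexp return only the first match in each string. The sketch below, using a sample line from the question and illustrative variable names, compares it with the same pattern without 'once'. It suggests that simply dropping 'once' is not enough here, since the digits inside the bracketed dates would then be matched as well.

```matlab
% Sample line taken from the question; variable names are illustrative.
x = '101 A/B + 103 [01/01/13] + 106 [02/01/13]';

% With 'once': only the first run of 1 to 4 digits, returned as a char vector.
firstMatch = regexp(x, '\d\d?\d?\d?', 'match', 'once');

% Without 'once': every run of 1 to 4 digits, including the date parts
% inside the square brackets, returned as a cell array of char vectors.
allMatches = regexp(x, '\d\d?\d?\d?', 'match');
```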
  2 Comments
Witold on 26 Feb 2013
Thanks for your help. It works fine with digits at the beginning, but in the case of:
101 A/B + 103 [01/01/13] + 106 [02/01/13]
103 and 106 are missing :/
I think that one possible option is to:
- find and delete string from '[' to ']'
- find and delete all non-numeric characters, like A B C D...
- delete all +,-,? etc.
- get numbers
BTW, I want the output to be a vector of numbers like:
[1,2,3,...,101,103,104]
not:
1
2
..
101 103
104
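The steps listed in this comment could be sketched roughly as follows. This is only one possible reading of them: the sample txt cell array is a hypothetical stand-in for the xlsread output, and instead of deleting letters and symbols one class at a time, the sketch extracts the remaining digit runs directly after removing the bracketed dates.

```matlab
% Hypothetical sample standing in for the txt column returned by xlsread.
txt = {'1 A/B'; ...
       '101 A/B + 103 [01/01/13] + 106 [02/01/13]'; ...
       '111 A/B + 113 [01/01/13]'};

% Step 1: delete everything from '[' to the nearest ']' (lazy match).
noDates = regexprep(txt, '\[.*?\]', '');

% Steps 2-4: pull out every remaining run of digits, which skips
% letters, '+', '/' and spaces without deleting them explicitly.
found = regexp(noDates, '\d+', 'match');   % one cell of matches per row

% Concatenate the per-row matches and convert to one numeric row vector.
nums = str2double([found{:}]);
```

For this sample, nums is a single row vector rather than one entry per spreadsheet row, which is the shape asked for above.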
Azzi Abdelmalek on 26 Feb 2013
a=cellfun(@(x) regexp(x,'(\[.*?\])|\D','split'),txt,'un',0)
out=cellfun(@(x) x(~cellfun(@isempty,x)),a,'un',0)
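The result out is a cell array with one cell of digit strings per row. A possible way to turn it into the flat numeric vector requested above, and then into the file names mentioned in the question, is sketched here. The sample txt and the names nums and files are assumptions for illustration; the option name is written out in full as 'UniformOutput'.

```matlab
% Hypothetical sample standing in for the txt output of xlsread.
txt = {'100 A/B'; '101 A/B + 103 [01/01/13] + 106 [02/01/13]'};

% The two lines from the comment above, with the long-form option name.
a   = cellfun(@(x) regexp(x, '(\[.*?\])|\D', 'split'), txt, 'UniformOutput', false);
out = cellfun(@(x) x(~cellfun(@isempty, x)), a, 'UniformOutput', false);

% Flatten the per-row cells into one row of strings, then convert to numbers.
nums = str2double([out{:}]);

% Build the file names described in the question (1.txt, 2.txt, ...).
files = arrayfun(@(n) sprintf('%d.txt', n), nums, 'UniformOutput', false);
```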
