Open Files by File Name Patterns

I am looking for the best syntax to recognize patterns in file names, so that I can open the desired files and filter out the unwanted ones.
For example,
I have files named
HI_1C_TTTT4_468.xlsx
HI_2C_TTTT9_456.xlsx
HI_8C_TTTT7_279_Plot.xlsx
HI_5678_5487.xlsx
and I only wish to open the first two files, which share a naming pattern that the last two do not follow. Any advice/examples?
Thank you!

Accepted Answer

Guillaume on 1 Jul 2016
Edited: Guillaume on 1 Jul 2016
Regular expressions seem like the perfect candidate. In your case:
s = {'HI_1C_TTTT4_468.xlsx';
'HI_2C_TTTT9_456.xlsx';
'HI_8C_TTTT7_279_Plot.xlsx';
'HI_5678_5487.xlsx'}
ismatch = ~cellfun(@isempty, regexp(s, '^HI_\dC_TTTT\d_\d{3}\.xlsx$', 'match', 'once'))
is one way to do it.
Guillaume on 1 Jul 2016
for filename = s(ismatch)'   % transpose since for iterates over columns
    filename = filename{1};  % extract string from cell
    % do something with filename
    % ...
end
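
As a rough sketch of what "do something with filename" could look like: the snippet below assumes the workbooks hold plain tabular data that readtable can read (importing embedded charts is not covered). The scratch folder, the x/y columns, and the sample contents are all made up for illustration, so that the sketch runs on its own.

```matlab
% Hypothetical scratch folder with small stand-in workbooks (illustration only).
folder = tempname;
mkdir(folder);
s = {'HI_1C_TTTT4_468.xlsx';
     'HI_2C_TTTT9_456.xlsx';
     'HI_8C_TTTT7_279_Plot.xlsx';
     'HI_5678_5487.xlsx'};
for k = 1:numel(s)
    T = table((1:3)', k*(1:3)', 'VariableNames', {'x', 'y'});  % made-up data
    writetable(T, fullfile(folder, s{k}));
end

% Keep only the names that follow the wanted pattern.
ismatch = ~cellfun(@isempty, regexp(s, '^HI_\dC_TTTT\d_\d{3}\.xlsx$', 'match', 'once'));
wanted = s(ismatch);

% Read each matching workbook into its own table.
data = cell(size(wanted));
for k = 1:numel(wanted)
    data{k} = readtable(fullfile(folder, wanted{k}));
end

rmdir(folder, 's');  % clean up the scratch folder
```

Here data{k} holds the contents of wanted{k}; with real files, folder would simply be the directory that contains them.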


More Answers (2)

Thorsten on 1 Jul 2016
Edited: Thorsten on 1 Jul 2016
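find(~cellfun(@isempty, regexp(names, 'HI_\dC_TTTT\d_\d{3}\.xlsx', 'start')))
Testing each cell for emptiness keeps the returned indices aligned with names even when some names do not match.
\d matches a digit, \d{3} matches exactly 3 digits, and \. matches a literal "." (the . needs to be escaped with \ because an unescaped . matches any character).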
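
To see why the escape matters, and what anchoring with ^ and $ adds, here is a small sketch. The second and third names are invented purely to show the difference and do not come from the question.

```matlab
% Made-up names: one good file, one with "X" where the dot should be,
% and one with extra text before and after the wanted name.
names = {'HI_1C_TTTT4_468.xlsx';
         'HI_1C_TTTT4_468Xxlsx';
         'old_HI_1C_TTTT4_468.xlsx.bak'};

% Escaped dot and anchors: the whole name must fit the pattern.
strict = ~cellfun(@isempty, regexp(names, '^HI_\dC_TTTT\d_\d{3}\.xlsx$', 'once'));

% Unescaped dot and no anchors: "." accepts any character and the
% pattern may match anywhere inside the name.
loose = ~cellfun(@isempty, regexp(names, 'HI_\dC_TTTT\d_\d{3}.xlsx', 'once'));
```

With these names, strict is true only for the first entry, while loose is true for all three.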
chlor thanks on 1 Jul 2016
Thank you so much!! Your explanation helps a lot! In this case, which syntax would then let me open these Excel files so that I can later import plots from them? Sorry for asking all these dumb questions, I really appreciate your guidance on this!



dpb on 1 Jul 2016
Unfortunately, this is not something the Matlab incarnation of the dir function can do on its own: TMW has implemented only the '*' wildcard matching portion, and the names do not differ in a way that '*' alone can use to tell the wanted files from the rest.
Your choices boil down to returning a list that partially matches what you want and then winnowing it down. The alternatives range from simply keeping the names of the proper length(name) (not very robust) to writing a regexp parsing expression that makes the match more specific, with any number of partial pattern matches in between.
Or, depending on the OS shell used, the OS itself may offer better pattern matching, so the desired list could be returned from a system call.
You can start in Matlab with
d=dir('HI*TTTT*.xlsx');
which narrows the list to the first three files to sift through, but that's about the best dir itself can do as a starting point.
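
A minimal sketch of that two-step approach, using dir for the coarse list and regexp to winnow it. The scratch folder and the empty stand-in files are assumptions added only so the snippet runs on its own; the regular expression is one possible choice, not the only one.

```matlab
% Hypothetical scratch folder with empty stand-in files (illustration only).
folder = tempname;
mkdir(folder);
samples = {'HI_1C_TTTT4_468.xlsx', 'HI_2C_TTTT9_456.xlsx', ...
           'HI_8C_TTTT7_279_Plot.xlsx', 'HI_5678_5487.xlsx'};
for k = 1:numel(samples)
    fid = fopen(fullfile(folder, samples{k}), 'w');
    fclose(fid);
end

% Step 1: coarse filter with the dir wildcard.
d = dir(fullfile(folder, 'HI*TTTT*.xlsx'));
coarse = {d.name};

% Step 2: winnow the coarse list with an anchored regular expression.
keep = ~cellfun(@isempty, regexp(coarse, '^HI_\dC_TTTT\d_\d{3}\.xlsx$', 'once'));
wanted = sort(coarse(keep));

rmdir(folder, 's');  % clean up the scratch folder
```

The wildcard step leaves three names, and the regexp step drops the one ending in _Plot.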
