Sort cell strings according to specific subsets of those cell strings

Question

Dr. Seis on 22 Jan 2012

https://www.mathworks.com/matlabcentral/answers/26725-sort-cell-strings-according-to-specific-subsets-of-those-cell-strings

Let's say I have a cell string with values:

filename = {'2009.272.17.57.23.8445.AZ.SMER..BHE.R.SAC';...  
              '2009.272.17.57.24.5500.AZ.FRD..BHN.R.SAC';...   
              '2009.272.17.57.27.5445.AZ.SMER..BHN.R.SAC';...
              '2009.272.17.57.27.8000.AZ.SND..BHZ.R.SAC';... 
              '2009.272.17.57.27.9445.AZ.BZN..BHE.R.SAC';...
              '2009.272.17.57.28.7000.AZ.SND..BHN.R.SAC';...
              '2009.272.17.57.29.1250.AZ.FRD..BHZ.R.SAC';...
              '2009.272.17.57.29.2250.AZ.PFO..BHE.R.SAC';... 
              '2009.272.17.57.29.3695.AZ.SMER..BHZ.R.SAC';...
              '2009.272.17.57.29.9445.AZ.BZN..BHN.R.SAC';...
              '2009.272.17.57.30.0000.AZ.RDM..BHN.R.SAC';...
              '2009.272.17.57.30.8000.AZ.RDM..BHZ.R.SAC';...
              '2009.272.17.57.31.8250.AZ.LVA2..BHZ.R.SAC';...
              '2009.272.17.57.31.8500.AZ.LVA2..BHE.R.SAC';...
              '2009.272.17.57.31.9195.AZ.BZN..BHZ.R.SAC';... 
              '2009.272.17.57.32.0000.AZ.WMC..BHZ.R.SAC';...   
              '2009.272.17.57.32.6750.AZ.WMC..BHN.R.SAC';...   
              '2009.272.17.57.33.3195.AZ.KNW..BHZ.R.SAC';...   
              '2009.272.17.57.33.4750.AZ.TRO..BHN.R.SAC';...   
              '2009.272.17.57.33.7750.AZ.PFO..BHN.R.SAC';...   
              '2009.272.17.57.33.9000.AZ.PFO..BHZ.R.SAC';...   
              '2009.272.17.57.34.1750.AZ.LVA2..BHN.R.SAC';...  
              '2009.272.17.57.34.8000.AZ.TRO..BHZ.R.SAC';...   
              '2009.272.17.57.35.0000.AZ.WMC..BHE.R.SAC';...   
              '2009.272.17.57.35.0750.AZ.RDM..BHE.R.SAC';...   
              '2009.272.17.57.35.8945.AZ.KNW..BHE.R.SAC';...   
              '2009.272.17.57.36.0250.AZ.FRD..BHE.R.SAC';...   
              '2009.272.17.57.36.2250.AZ.CRY..BHZ.R.SAC';...  
              '2009.272.17.57.36.3500.AZ.CRY..BHN.R.SAC';...   
              '2009.272.17.57.36.4500.AZ.SND..BHE.R.SAC';...   
              '2009.272.17.57.36.5000.AZ.TRO..BHE.R.SAC';...   
              '2009.272.17.57.36.5195.AZ.KNW..BHN.R.SAC';...   
              '2009.272.17.57.36.5750.AZ.CRY..BHE.R.SAC'};

I want to assume that I do not know the character positions at which the station name (e.g., CRY) or the component name (e.g., BHE) starts and ends. The number of periods (".") will be consistent, though.

I have something fairly clunky to do this, but I am wondering if anyone can suggest a quick one/two-liner that would assume a string format of the general form:

YYYY.DDD.HH.MM.SS.ssss.$1.$2..$3.R.SAC

where:

$1 = Array name
$2 = Station name
$3 = Component name

And then sort the list with the primary and secondary sort order according to $2 and $3, respectively, so that the first 6 rows in the cell string would be:

2009.272.17.57.27.9445.AZ.BZN..BHE.R.SAC
2009.272.17.57.29.9445.AZ.BZN..BHN.R.SAC
2009.272.17.57.31.9195.AZ.BZN..BHZ.R.SAC
2009.272.17.57.36.5750.AZ.CRY..BHE.R.SAC
2009.272.17.57.36.3500.AZ.CRY..BHN.R.SAC
2009.272.17.57.36.2250.AZ.CRY..BHZ.R.SAC
...

Jan on 22 Jan 2012

It looks like the parts do *not* have the same length:

'2009.272.17.57.33.9000.AZ.PFO..BHZ.R.SAC'

'2009.272.17.57.34.1750.AZ.LVA2..BHN.R.SAC'

Dr. Seis on 22 Jan 2012

Oh, his question was about the "component" names, which all have the same number of characters (i.e., 3). The "station" names do not: they range from 3 to 4 characters.

Answer 1

Oleg Komarov on 22 Jan 2012

https://www.mathworks.com/matlabcentral/answers/26725-sort-cell-strings-according-to-specific-subsets-of-those-cell-strings#answer_34850

% Split each name using '.' as the delimiter
splt = regexp(filename,'\.','split');
% Sort by the 8th (station) and 10th (component) columns;
% the 9th column is the empty field between the two adjacent periods
[sorted,idx] = sortrows(cat(1,splt{:}),[8,10]);

You can now use the sorted split array directly, or apply idx to filename to reorder the original names.
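
As a minimal sketch of the second option, the snippet below applies idx to a four-name subset of the list from the question. The variable names parts and sortedNames are illustrative and do not appear in the thread.

```matlab
% Four names copied from the list in the question (subset for brevity).
filename = {'2009.272.17.57.23.8445.AZ.SMER..BHE.R.SAC'; ...
            '2009.272.17.57.29.9445.AZ.BZN..BHN.R.SAC'; ...
            '2009.272.17.57.31.8250.AZ.LVA2..BHZ.R.SAC'; ...
            '2009.272.17.57.27.9445.AZ.BZN..BHE.R.SAC'};

% Split on '.' and stack the pieces into a 4-by-12 cell array.
parts = regexp(filename, '\.', 'split');
parts = cat(1, parts{:});

% Sort by station (column 8), then component (column 10).
[~, idx] = sortrows(parts, [8, 10]);

% Reorder the original, unsplit names with the same permutation.
sortedNames = filename(idx);
disp(sortedNames)
```

Because idx is a permutation of the row numbers, the same index can also reorder any other array that is stored in the same order as filename.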

Dr. Seis on 22 Jan 2012

Just what I was looking for. Thanks, Oleg!

Jan on 23 Jan 2012

+1 for the compact REGEXP call.

Answer 2

Jan on 22 Jan 2012

https://www.mathworks.com/matlabcentral/answers/26725-sort-cell-strings-according-to-specific-subsets-of-those-cell-strings#answer_34849

filename = {'2009.272.17.57.23.8445.AZ.SMER..BHE.R.SAC';...  
            '2009.272.17.57.24.5500.AZ.FRD..BHN.R.SAC';...   
            '2009.272.17.57.27.5445.AZ.SMER..BHN.R.SAC';...
            '2009.272.17.57.27.8000.AZ.SND..BHZ.R.SAC';... 
            '2009.272.17.57.27.9445.AZ.BZN..BHE.R.SAC';...
            '2009.272.17.57.28.7000.AZ.SND..BHN.R.SAC';...
            '2009.272.17.57.29.1250.AZ.FRD..BHZ.R.SAC';...
            '2009.272.17.57.29.2250.AZ.PFO..BHE.R.SAC'};
n = numel(filename);
C2 = cell(1, n);
C3 = cell(1, n);
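for iC = 1:n
  % Skip the 26-character date/time/array prefix, then split the rest on '.'
  % (this assumes a two-character array name such as AZ)
  D      = textscan(filename{iC}(27:end), '%s', 'Delimiter', '.');
  C2{iC} = D{1}{1};   % station name
  C3{iC} = D{1}{3};   % component name (D{1}{2} is the empty field from '..')
end
% A kind of SORTROWS: sort by the secondary key (component) first,
% then by the primary key (station); ties keep their previous order
[dummy, ind3] = sort(C3);
[dummy, ind2] = sort(C2(ind3));
index         = ind3(ind2);
filename      = filename(index);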

Dr. Seis on 22 Jan 2012

Thanks for the updated code... +1!

Jan on 23 Jan 2012

While Oleg's REGEXP is much nicer than calling TEXTSCAN in a loop, SORTROWS does exactly the same as my sorting method, but with a lot of overhead.
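
The equivalence described here can be checked on a small case. The sketch below uses station and component keys taken from four names in the question and compares two sort passes (secondary key first, then primary key) with a single sortrows call. It assumes that sort keeps tied entries in their original order, and the variable names are illustrative rather than taken from the thread.

```matlab
% Keys taken from four of the names in the question.
station   = {'SMER'; 'BZN'; 'LVA2'; 'BZN'};
component = {'BHE';  'BHN'; 'BHZ';  'BHE'};

% Two passes: sort by the secondary key, then by the primary key.
[~, i2] = sort(component);
[~, i1] = sort(station(i2));
idxTwoPass = i2(i1);

% One call on the two-column key array: column 1 first, then column 2.
[~, idxRows] = sortrows([station, component], [1, 2]);

% Both approaches should yield the same permutation.
sameOrder = isequal(idxTwoPass(:), idxRows(:));
disp(sameOrder)
```

The two-pass form avoids building the combined key array, which is the overhead the comment refers to; whether that matters in practice depends on the size of the list.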

Answer 3

Dr. Seis on 22 Jan 2012

https://www.mathworks.com/matlabcentral/answers/26725-sort-cell-strings-according-to-specific-subsets-of-those-cell-strings#answer_34846

Here is the clunky version I have been using:

     numFiles = numel(filename);
     sortcell = {''};
     sortind = zeros(numFiles,4);
     for i = 1 : numFiles
         sortind(i,2)=strfind(filename{i},'..')-1;
         for j = sortind(i,2):-1:1
             if isequal(filename{i}(j),'.')
                 break;
             end
             sortind(i,1)=j;
         end
         sortind(i,3)=sortind(i,2)+3;
         for j = sortind(i,3):length(filename{i})
             if isequal(filename{i}(j),'.')
                 break;
             end
             sortind(i,4)=j;
         end
         sortcell(i,1)=cellstr(filename{i}(sortind(i,1):sortind(i,2)));
         sortcell(i,2)=cellstr(filename{i}(sortind(i,3):sortind(i,4)));
     end
     [tempcell,tempind1]=sort(sortcell(:,2));
     [tempcell,tempind2]=sort(sortcell(tempind1,1));
     filename = filename(tempind1(tempind2));
