Code covered by the BSD License  

sort_nat: Natural Order Sort

Rating: 4.9697 out of 5.0 (35 ratings) | 149 Downloads (last 30 days) | File Size: 2.26 KB | File ID: #10959 | Version: 1.3

03 May 2006 (Updated )

Sort strings in natural order.

Editor's Notes:

This file was selected as MATLAB Central Pick of the Week


File Information
Description

Natural order sorting sorts strings containing digits so that the numerical value of the digits is taken into account. It is especially useful for sorting file names containing index numbers with different numbers of digits. Often, people use leading zeros to get the right sort order, but with this function you don't have to. For example, with an input of

{'file1.txt','file2.txt','file10.txt'}

a normal sort will give you

{'file1.txt','file10.txt','file2.txt'}

whereas sort_nat will give you

{'file1.txt','file2.txt','file10.txt'}
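
As a rough illustration of the idea (not the sort_nat implementation itself), the sketch below compares the built-in sort with a sort keyed on the first run of digits in each name. It assumes every name shares the same non-digit text and contains one digit run; sort_nat handles the general case.

```matlab
% Sketch only: key each name on its first run of digits.
names = {'file1.txt', 'file2.txt', 'file10.txt'};

% The built-in sort compares character codes, so file10 lands before file2.
plain = sort(names);

% Pull the first digit run out of each name and convert it to a number.
tokens = regexp(names, '\d+', 'match', 'once');
values = str2double(tokens);

% Sort on the numeric values and reorder the names with the permutation.
[~, idx] = sort(values);
natural = names(idx);

disp(plain);
disp(natural);
```

Here plain reproduces the "normal sort" order shown above, and natural reproduces the order that sort_nat is described as returning.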

Acknowledgements

This file inspired Lilo: Fast And Pixel Accurate Stitching For 2D Tiles, Customizable Natural Order Sort, Natural Order Filename Sort, Natural Order Row Sort, and Philipptempel/Matlab Projectmanagement.

MATLAB release MATLAB 7.0.4 (R14SP2)
Comments and Ratings (48)
03 Aug 2014 Stephen Cobeldick

Sorting filenames with a function such as "sort_nat" can return an ordering that users find unintuitive, because some longer filenames sort before shorter ones. This happens because char(0:45), which includes [ !"#$%&'()*+,-], sorts before the period '.' (char(46)), which is used as the extension separator. For example:

fnm = {...
'test2.m';
'test.m';
'test10.m';
'test1.m';
'test-A.m';
'test_A.m';
'test10-old.m'};

% "sort" gives the wrong order for numeric substrings:
sort(fnm)
ans = {...
'test-A.m';
'test.m';
'test1.m';
'test10-old.m';
'test10.m';
'test2.m';
'test_A.m'}

% "sort_nat" gives numeric substrings in the correct order, but with '-' before '.':
sort_nat(fnm)
ans = {...
'test-A.m';
'test.m';
'test1.m';
'test2.m';
'test10-old.m';
'test10.m';
'test_A.m'}
% Note how 'test-A.m' occurs before the shorter named 'test.m', and likewise 'test10-old.m' before 'test10.m'. Users expecting shorter filenames to sort before longer ones need to sort using a different algorithm...

One solution is to sort the filename and file extensions separately, for which I wrote a function "natsortfiles":
natsortfiles(fnm, '\d+', 'beforechar')
ans = {...
'test.m';
'test1.m';
'test2.m';
'test10.m';
'test10-old.m';
'test-A.m';
'test_A.m'}

It allows control over case sensitivity, sort direction, and numeric substring matching. You can find this function on FEX here:
http://www.mathworks.com/matlabcentral/fileexchange/47434
It also accepts fullpaths, for which it sorts each level of the directory hierarchy separately too.
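
The character-code effect described above can be checked directly. The sketch below is only an illustration using built-in functions. It is not how natsortfiles works internally, and it does no natural-order handling of the digits; it just shows that sorting on the name without its extension puts the shorter names first. The regular expression used to strip the extension is an assumption that fits simple names like these.

```matlab
% Character codes that decide the order: '-' (45) < '.' (46) < '1' (49) < '_' (95).
codes = double('-.1_');

fnm = {'test10-old.m', 'test.m', 'test10.m', 'test-A.m'};

% Sorting the full names: '-' sorts before '.', so the longer names come first.
full_sorted = sort(fnm);

% Sketch: strip the extension, sort on the stem, then reorder the full names.
stems = regexprep(fnm, '\.[^.]*$', '');
[~, idx] = sort(stems);
stem_sorted = fnm(idx);

disp(codes);
disp(full_sorted);
disp(stem_sorted);
```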

Comment only
26 Jun 2014 Maryam

Thank you very much. It is in the same folder. This is why I am very surprised.

Comment only
26 Jun 2014 Douglas Schwarz

Maryam, make sure that sort_nat.m is in a folder that is on your MATLAB path. You can read more about the path with "doc path".
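
One way to check this from the command line is sketched below. The folder in the commented-out addpath call is hypothetical; substitute the folder that actually holds sort_nat.m.

```matlab
% Ask MATLAB where (or whether) it can see sort_nat.
location = which('sort_nat');

if isempty(location)
    % Not found: add the folder that holds sort_nat.m to the path.
    % The folder name below is hypothetical; substitute your own.
    % addpath('C:\work\sort_nat');
    disp('sort_nat is not visible to MATLAB');
else
    fprintf('sort_nat found at %s\n', location);
end
```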

Comment only
26 Jun 2014 Maryam

Hello. I want to use this program to sort a vector of strings that have numerical values in their names. I am getting this error: Undefined function 'sort_nat' for input arguments of type 'cell'

What should I do?

22 May 2014 Morad Kassem  
07 Apr 2014 Douglas Schwarz

Sergey,
Well, it's not a bug, but you might disagree with my logic. Remember, we are sorting strings, not numbers. '01' comes after '1' because they have the same numerical value, but '01' is a longer string. Likewise, '000' comes after '00', etc.
It may be possible to include '+-.,' by changing the regular expressions (currently just looking for runs of digits with '\d+').
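
The tie described here can be seen by extracting the digit runs with the same '\d+' pattern. The wider pattern at the end is a hypothetical example of a regular expression that also accepts a sign and a decimal part. It is not taken from sort_nat, and adopting it would need matching changes elsewhere in the function.

```matlab
% Digit runs found by the '\d+' pattern mentioned above.
a = regexp('01E2', '\d+', 'match');   % {'01', '2'}
b = regexp('1E3',  '\d+', 'match');   % {'1', '3'}

% The first runs have the same numeric value, so they tie as numbers
% and only the string length distinguishes '01' from '1'.
tie = str2double(a{1}) == str2double(b{1});

% Hypothetical wider pattern: optional sign, digits, optional decimal part.
pattern = '[+-]?\d+(\.\d+)?';
c = regexp('x-1.5y+20', pattern, 'match');   % {'-1.5', '+20'}

disp(tie);
disp(c);
```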

Comment only
06 Apr 2014 Sergey

Very fast! I can't beat it!

This might be a bug though, not sure?
sort_nat({'01E2' '1E3'})
ans = '1E3' '01E2'

I came up with a version that can treat '+-.,' as part of the number, but it's 10x slower. Not sure if sort_nat can be extended to do the same.

19 Feb 2014 alex

Thank you very much!
I was typing C=[list.name] and not
C={list.name}, which is the right one.
Thank you very much again!
Very useful function!

Comment only
19 Feb 2014 Douglas Schwarz

Alex, your input has to be a cell array:

>> C = {'1.bmp','10.bmp','11.bmp','2.bmp'};
>> sort_nat(C)
ans =
'1.bmp' '2.bmp' '10.bmp' '11.bmp'
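
The difference between the two bracket styles is easy to miss when the names come from a struct array such as the one returned by dir. The sketch below uses a hand-built struct array as a stand-in for that output.

```matlab
% Stand-in for the struct array returned by dir('*.bmp').
list = struct('name', {'1.bmp', '10.bmp', '11.bmp', '2.bmp'});

% Square brackets concatenate the names into a single char row.
joined = [list.name];

% Curly braces collect the names into a 1x4 cell array, one name per cell,
% which is the form of input that sort_nat expects.
names = {list.name};

disp(joined);
disp(iscellstr(names));
```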

Comment only
19 Feb 2014 alex

 
19 Feb 2014 alex

I find this very useful, but I can't use it.

C =

1.bmp10.bmp11.bmp2.bmp

>> [S,INDEX] = sort_nat(C)

Cell contents reference from a non-cell array object.

Error in sort_nat (line 62)

num_val(i,z(i,:)) = sscanf(sprintf('%s ',digruns{i}{:}),'%f');

Any help?

Comment only
11 Nov 2013 Calum

Very useful - worked as described! Thanks.

29 Oct 2013 JiaDa

It works well and helps a lot. So awesome!!!
Thank you

25 Sep 2013 Dirk

Thank you so much, saved me a lot of time! Dirk

06 Sep 2013 Paul

 
01 Aug 2013 Erwin

 
18 Jul 2013 Yi Sui

 
09 Jul 2013 Gregory  
28 Apr 2013 Peter

 
26 Mar 2013 Ekaterina

Sorry, I am a beginner in MATLAB and I am not sure what the input arguments of this function mean. There are many input arguments... I have a struct array A <3000*1> (it consists of files with names such as 123_1, 123_2, etc.) and I want to sort it according to the names of the files in it. How should I use this function? Please help me. Best wishes, Katya
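
One possible approach, sketched under assumptions: collect the names into a cell array, take the second output of sort_nat (the index), and use it to reorder the struct array. The struct array below is a small stand-in, and the fallback branch is only an illustration for names of the form digits_digits so that the sketch runs even when sort_nat.m is not on the path.

```matlab
% Stand-in for a struct array with a 'name' field.
A = struct('name', {'123_10', '123_2', '123_1'});
names = {A.name};

if exist('sort_nat', 'file') == 2
    % The second output is the permutation that sorts the names.
    [~, idx] = sort_nat(names);
else
    % Fallback sketch for names of the form <digits>_<digits> only:
    % turn each name into a row of numbers and sort the rows.
    parts = regexp(names, '\d+', 'match');
    keys = cellfun(@str2double, parts, 'UniformOutput', false);
    keys = vertcat(keys{:});
    [~, idx] = sortrows(keys);
end

% Reorder the whole struct array, not just the names.
A_sorted = A(idx);
sorted_names = {A_sorted.name};
disp(sorted_names);
```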

07 Mar 2013 salvatore savo

thanks a lot!!! great

30 Jan 2013 Harish S

Excellent

24 Jan 2013 Nopparat  
04 Dec 2012 Muhammet  
18 Jul 2012 Caleb

Just what I needed, thanks!

01 May 2012 Shamir Alavi

I wasted a lot of time trying to properly sort my image files for my image processing project before I came across this code. And the MathWorks web tutorial on this is totally flawed. Great work! Certainly deserves 5 stars.

03 Jan 2012 Damon Bradley

Excellent work. Saved me much time and headache with some data analysis over here at NASA. Thank you!

03 Jan 2012 Damon Bradley

 
28 Jul 2011 Ben

This is great! It would be better to include it in MATLAB!

05 Jul 2011 Johanna  
26 May 2011 Jun

Many thanks, this saved me the time and effort of writing my own code to crack this problem.

Comment only
27 Apr 2011 Carsten  
10 Feb 2011 Edwin Carter

Great, works straight off in Octave too.

22 Jan 2011 Evgeny Pr

Douglas Schwarz,
You offered the perfect solution for case-insensitive sorting.
Thank you!

Comment only
22 Jan 2011 Douglas Schwarz

Evgeny, you must be the first person to use descending order and the index. You are quite right, it's a bug and I thank you for identifying it. I fixed it in a slightly different way and just uploaded the new version.

To do case-insensitive sorting you can just do this:
c = {'a1', 'a2', 'a10', 'b1', 'X1', 'A10'};
[unused,index] = sort_nat(lower(c));
cs = c(index);

I want sort_nat to work the same way as sort and since sort doesn't do case-insensitive sorting I left that out of sort_nat as well.

Regards,
Doug

Comment only
22 Jan 2011 Evgeny Pr

This code is not entirely correct:
if strcmpi(mode,'ascend')
    cs = c(index);
else
    cs = c(index(end:-1:1));
end

The output argument INDEX does not change depending on the sort order.

It would be better to do this:
if strcmpi(mode, 'descend')
    index = flipud(index);
end
cs = c(index);
index = reshape(index, size(cs)); % same dimensions as the C array

Comment only
22 Jan 2011 Evgeny Pr

Hi! Very good function and coding!
I could not write such an elegant and fast function. :)

One question:
Would you consider adding a case-insensitive sort mode?

For Example:
c = {'a1', 'a2', 'a10', 'b1', 'X1', 'A10'}

Case-sensitive sort:
cs = {'A10', 'X1', 'a1', 'a2', 'a10', 'b1'}

Case-insensitive sort:
cs = {'a1', 'a2', 'a10', 'A10', 'b1', 'X1'}

12 Sep 2010 Oscar

 
22 Aug 2010 Pete

Excellent. 5 stars.

I think I ran into the need for a descending sort because I was looking to search through various files with timestamps in their name for the most recent entry (of something-or-other).

Anyway, good work.

06 Apr 2010 Douglas Schwarz

Pete, I'm not sure sorting in descending order will be used very often, but, as you say, it's easy to add so why not? I chose to do it in a slightly different way, but it works the way you want. Thanks for the suggestion. Update should appear soon.

Comment only
01 Apr 2010 Pete

Good work.

Is it just me though or would this not benefit from being able to specify the direction? This would be as simple as:

function [cs,index] = sort_nat(c,varargin)

... <entire code>

if nargin>1
    if strcmpi(varargin{1},'descend')
        index = flipud(index);
        cs = c(index);
    end
end

09 Dec 2008 Adam Baker

I wish I'd found this sooner!

29 Aug 2008 Greg Fichter

Nifty, thanks.

25 May 2008 Nikola Toljic

Thanks

02 Aug 2007 Gang Xu

Nice! Thanks

14 Dec 2006 Hans van Dijk

Great, just what I needed.

13 Sep 2006 F Moisy

Perfect. Should be included in R2007a!

Comment only
26 Jun 2006 Mike Palumbo

Great, much better than another file on the exchange, "sortn". sortn froze when trying to sort over 6000 strings; this sort did it just fine in about a second.

Updates
05 Nov 2008 1.1

Steve Herman identified an obscure bug (sorting a cell array of one string which has no numeric characters) which has now been fixed. Thank you Steve!

06 Apr 2010 1.2

Added ability to sort in descending order.

22 Jan 2011 1.3

Fixed bug identified by Evgeny Pr. (Thanks!)
