Code covered by the BSD License  

sort_nat: Natural Order Sort

Rating: 5.0 (35 ratings) | 106 downloads (last 30 days) | File size: 2.26 KB | File ID: #10959

03 May 2006

Sort strings in natural order.

Editor's Notes:

This file was selected as MATLAB Central Pick of the Week


File Information
Description

Natural order sorting sorts strings containing digits in a way such that the numerical value of the digits is taken into account. It is especially useful for sorting file names containing index numbers with different numbers of digits. Often, people will use leading zeros to get the right sort order, but with this function you don't have to do that. For example, with input of

{'file1.txt','file2.txt','file10.txt'}

a normal sort will give you

{'file1.txt','file10.txt','file2.txt'}

whereas sort_nat will give you

{'file1.txt','file2.txt','file10.txt'}
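
The difference between the two orderings can be reproduced with built-in functions only. The following is a minimal sketch of the general idea, not the sort_nat implementation: it zero-pads every run of digits to a fixed width (10 digits, an arbitrary assumption) and then applies a plain sort to the padded keys. The variable names are illustrative.

```matlab
names = {'file1.txt', 'file10.txt', 'file2.txt'};

% Build one sort key per name by zero-padding each run of digits.
keys = cell(size(names));
for k = 1:numel(names)
    % runs: the digit runs; rest: the text around them (one more piece than runs)
    [runs, rest] = regexp(names{k}, '\d+', 'match', 'split');
    key = rest{1};
    for j = 1:numel(runs)
        key = [key, sprintf('%010d', str2double(runs{j})), rest{j+1}]; %#ok<AGROW>
    end
    keys{k} = key;
end

% Sort the padded keys and reorder the original names with the index.
[~, idx] = sort(keys);
natural = names(idx);
disp(natural)   % file1.txt, file2.txt, file10.txt
```

Unlike sort_nat, this sketch treats '01' and '1' as identical keys and assumes no digit run is longer than 10 digits.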

Acknowledgements

This file inspired Lilo : Fast And Pixel Accurate Stitching For 2 D Tiles, Project Management, Customizable Natural Order Sort, Natural Order Filename Sort, and Natural Order Row Sort.

MATLAB release: MATLAB 7.0.4 (R14SP2)
Comments and Ratings (48)
03 Aug 2014 Stephen Cobeldick

Sorting filenames using a sort function such as "sort_nat" can return an ordering that users find unintuitive, as some longer filenames will sort before shorter filenames. This is because char(0:45), including [ !"#$%&'()*+,-], sorts before the period '.' (char(46)), which is used as the extension separator. For example:

fnm = {...
'test2.m';
'test.m';
'test10.m';
'test1.m';
'test-A.m';
'test_A.m';
'test10-old.m'};

% "sort" gives the wrong order for numeric substrings:
sort(fnm)
ans = {...
'test-A.m';
'test.m';
'test1.m';
'test10-old.m';
'test10.m';
'test2.m';
'test_A.m'}

% "sort_nat" gives numeric substrings in the correct order, but with '-' before '.':
sort_nat(fnm)
ans = {...
'test-A.m';
'test.m';
'test1.m';
'test2.m';
'test10-old.m';
'test10.m';
'test_A.m'}
% Note how 'test-A.m' occurs before the shorter named 'test.m', and likewise 'test10-old.m' before 'test10.m'. Users expecting shorter filenames to sort before longer ones need to sort using a different algorithm...

One solution is to sort the filename and file extensions separately, for which I wrote a function "natsortfiles":
natsortfiles(fnm, '\d+', 'beforechar')
ans = {...
'test.m';
'test1.m';
'test2.m';
'test10.m';
'test10-old.m';
'test-A.m';
'test_A.m'}

It allows control over case sensitivity, sort direction, and numeric substring matching. You can find this function on FEX here:
http://www.mathworks.com/matlabcentral/fileexchange/47434
It also accepts fullpaths, for which it sorts each level of the directory hierarchy separately too.
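
The ordering issue described in this comment comes down to character codes: '-' is char(45) and '.' is char(46). The following is a minimal sketch using only the built-in sort and fileparts, not natsortfiles itself. It compares sorting whole filenames with sorting on the name part alone; it performs no natural ordering of numbers, and the variable names are illustrative.

```matlab
fnm = {'test10.m', 'test.m', 'test-A.m', 'test10-old.m'};

% Sorting whole filenames: '-' (45) sorts before '.' (46),
% so 'test-A.m' lands before 'test.m'.
whole = sort(fnm);

% Strip the extension first, then sort on the bare names.
names = cell(size(fnm));
for k = 1:numel(fnm)
    [~, names{k}] = fileparts(fnm{k});
end
[~, idx] = sort(names);
byName = fnm(idx);

disp(whole)    % test-A.m, test.m, test10-old.m, test10.m
disp(byName)   % test.m, test-A.m, test10.m, test10-old.m
```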

26 Jun 2014 Maryam

Thank you very much. It is in the same folder. This is why I am very surprised.

26 Jun 2014 Douglas Schwarz

Maryam, make sure that sort_nat.m is in a folder that is on your MATLAB path. You can read more about the path with "doc path".
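
One way to check whether MATLAB can see the file is the built-in which function. This is a small sketch; the messages are illustrative.

```matlab
% which returns the full path of the function file, or an empty result
% if nothing called sort_nat is visible on the path.
location = which('sort_nat');
if isempty(location)
    disp('sort_nat.m is not on the path; add its folder with addpath.');
else
    disp(['Using ', location]);
end
```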

26 Jun 2014 Maryam

Hello. I want to use this program to sort a vector of strings which have numerical values in their names. I am getting this error: Undefined function 'sort_nat' for input arguments of type 'cell'

What should I do?

22 May 2014 Morad Kassem  
07 Apr 2014 Douglas Schwarz

Sergey,
Well, it's not a bug, but you might disagree with my logic. Remember, we are sorting strings, not numbers. '01' comes after '1' because they have the same numerical value, but '01' is a longer string. Likewise, '000' comes after '00', etc.
It may be possible to include '+-.,' by changing the regular expressions (currently just looking for runs of digits with '\d+').
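
As an illustration of what a wider pattern could match, the sketch below uses a hypothetical regular expression for optionally signed decimal numbers. It is not taken from sort_nat, it does not handle ',' or exponents such as 'E2', and the input string is made up.

```matlab
% Hypothetical pattern: optional sign, digits, optional fractional part.
pattern = '[-+]?\d+(\.\d+)?';

str = 'x-1.5y+20z3';
tokens = regexp(str, pattern, 'match');   % {'-1.5', '+20', '3'}
values = str2double(tokens);              % [-1.5 20 3]
disp(values)
```

A pattern like this would also read the hyphen in a name such as 'file-2' as a minus sign, which may not be what is wanted for filenames.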

06 Apr 2014 Sergey

Very fast! I can't beat it!

This might be a bug though, not sure?
sort_nat({'01E2' '1E3'})
ans = '1E3' '01E2'

I came up with a version that can treat '+-.,' as part of the number, but it's 10x slower. Not sure if sort_nat can be extended to do the same.

19 Feb 2014 alex

Thank you very much!
I was typing C=[list.name] and not
C={list.name}, which is the right one.
Thank you very much again!
Very useful function!

19 Feb 2014 Douglas Schwarz

Alex, your input has to be a cell array:

>> C = {'1.bmp','10.bmp','11.bmp','2.bmp'};
>> sort_nat(C)
ans =
'1.bmp' '2.bmp' '10.bmp' '11.bmp'
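
The difference between the two ways of collecting names from a struct array can be seen in a small sketch. Here list is a hand-built stand-in for the kind of struct array that dir returns; its contents are made up.

```matlab
% Stand-in for the output of dir: a struct array with a name field.
list = struct('name', {'1.bmp', '10.bmp', '11.bmp', '2.bmp'});

asChar = [list.name];   % square brackets concatenate into one char row
asCell = {list.name};   % curly braces keep one cell per name

disp(asChar)            % 1.bmp10.bmp11.bmp2.bmp
disp(class(asCell))     % cell
disp(numel(asCell))     % 4
```

Only the cell array form keeps the names separate, which is what a function expecting a cell array of strings needs.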

19 Feb 2014 alex  
19 Feb 2014 alex

I find this very useful, but I can't use it.

C =

1.bmp10.bmp11.bmp2.bmp

>> [S,INDEX] = sort_nat(C)

Cell contents reference from a non-cell array object.

Error in sort_nat (line 62)

num_val(i,z(i,:)) = sscanf(sprintf('%s ',digruns{i}{:}),'%f');

Any help?

11 Nov 2013 Calum

Very useful - worked as described! Thanks.

29 Oct 2013 JiaDa

It works well and helps a lot. So awesome!!!
Thank you

25 Sep 2013 Dirk

Thank you so much, saved me a lot of time! Dirk

06 Sep 2013 Paul  
01 Aug 2013 Erwin  
18 Jul 2013 Yi Sui  
09 Jul 2013 Gregory  
28 Apr 2013 Pete  
26 Mar 2013 Ekaterina

Sorry, I am a beginner in MATLAB and I am not sure what the input arguments of this function mean. There are many input arguments... I have a struct array A <3000*1> (it consists of files with names such as 123_1, 123_2, etc.) and I want to sort it according to the file names it contains. How should I use this function? Please help me. Best wishes, Katya
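
A struct array can be reordered by sorting its names and applying the resulting index. The sketch below uses the built-in sort so that it runs on its own; the struct contents are made up. Since sort_nat also returns an index as its second output ([cs,index] = sort_nat(c)), it could presumably be substituted for sort here to get natural order.

```matlab
% Made-up struct array with a name field, standing in for A.
A = struct('name', {'123_2', '123_10', '123_1'});

% Collect the names in a cell array, sort them, and keep the index.
[~, idx] = sort({A.name});

% Apply the same index to the struct array itself.
A_sorted = A(idx);
disp({A_sorted.name})   % 123_1, 123_10, 123_2 (plain, not natural, order)
```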

07 Mar 2013 salvatore savo

thanks a lot!!! great

30 Jan 2013 Harish S

Excellent

24 Jan 2013 Nopparat  
04 Dec 2012 Muhammet  
18 Jul 2012 Caleb

Just what I needed, thanks!

01 May 2012 Shamir Alavi

I have wasted a lot of time trying to properly sort my image files for my image processing project before I surfed on to this code. And mathworks web tutorial on this is totally flawed. Great work! Certainly deserves a 5 star.

03 Jan 2012 Damon Bradley

Excellent work. Saved me much time and headache with some data analysis over here at NASA. Thank you!

03 Jan 2012 Damon Bradley  
28 Jul 2011 Ben

This is great! It would be better to include it in MATLAB!

05 Jul 2011 Johanna  
26 May 2011 Jun

Many thanks, this saved me the time and effort of writing my own code to crack this problem.

27 Apr 2011 Carsten  
10 Feb 2011 Edwin Carter

Great, works straight off in Octave too.

22 Jan 2011 Evgeny Pr

Douglas Schwarz,
You offered the perfect solution for case-insensitive sorting.
Thank you!

22 Jan 2011 Douglas Schwarz

Evgeny, you must be the first person to use descending order and the index. You are quite right, it's a bug and I thank you for identifying it. I fixed it in a slightly different way and just uploaded the new version.

To do case-insensitive sorting you can just do this:
c = {'a1', 'a2', 'a10', 'b1', 'X1', 'A10'};
[unused,index] = sort_nat(lower(c));
cs = c(index);

I want sort_nat to work the same way as sort and since sort doesn't do case-insensitive sorting I left that out of sort_nat as well.

Regards,
Doug

22 Jan 2011 Evgeny Pr

This code is not entirely correct:
if strcmpi(mode,'ascend')
    cs = c(index);
else
    cs = c(index(end:-1:1));
end

Output argument INDEX does not change depending on the sort order.

It would be better to do this:
if strcmpi(mode, 'descend')
index = flipud(index);
end
cs = c(index);
index = reshape(index, size(cs)); % same as C array dimension
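
The point of this suggestion is that the returned index should reproduce the sorted output when applied to the input. A minimal sketch with the built-in sort, using made-up data, shows the invariant. Note that it reverses the index with index(end:-1:1) because the index here is a row vector, which flipud would leave unchanged.

```matlab
c = {'b', 'a', 'c'};

% Ascending sort and its index.
[~, index] = sort(c);          % index is [2 1 3]

% For descending order, reverse the index first, then build the output
% from it, so that cs and index always agree.
index = index(end:-1:1);       % [3 1 2]
cs = c(index);

disp(cs)                       % c, b, a
disp(isequal(cs, c(index)))    % 1
```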

22 Jan 2011 Evgeny Pr

Hi! Very good function and coding!
I could not write such an elegant and fast function. :)

One question:
Would you consider adding a case-insensitive sort mode?

For Example:
c = {'a1', 'a2', 'a10', 'b1', 'X1', 'A10'}

Case-sensitive sort:
cs = {'A10', 'X1', 'a1', 'a2', 'a10', 'b1'}

Case-insensitive sort:
cs = {'a1', 'a2', 'a10', 'A10', 'b1', 'X1'}

12 Sep 2010 Oscar  
22 Aug 2010 Pete

Excellent. 5 stars.

I think I ran into the need for a descending sort because I was looking to search through various files with timestamps in their name for the most recent entry (of something-or-other).

Anyway, good work.

06 Apr 2010 Douglas Schwarz

Pete, I'm not sure sorting in descending order will be used very often, but, as you say, it's easy to add so why not? I chose to do it in a slightly different way, but it works the way you want. Thanks for the suggestion. Update should appear soon.

01 Apr 2010 Pete

Good work.

Is it just me though or would this not benefit from being able to specify the direction? This would be as simple as:

function [cs,index] = sort_nat(c,varargin)

... <entire code>

if nargin>1
    if strcmpi(varargin{1},'descend')
        index = flipud(index);
        cs = c(index);
    end
end

09 Dec 2008 Adam Baker

I wish I'd found this sooner!

29 Aug 2008 Greg Fichter

Nifty, thanks.

25 May 2008 Nikola Toljic

Thanks

02 Aug 2007 Gang Xu

Nice! Thanks

14 Dec 2006 Hans van Dijk

Great, just what I needed.

13 Sep 2006 F Moisy

Perfect. Should be included in R2007a!

26 Jun 2006 Mike Palumbo

Great, much better than another file on the exchange, "sortn". sortn froze when trying to sort over 6000 strings; this sort did it just fine in about a second.

Updates
05 Nov 2008

Steve Herman identified an obscure bug (sorting a cell array of one string which has no numeric characters) which has now been fixed. Thank you Steve!

06 Apr 2010

Added ability to sort in descending order.

22 Jan 2011

Fixed bug identified by Evgeny Pr. (Thanks!)
