File Exchange

sort_nat: Natural Order Sort

version 1.3.0.0 (2.26 KB) by Douglas Schwarz
Sort strings in natural order.

63 Downloads

Updated 22 Jan 2011

Editor's Note: This file was selected as MATLAB Central Pick of the Week

Natural order sorting sorts strings containing digits in a way such that the numerical value of the digits is taken into account. It is especially useful for sorting file names containing index numbers with different numbers of digits. Often, people will use leading zeros to get the right sort order, but with this function you don't have to do that. For example, with input of

{'file1.txt','file2.txt','file10.txt'}

a normal sort will give you

{'file1.txt','file10.txt','file2.txt'}

whereas sort_nat will give you

{'file1.txt','file2.txt','file10.txt'}
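The difference between the two orderings can be reproduced with built-in functions only. The following is a minimal sketch, not the sort_nat implementation: it assumes every name contains exactly one run of digits and uses that run as a hypothetical numeric sort key.

```matlab
% Illustrative only: this is NOT how sort_nat is implemented.
names = {'file1.txt','file2.txt','file10.txt'};

% Built-in sort compares character codes, so 'file10.txt' lands before 'file2.txt'.
plain = sort(names);

% Hypothetical numeric key: the first run of digits in each name.
tokens = regexp(names, '\d+', 'match', 'once');
keys = str2double(tokens);

% Sort the keys numerically and apply that order to the original names.
[~, order] = sort(keys);
natural = names(order);

disp(plain)
disp(natural)
```

Real file names often contain several digit runs mixed with text, which is the case sort_nat is meant to handle.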

Comments and Ratings (90)

sun wei

Thanks very much!!!!

feng wang

Thank you very much. It looks like a simple operation, but it solves a problem that was a real headache!!

Great submission, and a must in many of my scripts. If you need to use this for strings with decimals in them, you can try replacing the regex '\d+' with '\d+(?:\.\d+)?'. For example, the current version does this:

A = { '1.test', '10.test', '2.test', '1.3.test' }

sort_nat(A)

ans =

1×4 cell array

'1.3.test' '1.test' '2.test' '10.test'

But with the change you get '1.test' '1.3.test' '2.test' '10.test'.
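The effect of the suggested pattern can be checked on its own with regexp, without touching sort_nat. This sketch only shows which substrings each pattern picks out; the variable names are illustrative.

```matlab
s = '1.3.test';

% Original pattern: each run of digits is a separate match.
intRuns = regexp(s, '\d+', 'match');

% Suggested pattern: a run of digits optionally followed by '.' and more digits.
decRuns = regexp(s, '\d+(?:\.\d+)?', 'match');

% A trailing '.' that is not followed by a digit is left out of the match.
plainRun = regexp('10.test', '\d+(?:\.\d+)?', 'match');

disp(intRuns)
disp(decRuns)
disp(plainRun)
```

One consequence of matching decimals is that the fractional part is compared as a decimal value, so '1.10' would be treated as smaller than '1.9'. That may or may not be what you want for version-style names.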

Kathy Song

^^Thank you soooooo much! It should be a required function for Matlab.

Justin Lee

Very useful, thanks!!

This is a fantastic utility. Thanks!

Oren Shriki

Super handy

Mauro

Thanks Douglas

Antonio, this page is for my submission, sort_nat. natsort and natsortfiles are submissions from another author. Search for them.

I am working with 2016a Matlab version and I don't find this function: natsort, natsortfiles

Help please!!!!!!!!!

LIU Jordan

Luca Daccà

Very useful! Thank you!

kes22

Will Turner

Thanks!

Exactly what I needed, great tool! Thanks a million! :)

exactly what I needed!

Patricia

Thank you sooo much :-)

Nice and Tidy!

Eli F

wei fang

a very nice tool, thanks!

eFKa

perfect, thanks. Have a nice day :)

eFKa, sort_nat requires a cell array of strings. You can do what you want like this:

[~,order] = sort_nat({files.name});
files_sorted = files(order);

We are using the second output to get the order and then applying that order to your original structure.

Doug

eFKa

I have a 50x1 struct called files with a field called name, like so:

'10up.txt'
'11up.txt'
.
.
.
'19up.txt'
'1up.txt'

How do I use sort_nat? Calling sort_nat(files.name) gives me an error saying "too many input arguments".

Demo

Sangjae

Thanks!!

JWall

jenny

Excellent! Works flawlessly, saved some time of an electronic engineer. Thank you!

JavaDuncan

Ulysses C

nathan q

Sorting filenames using a sort function such as "sort_nat" can return an ordering that users find unintuitive, as some longer filenames will sort before shorter filenames. This is because the characters in char(0:45), including [ !"#$%&'()*+,-], sort before the period '.' (char(46)), which is used as the extension separator. For example:

fnm = {...
'test2.m';
'test.m';
'test10.m';
'test1.m';
'test-A.m';
'test_A.m';
'test10-old.m'};

% "sort" gives the wrong order for numeric substrings:
sort(fnm)
ans = {...
'test-A.m';
'test.m';
'test1.m';
'test10-old.m';
'test10.m';
'test2.m';
'test_A.m'}

% "sort_nat" gives numeric substrings in the correct order, but with '-' before '.':
sort_nat(fnm)
ans = {...
'test-A.m';
'test.m';
'test1.m';
'test2.m';
'test10-old.m';
'test10.m';
'test_A.m'}
% Note how 'test-A.m' occurs before the shorter named 'test.m', and likewise 'test10-old.m' before 'test10.m'. Users expecting shorter filenames to sort before longer ones need to sort using a different algorithm...

One solution is to sort the filename and file extensions separately, for which I wrote a function "natsortfiles":
natsortfiles(fnm, '\d+', 'beforechar')
ans = {...
'test.m';
'test1.m';
'test2.m';
'test10.m';
'test10-old.m';
'test-A.m';
'test_A.m'}

It allows control over case sensitivity, sort direction, and numeric substring matching. You can find this function on FEX here:
http://www.mathworks.com/matlabcentral/fileexchange/47434
It also accepts fullpaths, for which it sorts each level of the directory hierarchy separately too.
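The character-code ordering described in this comment can be verified with built-in functions. The sketch below also shows the general idea of sorting on the name without its extension; it is an illustration using fileparts and the built-in sort, not the natsortfiles implementation.

```matlab
% Character codes of '-', '.' and '_': hyphen sorts before period, underscore after.
codes = double('-._');

% With the extension attached, '-' (45) is compared against '.' (46).
files = {'test.m','test-A.m'};
withExt = sort(files);

% Strip the extension first, sort the stems, then reorder the full names.
[~, stems] = cellfun(@fileparts, files, 'UniformOutput', false);
[~, order] = sort(stems);
byStem = files(order);

disp(codes)
disp(withExt)
disp(byStem)
```

Once the extension is removed, 'test' is a prefix of 'test-A' and therefore sorts first, which matches the shorter-name-first expectation.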

Maryam

Thank you very much. It is in the same folder. This is why I am very surprised.

Maryam, make sure that sort_nat.m is in a folder that is on your MATLAB path. You can read more about the path with "doc path".
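A quick way to check whether MATLAB can see the file is sketched below, using the built-in which, exist and addpath functions. The folder in the commented-out addpath call is a hypothetical placeholder, not a real location.

```matlab
% Full path of the file MATLAB would run, or empty if it is not found.
location = which('sort_nat');

% exist returns 2 when the name resolves to a file on the path.
onPath = (exist('sort_nat', 'file') == 2);

if onPath
    fprintf('sort_nat found at: %s\n', location);
else
    disp('sort_nat is not on the MATLAB path.');
    % Hypothetical folder; replace with the folder that contains sort_nat.m.
    % addpath('C:\path\to\folder\containing\sort_nat');
end
```

If the file sits in the current folder but is still not found, check that it is actually named sort_nat.m.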

Maryam

Hello. I want to use this program to sort a vector of strings which have numerical values in their names. I am getting this error: Undefined function 'sort_nat' for input arguments of type 'cell'

What should I do?

Sergey,
Well, it's not a bug, but you might disagree with my logic. Remember, we are sorting strings, not numbers. '01' comes after '1' because they have the same numerical value, but '01' is a longer string. Likewise, '000' comes after '00', etc.
It may be possible to include '+-.,' by changing the regular expressions (currently just looking for runs of digits with '\d+').
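The tie-breaking rule described above (equal numeric value, shorter string first) can be illustrated with sortrows. This is a sketch of the stated rule only, assuming purely numeric strings; it is not the code inside sort_nat.

```matlab
strs = {'01','1','000','00'};

% Primary key: numeric value. Secondary key: number of characters.
vals = str2double(strs);
lens = cellfun(@numel, strs);

% sortrows orders by the first column, then breaks ties with the second.
[~, order] = sortrows([vals(:), lens(:)]);
sorted = strs(order);

disp(sorted)
```

Here '00' precedes '000' and '1' precedes '01', because within each pair the values are equal and the shorter string wins.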

Serge

Very fast! I can't beat it!

This might be a bug though, not sure?
sort_nat({'01E2' '1E3'})
ans = '1E3' '01E2'

I came up with a version that can treat '+-.,' as part of the number, but it's 10x slower. Not sure if sort_nat can be extended to do the same.

alex

Thank you very much!
I was typing C=[list.name] and not
C={list.name}, which is the right one.
Thank you very much again!
Very useful function!
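The difference between the two bracket styles is easy to see on a small struct array. In this sketch the struct is built by hand as a stand-in for the kind of listing that dir returns; the variable names are illustrative.

```matlab
% Stand-in for a directory listing: a 1x4 struct array with a name field.
list = struct('name', {'1.bmp','10.bmp','11.bmp','2.bmp'});

% Square brackets concatenate all names into a single char vector.
joined = [list.name];

% Curly braces collect the names into a cell array, one string per cell.
names = {list.name};

disp(joined)
disp(names)
```

Only the cell array form keeps the names separate, which is why sort_nat needs it as input.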

Alex, your input has to be a cell array:

>> C = {'1.bmp','10.bmp','11.bmp','2.bmp'};
>> sort_nat(C)
ans =
'1.bmp' '2.bmp' '10.bmp' '11.bmp'

alex

alex

I find this very useful, but I can't use it.

C =

1.bmp10.bmp11.bmp2.bmp

>> [S,INDEX] = sort_nat(C)

Cell contents reference from a non-cell array object.

Error in sort_nat (line 62)

num_val(i,z(i,:)) = sscanf(sprintf('%s ',digruns{i}{:}),'%f');

Any help?

Calum

Very useful - worked as described! Thanks.

JiaDa

It works well and helps a lot. So awesome!!!
Thank you

Dirk

Thank you so much, saved me a lot of time! Dirk

Paul

Erwin

Yi Sui

Gregory

Ekaterina

Sorry, I am a beginner in MATLAB and I am not sure what the input arguments of this function mean. There are many input arguments... I have a struct array A (<3000*1> struct) consisting of files with names such as 123_1, 123_2, etc., and I want to sort it according to the file names. How should I use this function? Please help me. Best wishes, Katya

thanks a lot!!! great

Harish S

Excellent

Nopparat

Muhammet

Caleb

Just what I needed, thanks!

I have wasted a lot of time trying to properly sort my image files for my image processing project before I surfed on to this code. And mathworks web tutorial on this is totally flawed. Great work! Certainly deserves a 5 star.

Excellent work. Saved me much time and headache with some data analysis over here at NASA. Thank you!

Ben

This is great! Better to include it in the Matlab!

Johanna

Jun

Many thanks, saved me the time and effort of writing my own code to crack this problem.

Carsten

Great, works straight off in Octave too.

Evgeny Pr

Douglas Schwarz,
You offered the perfect solution for case-insensitive sorting.
Thank you!

Evgeny, you must be the first person to use descending order and the index. You are quite right, it's a bug and I thank you for identifying it. I fixed it in a slightly different way and just uploaded the new version.

To do case-insensitive sorting you can just do this:
c = {'a1', 'a2', 'a10', 'b1', 'X1', 'A10'};
[unused,index] = sort_nat(lower(c));
cs = c(index);

I want sort_nat to work the same way as sort and since sort doesn't do case-insensitive sorting I left that out of sort_nat as well.

Regards,
Doug

Evgeny Pr

This code is not entirely correct:
if strcmpi(mode,'ascend')
    cs = c(index);
else
    cs = c(index(end:-1:1));
end

Output argument INDEX does not change depending on the sort order.

It would be better to do this:
if strcmpi(mode, 'descend')
    index = flipud(index);
end
cs = c(index);
index = reshape(index, size(cs)); % same as C array dimension

Evgeny Pr

Hi! Very good function and coding!
I could not write such an elegant and fast function. :)

One question:
Would you consider adding a case-insensitive sort mode?

For Example:
c = {'a1', 'a2', 'a10', 'b1', 'X1', 'A10'}

Case-sensitive sort:
cs = {'A10', 'X1', 'a1', 'a2', 'a10', 'b1'}

Case-insensitive sort:
cs = {'a1', 'a2', 'a10', 'A10', 'b1', 'X1'}

Oscar

Pete

Excellent. 5 stars.

I think I ran into the need for a descending sort because I was looking to search through various files with timestamps in their name for the most recent entry (of something-or-other).

Anyway, good work.

Pete, I'm not sure sorting in descending order will be used very often, but, as you say, it's easy to add so why not? I chose to do it in a slightly different way, but it works the way you want. Thanks for the suggestion. Update should appear soon.

Pete

Good work.

Is it just me though or would this not benefit from being able to specify the direction? This would be as simple as:

function [cs,index] = sort_nat(c,varargin)

... <entire code>

if nargin>1
    if strcmpi(varargin{1},'descend')
        index = flipud(index);
        cs = c(index);
    end
end

Adam Baker

I wish I'd found this sooner!

Greg Fichter

Nifty, thanks.

Nikola Toljic

Thanks

Gang Xu

Nice! Thanks

Hans van Dijk

Great, just what I needed.

F Moisy

Perfect. Should be included in R2007a!

Mike Palumbo

Great, much better than another file on the exchange, "sortn". sortn froze when trying to sort over 6000 strings; this sort did it just fine in about a second.

Updates

1.3.0.0

Fixed bug identified by Evgeny Pr. (Thanks!)

1.2.0.0

Added ability to sort in descending order.

1.1.0.0

Steve Herman identified an obscure bug (sorting a cell array of one string which has no numeric characters) which has now been fixed. Thank you Steve!

1.0.0.0

Fixed ambiguity of sort order in certain cases e.g., {'a0','a00'}. Increased speed. Relaxed MATLAB version requirements -- no longer requires R2006a, should work with much older versions now.

MATLAB Release Compatibility
Created with R14SP2
Compatible with any release
Platform Compatibility
Windows macOS Linux