# NATSORTFILES Examples

The function NATSORTFILES sorts a cell array of filenames or filepaths, taking into account any number values within the strings. This is known as a "natural order sort" or an "alphanumeric sort". Note that MATLAB's built-in SORT function sorts by character code only (as does sort in most programming languages).

NATSORTFILES is not a naive natural-order sort, but sorts the filenames and file extensions separately: this prevents the file extension separator character . and file extension itself from influencing the sort order of the complete filename+extension. Thus NATSORTFILES sorts shorter filenames before longer ones, which is known as a "dictionary sort". For the same reason filepaths are split at each path-separator character, and each directory level is sorted separately. See the "Explanation" sections below for more details.

For sorting the rows of a cell array of strings use NATSORTROWS.

For sorting a cell array of strings use NATSORT.

## Basic Usage:

By default NATSORTFILES interprets consecutive digits as being part of a single integer, and each number is treated as being as wide as one letter:

```
A = {'a2.txt', 'a10.txt', 'a1.txt'};
sort(A)
natsortfiles(A)
```
```
ans =
    'a1.txt'    'a10.txt'    'a2.txt'
ans =
    'a1.txt'    'a2.txt'    'a10.txt'
```
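The idea behind a natural-order sort can be sketched with built-in functions only. This is a minimal illustration, not the algorithm that NATSORT or NATSORTFILES actually uses, and it assumes that every name shares the same prefix and contains exactly one integer:

```matlab
A = {'a2.txt', 'a10.txt', 'a1.txt'};
tok = regexp(A, '\d+', 'match', 'once'); % first run of digits in each name
num = str2double(tok);                   % numeric values: [2 10 1]
[~,idx] = sort(num);                     % order by value, not by character code
A(idx)                                   % 'a1.txt' 'a2.txt' 'a10.txt'
```

Names with several numbers, mixed prefixes, or directory parts need the full treatment that NATSORTFILES provides.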

## Output 2: Sort Index

The second output argument is a numeric array of the sort indices ndx, such that Y = X(ndx) where Y = natsortfiles(X):

```
[~,ndx] = natsortfiles(A)
```
```
ndx =
     3     1     2
```
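As a quick check of that relationship, the sketch below assumes NATSORTFILES is on the MATLAB path and reuses A from the basic usage example. The vector len is a hypothetical companion array, included only to show that ndx can reorder any data stored in the same order as A:

```matlab
A = {'a2.txt', 'a10.txt', 'a1.txt'};
[Y,ndx] = natsortfiles(A);
isequal(Y, A(ndx))        % true: the index reproduces the sorted output
len = [20, 100, 10];      % hypothetical values, one per element of A
len(ndx)                  % 10 20 100, given ndx = [3 1 2]
```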

## Example with DIR and a Cell Array

One common situation is using DIR to identify files in a folder, sort them into the correct order, and then loop over them: below is an example of how to do this. Remember to preallocate all output arrays before the loop!

```
D = 'natsortfiles_test'; % directory path
S = dir(fullfile(D,'*.txt')); % get list of files in directory
N = natsortfiles({S.name}); % sort file names into order
for k = 1:numel(N)
    fullfile(D,N{k})
end
```
```
ans =
natsortfiles_test\A_1.txt
ans =
natsortfiles_test\A_1-new.txt
ans =
natsortfiles_test\A_1_new.txt
ans =
natsortfiles_test\A_2.txt
ans =
natsortfiles_test\A_3.txt
ans =
natsortfiles_test\A_10.txt
ans =
natsortfiles_test\A_100.txt
ans =
natsortfiles_test\A_200.txt
```
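The preallocation advice might look like the following sketch. It assumes NATSORTFILES is on the MATLAB path, that the folder exists, and that each file is to be read whole as text with FILEREAD. The output name C is chosen only for illustration:

```matlab
D = 'natsortfiles_test'; % directory path
S = dir(fullfile(D,'*.txt')); % get list of files in directory
N = natsortfiles({S.name}); % sort file names into order
C = cell(1,numel(N)); % preallocate the output cell array
for k = 1:numel(N)
    C{k} = fileread(fullfile(D,N{k})); % file content as a character vector
end
```

For numeric results the same pattern applies with ZEROS or NAN instead of CELL.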

## Example with DIR and a Structure

Users who need to access the DIR structure fields can use the second output of NATSORTFILES to sort DIR's output structure into the correct order:

```
D = 'natsortfiles_test'; % directory path
S = dir(fullfile(D,'*.txt')); % get list of files in directory
[~,ndx] = natsortfiles({S.name}); % indices of correct order
S = S(ndx); % sort structure using indices
for k = 1:numel(S)
    S(k).name;
    S(k).date;
end
```

## Explanation: Dictionary Sort

Filenames and file extensions are separated by the extension separator, the period character ., which has character code 46 and therefore gets sorted after all of the characters with codes 0 to 45, including !"#$%&'()*+,-, the space character, and all of the control characters (newlines, tabs, etc.). This means that a naive sort or natural-order sort will sort some short filenames after longer filenames. In order to provide the correct dictionary sort, with shorter filenames first, NATSORTFILES sorts the filenames and file extensions separately:

```
B = {'test_ccc.m'; 'test-aaa.m'; 'test.m'; 'test.bbb.m'};
sort(B) % '-' sorts before '.'
natsort(B) % '-' sorts before '.'
natsortfiles(B) % correct dictionary sort
```
```
ans =
    'test-aaa.m'
    'test.bbb.m'
    'test.m'
    'test_ccc.m'
ans =
    'test-aaa.m'
    'test.bbb.m'
    'test.m'
    'test_ccc.m'
ans =
    'test.m'
    'test-aaa.m'
    'test.bbb.m'
    'test_ccc.m'
```
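The effect of separating the name from the extension can be reproduced with built-in functions. The sketch below is an illustration of the principle only, not the NATSORTFILES implementation: it uses FILEPARTS to strip the extension and ignores numbers and ties between identical names:

```matlab
B = {'test_ccc.m'; 'test-aaa.m'; 'test.m'; 'test.bbb.m'};
double('-._')                          % character codes: 45 46 95
[~,name] = cellfun(@fileparts, B, 'UniformOutput',false); % names without extension
[~,idx] = sort(name);                  % 'test' is a prefix of the others, so it sorts first
B(idx)                                 % 'test.m' 'test-aaa.m' 'test.bbb.m' 'test_ccc.m'
```

Once the extension is removed, the shorter name test no longer competes against the period character, which is why it moves to the front.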

## Explanation: Filenames

NATSORTFILES combines a dictionary sort with a natural-order sort, so that the number values within the filenames are taken into consideration:

```
C = {'test2.m'; 'test10-old.m'; 'test.m'; 'test10.m'; 'test1.m'};
sort(C) % Wrong numeric order.
natsort(C) % Correct numeric order, but longer before shorter.
natsortfiles(C) % Correct numeric order and dictionary sort.
```
```
ans =
    'test.m'
    'test1.m'
    'test10-old.m'
    'test10.m'
    'test2.m'
ans =
    'test.m'
    'test1.m'
    'test2.m'
    'test10-old.m'
    'test10.m'
ans =
    'test.m'
    'test1.m'
    'test2.m'
    'test10.m'
    'test10-old.m'
```

## Explanation: Filepaths

For the same reason, filepaths are split at each file path separator character (both / and \ are considered to be file path separators) and each level of directory names is sorted separately. This ensures that the directory names are sorted with a dictionary sort and that any numbers are taken into consideration:

```
D = {'A2-old\test.m';'A10\test.m';'A2\test.m';'AXarchive.zip';'A1\test.m'};
sort(D) % Wrong numeric order, and '-' sorts before '\':
natsort(D) % correct numeric order, but longer before shorter.
natsortfiles(D) % correct numeric order and dictionary sort.
```
```
ans =
    'A10\test.m'
    'A1\test.m'
    'A2-old\test.m'
    'A2\test.m'
    'AXarchive.zip'
ans =
    'A1\test.m'
    'A2-old\test.m'
    'A2\test.m'
    'A10\test.m'
    'AXarchive.zip'
ans =
    'AXarchive.zip'
    'A1\test.m'
    'A2\test.m'
    'A2-old\test.m'
    'A10\test.m'
```
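Why the split matters can be seen with built-in functions alone. The sketch below is an illustration rather than the NATSORTFILES implementation: it splits two of the paths at either separator using REGEXP and compares only the first directory level. The names P, parts and top are chosen for this example:

```matlab
P = {'A2-old\test.m'; 'A2\test.m'};
sort(P)                               % 'A2-old\test.m' first, since '-' (45) < '\' (92)
parts = regexp(P, '[/\\]', 'split');  % split at either path separator
top = cellfun(@(c) c{1}, parts, 'UniformOutput',false); % first directory level
sort(top)                             % 'A2' first, since it is the shorter name
```

With the separator removed from the comparison, the directory A2 sorts before A2-old, which matches the dictionary order described above.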

## Regular Expression: Decimal Numbers, E-notation, +/- Sign.

NATSORTFILES is a wrapper for NATSORT, which means all of NATSORT's options are also supported. In particular the number recognition can be customized to detect numbers with decimal digits, E-notation, a +/- sign, or other specific features. This detection is defined by providing an appropriate regular expression: see NATSORT for details and examples.

```
E = {'test24.csv','test1.8.csv','test5.csv','test3.3.csv','test12.csv'};
natsortfiles(E,'\d+(\.\d+)?')
```
```
ans =
    'test1.8.csv'    'test3.3.csv'    'test5.csv'    'test12.csv'    'test24.csv'
```
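To see what this regular expression detects, it can be tried directly with the built-in REGEXP function. The sketch below only illustrates the pattern matching. It assumes one number per filename and a common prefix, and it is not how NATSORT performs the sort:

```matlab
E = {'test24.csv','test1.8.csv','test5.csv','test3.3.csv','test12.csv'};
regexp('test1.8.csv', '\d+', 'match')          % {'1','8'}: two separate integers
regexp('test1.8.csv', '\d+(\.\d+)?', 'match')  % {'1.8'}: one decimal number
num = str2double(regexp(E, '\d+(\.\d+)?', 'match', 'once')); % [24 1.8 5 3.3 12]
[~,idx] = sort(num);                           % order by the detected values
E(idx)                                         % same order as the output above
```

The optional group (\.\d+)? is what lets the fractional digits join the preceding integer, while the .csv extension is left alone because no digit follows its period.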