MATLAB News & Notes - December 2004
Programming Patterns
An Adventure of Sorts–Behind the Scenes of a MATLAB Upgrade
by Bill McKeeman and Loren Shure
While choosing how to upgrade sort for
MATLAB 7, we used MATLAB to prototype sort
algorithms and to add the new sort direction
qualifiers 'ascend' and 'descend'. We translated
the prototype from MATLAB into C++ to add
this latest version of sort to Release 14—and
found a few surprises in the process.
Algorithms in MathWorks products and user applications use the
MATLAB sort function for tasks ranging from alphabetizing lists to
choosing significant singular values.
([u]int8/16/32/64); plus single, double, char, and logical.
Single and double may be real or complex; double and logical can be
sparse. Full n-dimensional (ND) arrays can be sorted along any
selected dimension, in either ascending or descending order. sort
can also return the permutation vector relating the input values to the
outputs, with call syntax ranging from the most simple
r = sort(v);
to the most complete
[r,i] = sort(v, dim, dirflag);
The most readily available upgrade would use qsort from the C library,
but qsort was too specific for the sort tasks that MATLAB requires.
The C library qsort has a hard-coded stride of 1 (data is assumed
to be arranged in contiguous memory). MATLAB also needs to sort
data containing NaN values, which qsort does not accommodate.
The available C qsort implementations share a common characteristic:
to achieve type independence, data access is indirect (via void* pointers)
and the user supplies a comparison function. As a result, general-purpose
sort algorithms written in C are inelegant and somewhat slow.
Selecting an Algorithm
Sorting experts recommend a three-level algorithm, insert/quick/bin, where each level calls the one to its left if the size of the array being sorted passes below some experimentally discovered cutoff. When we updatedsort, we expected to require
four levels (insert/quick/bin/quick). To achieve speed proportional
to the size of the data (order N) bin sort requires as much auxiliary
storage for the bins as it does for the input data. Because quicksort
manipulates the data in place, it can sort larger arrays, albeit somewhat
more slowly. (Hoare’s 1962 quick sort algorithm has undergone
a generation of changes, particularly to improve performance
when the data is not random.)Resources
Bentley, J. L. and M. D. McIlroy, 1993. Engineering a sort function. Software—Practice and Experience 23, 11, 1249--1265.
Knuth, D. E. 1998. The Art of Computer Programming, Volume 3: Sorting and Searching, 2nd ed. Boston: Addison-Wesley.
Cormen et al. 2001. Introduction to Algorithms, 2nd ed. Columbus: McGraw-Hill.
Prototyping Alternative sort Algorithms
We explored speed and memory tradeoffs by prototyping the basic algorithms in M. The prototypes work only on vectors. (Full ND arrays are discussed in Extending Vector Sorts to N-Dimensions later in this article.)
function v = viSort(v)
for i=1:numel(v)-1
[m, k] = min(v(i:end));
v(i+1:i+k-1) = v(i:i+k-2)
v(i) = m;
end
% vector insert
% make room
% place min
function v = vqSort(v)
if numel(v) > 1
piv = median(v);
lss = v(v < piv);
eql = v(v == piv);
gtr = v(v > piv);
v = [vqSort(lss), eql, vqSort(gtr)];
end
% vector quick sort
% divide
% and
% conquer
function v = vbSort(v)
n = numel(v);
if n > 1
mn = min(v);
mx = max(v);
binAvg = 200;
binCt = ceil(n/binAvg);
slope = binCt/(mx-mn);
binNo = floor(slope.*(v-mn))+1;
bins = cell(1, binCt+1);
for i=1:binCt+1
bins{i} = vqSort(v(binNo==i));
end
v = [bins{:}];
end
% vector bin sort
% experimental
% assign bin
% allocate bins
% reassemble
As we planned to have vqSort call viSort when there were few
enough data points, we modified the quick sort design.
function v = vqSort(v)
if numel(v) < 32
v = viSort(v);
else
piv = median(v);
lss = v(v < piv);
eql = v(v == piv);
gtr = v(v > piv);
v = [vqSort(lss), eql, vqSort(gtr)];
end
% quick sort
% experimental
% need approximation
We translated the M-code vector sorts into C in-place sorts so that
we could test our four-level design hypothesis. The results surprised
us. For vectors of random data on Intel x86 hardware, the
most modern quick sorts were slower than the already-implemented
quick sort in MATLAB! Algorithmic changes made to the modern
sorts to ensure good behavior on less-than-random input were
costly, while the existing MATLAB sort cleverly uses its pivot-picking
approximation to do some sorting work.
We finally decided that best performance on random data was more useful than adequate performance on all data, and kept the existing MATLAB quick sort. (Performance results depend somewhat on the underlying hardware.)
We took into account that the C implementation of bin sort needs
three passes: one pass to decide on the number of bins, one to
bin the data, and one to call qsort on each bin. The results are
left in place, conveniently ordered for output. We can justify this
considerable overhead because it results in an order N, instead of
NlogN, run-time algorithm.
Another surprise! We experimented to find a cutover from quick sort to bin sort, but never found it. Our best effort at a C-implemented bin sort was slower than our C-implemented quick sort, even on datasets containing up to 10 million elements. As a result, we abbreviated our development plan to a two-stage sort: insert/quick. Both these algorithms already have satisfactory implementations in MATLAB.
Extending Double Sorts to New Types: Plus NaN and Inf
Next, we extended the double sorts to the sortable MATLAB types.
We recoded the insert/quick sort pair as C++ template functions
parameterized by the underlying MATLAB types.
MATLAB sorts NaN values high, beyond all other single/double values.
Because NaN and Inf need to be dealt with only for the real types, we
implemented a function (called a jacket) to handle the cases for single
and double. The jacket calls the usual sort routine on the finite values
and handles the nonfinite values correctly for MATLAB. This jacket’s
sorting prepass counts the NaNs and replaces them with +Inf. After
the sort, the jacket overwrites an appropriate number of +Inf values
with NaNs at the top end of the sorted vector. (Had we still been doing
bin sort, this jacket would also have had to count and remove Inf
values, as the bin selection algorithm depends on finite values.)
function v = vSort(v)
if isfloat(v)
nans = isnan(v);
v(nans) = inf;
v = vqSort(v);
v(end-sum(nans)+1:end) = nan;
else
v = vqSort(v);
end
% sort NaN jacket
Extending Vector Sorts to N Dimensions
We extended the vector sorts to sort along a user-selected dimension of an ND array. Array data is laid out in memory in column-major order: elements in a column are adjacent in memory. Elements along another dimension are separated by a constant distance (the so-called stride, usually larger than 1). We prototyped in M first
function v = strideSort(v, d)
s = size(v);
stride = prod(s(1:d-1));
for i = 1 : stride*s(d) : numel(v)
for j = i : i+stride-1
idx = j : stride : j+stride*s(d)-1;
v(idx) = vSort(v(idx));
end
end
% stride jacket
% #of elements
We recoded the C++ in-place insert/quick sort pair to step by a stride, and then wrote a second level of C++ jacket to call them.
Extending Array Sorts to Permutation Vectors
We extended the array sorts to provide the permutation vectors,
which required that the sort be stable, (i.e., sorting may not reorder
equal values). The algorithms viSort, vqSort, and vbSort are
stable, but our C++ implementations were not. Without requesting
the permutation vector, you cannot tell whether the sort is unstable.
Rather than recode the insert/quick sorts, we chose to make a post-sort
pass over the sorted data, detect runs of equal values, and then
sort the corresponding subvector in the parallel index vector. Any
NaNs presented a special problem because they had been interspersed
with the Inf values. We found the original indices of the
NaNs and Infs by making a pass over the original unsorted data.
Extending Array Sorts to Complex Values
The MATLAB algorithm for complex values first sorts by absolute value, then sorts by phase angle, and finally sorts to ensure stability for duplicate values. We decided to leave the MATLAB C code for sorting complex double values in place and call it from the jackets described above.
Implementing Ascend and Descend Flags
We passed the direction flags 'ascend' and 'descend' to the
jackets. If the flag is 'descend', the output is reversed just before
being returned. After the reversal, we ran the index vector sorts to
restore stability.
Testing the Sorts
We tested for the following criteria:- The sorts should accept all valid inputs (12 types of data, shapes from empty arrays to large ND ensembles, and valid dimension and direction flags) and otherwise fail with an error message.
- If requested, the sorts should produce the permutation vector.
- The sorts should “run as fast as C.” Suppose T means “type” and S means “shape.” Then we guarantee the following.
| Statement | Guarantees |
|---|---|
r = sort(v, dim, flags) |
T(r)=T(v) ^ S(r)=S(v) |
[r,i] = sort(v, dim, flags) |
T(r)=T(v) ^ T(i)=double ^ S(r)=S(i)=S(v) |
We checked that in each case the outputs were sorted and the permutation
vectors applied to the inputs, given the outputs. We
coded all the test routines in M, using the MATLAB functions
class and size.
To test performance, we raced our algorithm against the C standard
library qsort and the C++ Standard Template Library sort for
simple cases. The MATLAB sort function is about five times faster
than C qsort and a little faster than the C++ STL sort.
MATLAB provided a productive environment to explore enhancements
to the sort algorithm. Prototyping the possibilities in
MATLAB before recoding the algorithms in C++ helped us quickly
understand our programming choices. We can still use the prototype
M versions of sort as test components when we verify the
MATLAB sort function’s behavior. Using MATLAB to upgrade
MATLAB enabled us to deliver a higher performance sort, and
we can use M-code to explain how we did it.
find
Similar to the MATLAB function sort with options 'ascend'
and 'descend’, the function find has added functionality in
MATLAB 7 that lets you find the 'first' or 'last' elements that
meet some criterion. For example, if you want to find the first
element in array X less than 42, usefind(X<42,1,'first')This operation stops as soon as the first element meeting the condition is encountered, which can result in significant time savings if the array is large and the value you find happens to be far from the end.
If, instead, you want to find the final 10 elements > 17, then use
find(X>17,10,'last')Again, the operation returns as soon as 10 values are found. If fewer than the requested number is found after the entire array is scanned, the return vector will have fewer than the requested number of elements.
Before
sort had the option to return items in descending order,
you had to perform the sort and then index from the end. With
MATLAB 7, sort has the new 'descend' option, and find has
the new 'last' option to combine the find with the indexing
into a single function call.
Store
