nth_element

Version 1.9.0.0 (10.4 KB) by Peter Li
MEX wrap of C++ nth_element. Plus fast_median, a faster median function. In-place and parallel.
2.3K Downloads
Updated 15 Nov 2013

View License

DO NOT USE 0.85 with OpenMP support; it has bugs; advise upgrade to 0.86.

C++ std::nth_element is an efficient algorithm for selecting a ranked element from a vector of data. Typically it is implemented as a variant of quickselect, AKA Hoare's Selection Algorithm. The mex-file in this package will run nth_element over a 2D array column-wise. See C++ documentation and http://en.wikipedia.org/wiki/Selection_algorithm for more information.

I have added (as of v0.84) the ability to operate on data in-place. This potentially saves an array copy so can be significantly more efficient. I see about another 2x speedup in my tests on some machines (seems largely dependent on chipset/memory bandwidth). Because it uses undocumented mex calls, this may break on future versions of Matlab. Please give it a try and send feedback.

As of version 0.86, you can also get back the indices for the partitioning by requesting a second return value. This is slower than running just the data partition though. Only for nth_element for the moment. Should be easy to implement for fast_median when I find time / get more requests.

I have also added (as of v0.86) OpenMP pragmas to support operating on multiple columns in parallel. See nth_element.m or below for instructions on compiling the mex files with OpenMP support.

One example calculation based on nth_element is also included, a mex-file for fast_median. In my benchmarks, fast_median is roughly twice as fast as MatLab's native median function. MatLab's median relies on sort, but sorting the entire input data to get the median is inefficient. Theoretical average complexity of fast_median is O(n), compared to best case complexity of O(n log n) for a full sort based approach.

Median calculations are particularly important in robust statistics, for example the median absolute deviation (MAD).

To install, unpack the zip, go to the directory from MatLab, and run:
> mex nth_element.cpp
> mex fast_median.cpp

To get parallel processing of independent columns, you need to compile with OpenMP support. See your compiler instructions. With GCC, e.g., you would do:
> mex nth_element.cpp CXXFLAGS="\$CXXFLAGS -fopenmp" LDFLAGS="\$LDFLAGS -fopenmp"

Then put the resulting binaries on your MatLab path.

Cite As

Peter Li (2024). nth_element (https://www.mathworks.com/matlabcentral/fileexchange/29453-nth_element), MATLAB Central File Exchange. Retrieved .

MATLAB Release Compatibility
Created with R2010a
Compatible with any release
Platform Compatibility
Windows macOS Linux
Categories
Find more on Large Files and Big Data in Help Center and MATLAB Answers

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!
Version Published Release Notes
1.9.0.0

Bugfix for 0xN empty inputs.

1.8.0.0

Bug fixes

1.7.0.0

Corrected OpenMP compile help in nth_element.m and added similar info in description here.

1.5.0.0

Added OpenMP pragmas to support operating on multiple columns in parallel. Tried to clean up includes without complicating mex build process.

1.3.0.0

Added in-place editing.

1.2.0.0

Fixed void pointer return to void so that fancier C++ compilers will not complain about missing return type.

1.1.0.0

Just updating licenses to match FreeBSD required by MatLab central. Also added a little documentation to source files.

1.0.0.0