File Exchange

image thumbnail

Theil-Sen Robust Linear Regression

version 1.2 (2.53 KB) by

Quickly perform linear regression that is robust to outliers.

15 Downloads

Updated

View License

This function executes a fast version of the non-parametric Theil-Sen robust linear regression algorithm by finding the median slope between all pairwise combinations of points in a given data set.
For my application I needed to run a robust regression on large data sets (many thousands of points), but the implementations I found on the File Exchange were far too slow (see figure). This code is substantially faster, and for large data sets can be two orders of magnitude faster than those currently available.

Comments and Ratings (6)

T C

T C (view profile)

I believe the 'nanmedian' function requires a toolbox?

You can make this function toolbox free if you use the following code

b = median(D(~isnan(D)));

Replace D with whatever the parameter of nanmedium was previously.

Great script btw!

Zachary Danziger

@Ian: Absolutely right, I have updated the submission with your suggestion.

Ian Craig

Fantastic script, but you can squeeze even more speed out of this. When you calculate the slopes, you are building a symmetric matrix and calculating each slope twice, when you only need to do it once. Change the calculation of C to:

C(i,i:end) = (data(i,2)-data(i:end,2))./(data(i,1) - data(i:end,1));

(That still calculates unnecessary 'NaN' values for the diagonal, but the indexing is easy.)

Using timeit to time both functions with various values of n up to 8192, I get an almost 2x speedup (not surprising because I'm only doing half the calculations). For 8192 points on my machine, the time goes from 11.278s to 6.3985s.

Zachary Danziger

@Dillon: You are absolutely right, the legend entries are swapped.

Dillon Amaya

There is a small error in the example code you have in the documentation. I believe you mislabeled the Least Squares line and the the Theil-Sen line. LS should be blue and TS should be red.

Levi Golston

This is the best of the three Theil-Sen scripts. Can return both m and b, faster by 30x, and handles multi-dimensional input.

Updates

1.2

- updated help
- speed increase for 2D case

1.2

Updated code to function with higher dimensional data sets.
Included example in documentation.

1.1

Updated units on associated figure.

MATLAB Release
MATLAB 8.5 (R2015a)
Acknowledgements

Inspired by: Theil–Sen estimator

Download apps, toolboxes, and other File Exchange content using Add-On Explorer in MATLAB.

» Watch video