| Statistics Toolbox™ | ![]() |
b = robustfit(X,y)
b = robustfit(X,y,wfun,tune)
b = robustfit(X,y,wfun,tune,const)
[b,stats] = robustfit(...)
b = robustfit(X,y) returns a p-by-1 vector b of coefficient estimates for a robust multilinear regression of the responses in y on the predictors in X. X is an n-by-p matrix of p predictors at each of n observations. y is an n-by-1 vector of observed responses. By default, the algorithm uses iteratively reweighted least squares with a bisquare weighting function.
Note By default, robustfit adds a first column of 1s to X, corresponding to a constant term in the model. Do not enter a column of 1s directly into X. You can change the default behavior of robustfit using the input const, below. |
robustfit treats NaNs in X or y as missing values, and removes them.
b = robustfit(X,y,wfun,tune) specifies a weighting function wfun. tune is a tuning constant that is divided into the residual vector before computing weights.
The weighting function wfun can be any one of the following strings:
| Weight Function | Equation | Default Tuning Constant |
|---|---|---|
| 'andrews' | w = (abs(r)<pi) .* sin(r) ./ r | 1.339 |
| 'bisquare' (default) | w = (abs(r)<1) .* (1 - r.^2).^2 | 4.685 |
| 'cauchy' | w = 1 ./ (1 + r.^2) | 2.385 |
| 'fair' | w = 1 ./ (1 + abs(r)) | 1.400 |
| 'huber' | w = 1 ./ max(1, abs(r)) | 1.345 |
| 'logistic' | w = tanh(r) ./ r | 1.205 |
| 'ols' | Ordinary least squares (no weighting function) | None |
| 'talwar' | w = 1 * (abs(r)<1) | 2.795 |
| 'welsch' | w = exp(-(r.^2)) | 2.985 |
If tune is unspecified, the default value in the table is used. Default tuning constants give coefficient estimates that are approximately 95% as statistically efficient as the ordinary least-squares estimates, provided the response has a normal distribution with no outliers. Decreasing the tuning constant increases the downweight assigned to large residuals; increasing the tuning constant decreases the downweight assigned to large residuals.
The value r in the weight functions is
r = resid/(tune*s*sqrt(1-h))
where resid is the vector of residuals from the previous iteration, h is the vector of leverage values from a least-squares fit, and s is an estimate of the standard deviation of the error term given by
s = MAD/0.6745
Here MAD is the median absolute deviation of the residuals from their median. The constant 0.6745 makes the estimate unbiased for the normal distribution. If there are p columns in X, the smallest p absolute deviations are excluded when computing the median.
You can write your own M-file weight function. The function must take a vector of scaled residuals as input and produce a vector of weights as output. In this case, wfun is specified using a function handle @ (as in @myfun), and the input tune is required.
b = robustfit(X,y,wfun,tune,const) controls whether or not the model will include a constant term. const is 'on' to include the constant term (the default), or 'off' to omit it. When const is 'on', robustfit adds a first column of 1s to X. When const is 'off', robustfit does not alter X.
[b,stats] = robustfit(...) returns the structure stats, whose fields contain diagnostic statistics from the regression. The fields of stats are:
ols_s — Sigma estimate (RMSE) from ordinary least squares
robust_s — Robust estimate of sigma
mad_s — Estimate of sigma computed using the median absolute deviation of the residuals from their median; used for scaling residuals during iterative fitting
s — Final estimate of sigma, the larger of robust_s and a weighted average of ols_s and robust_s
se — Standard error of coefficient estimates
t — Ratio of b to se
p — p-values for t
covb — Estimated covariance matrix for coefficient estimates
coeffcorr — Estimated correlation of coefficient estimates
w — Vector of weights for robust fit
h — Vector of leverage values for least-squares fit
dfe — Degrees of freedom for error
R — R factor in QR decomposition of X
The robustfit function estimates the variance-covariance matrix of the coefficient estimates using inv(X'*X)*stats.s^2. Standard errors and correlations are derived from this estimate.
Generate data with the trend y = 10-2*x, then change one value to simulate an outlier:
x = (1:10)'; y = 10 - 2*x + randn(10,1); y(10) = 0;
Use both ordinary least squares and robust regression to estimate a straight-line fit:
bls = regress(y,[ones(10,1) x]) bls = 7.2481 -1.3208 brob = robustfit(x,y) brob = 9.1063 -1.8231
A scatter plot of the data together with the fits shows that the robust fit is less influenced by the outlier than the least-squares fit:
scatter(x,y,'filled'); grid on; hold on
plot(x,bls(1)+bls(2)*x,'r','LineWidth',2);
plot(x,brob(1)+brob(2)*x,'g','LineWidth',2)
legend('Data','Ordinary Least Squares','Robust Regression')

[1] DuMouchel, W. H., F. L. O'Brien, "Integrating a Robust Option into a Multiple Regression Computing Environment," Computer Science and Statistics: Proceedings of the 21st Symposium on the Interface, Alexandria, VA, American Statistical Association, 1989.
[2] Holland, P. W., R. E. Welsch, "Robust Regression Using Iteratively Reweighted Least-Squares," Communications in Statistics: Theory and Methods, A6, 1977, pp. 813-827.
[3] Huber, P. J., Robust Statistics, Wiley, 1981.
[4] Street, J. O., R. J. Carroll, D. Ruppert, "A Note on Computing Robust Regression Estimates via Iteratively Reweighted Least Squares," The American Statistician, 42, 1988, pp. 152-154.
![]() | robustdemo | rotatefactors | ![]() |
| © 1984-2008- The MathWorks, Inc. - Site Help - Patents - Trademarks - Privacy Policy - Preventing Piracy - RSS |