Lasso is a regularization technique. Use lasso to:

- Reduce the number of predictors in a regression model.
- Identify important predictors.
- Select among redundant predictors.
- Produce shrinkage estimates with potentially lower predictive errors than ordinary least squares.
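For example, the following minimal sketch selects predictors by cross-validating a lasso fit. The synthetic data matrix X, response y, and coefficient values are illustrative assumptions, not part of this documentation.

    rng('default')                          % reproducible synthetic data
    X = randn(100,5);                       % 100 observations, 5 predictors
    y = X*[2; 0; -3; 0; 0] + randn(100,1);  % only predictors 1 and 3 matter

    [B,FitInfo] = lasso(X,y,'CV',10);       % 10-fold cross-validated lasso
    idx  = FitInfo.IndexMinMSE;             % Lambda index with minimal CV error
    coef = B(:,idx)                         % zeros mark eliminated predictors

Each column of B holds the coefficients for one value of Lambda; the zero entries in the selected column identify the predictors that lasso removes.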
Elastic net is a related technique. Use elastic net when you have several highly correlated variables. lasso provides elastic net regularization when you set the Alpha name-value pair to a number strictly between 0 and 1.
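Here is a minimal sketch of such a fit, reusing the X and y from the previous sketch; the Alpha value of 0.5, which weights the L1 and L2 penalties equally, is an arbitrary illustrative choice.

    [B,FitInfo] = lasso(X,y,'Alpha',0.5,'CV',10);  % elastic net, 10-fold CV
    lassoPlot(B,FitInfo,'PlotType','CV');          % cross-validated MSE vs. Lambda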
See Lasso and Elastic Net Details.
For lasso regularization of regression ensembles, see regularize.
Lasso is a regularization technique for performing linear regression. Lasso includes a penalty term that constrains the size of the estimated coefficients; in this respect it resembles ridge regression. Lasso is a shrinkage estimator: it generates coefficient estimates that are biased to be small. Nevertheless, a lasso estimator can have smaller mean squared error than an ordinary least-squares estimator when you apply it to new data.
Unlike ridge regression, lasso sets more coefficients to exactly zero as the penalty term increases, so the lasso estimate corresponds to a smaller model with fewer predictors. As such, lasso is an alternative to stepwise regression and other model selection and dimensionality reduction techniques.
Elastic net is a related technique. Elastic net is a hybrid of ridge regression and lasso regularization. Like lasso, elastic net can generate reduced models by generating zero-valued coefficients. Empirical studies have suggested that the elastic net technique can outperform lasso on data with highly correlated predictors.
The lasso technique solves this regularization problem. For a given value of λ, a nonnegative parameter, lasso solves the problem

$$\min_{\beta_0,\,\beta}\left(\frac{1}{2N}\sum_{i=1}^{N}\left(y_i-\beta_0-x_i^{T}\beta\right)^{2}+\lambda\sum_{j=1}^{p}\left|\beta_j\right|\right),$$

where:

- N is the number of observations.
- y_i is the response at observation i.
- x_i is the data, a vector of p values at observation i.
- λ is a positive regularization parameter corresponding to one value of Lambda.
- The parameters β_0 and β are a scalar and a p-vector, respectively.
As λ increases, the number of nonzero components of β decreases.
The lasso problem involves the L1 norm of β, in contrast to the elastic net algorithm, whose penalty mixes the L1 and L2 norms.
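To see the effect of λ numerically, a minimal sketch (again assuming the X and y defined earlier) tabulates the number of nonzero coefficients that lasso reports for each value in its default Lambda sequence:

    [B,FitInfo] = lasso(X,y);        % fit over the default Lambda sequence
    [FitInfo.Lambda' FitInfo.DF']    % larger Lambda pairs with smaller DF

FitInfo.DF counts the nonzero coefficients in each column of B, so it shrinks as Lambda grows.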
The elastic net technique solves this regularization problem. For an α strictly between 0 and 1, and a nonnegative λ, elastic net solves the problem

$$\min_{\beta_0,\,\beta}\left(\frac{1}{2N}\sum_{i=1}^{N}\left(y_i-\beta_0-x_i^{T}\beta\right)^{2}+\lambda P_{\alpha}(\beta)\right),$$

where

$$P_{\alpha}(\beta)=\frac{1-\alpha}{2}\left\|\beta\right\|_{2}^{2}+\alpha\left\|\beta\right\|_{1}=\sum_{j=1}^{p}\left(\frac{1-\alpha}{2}\beta_j^{2}+\alpha\left|\beta_j\right|\right).$$

Elastic net is the same as lasso when α = 1. As α shrinks toward 0, elastic net approaches ridge regression. For other values of α, the penalty term P_α(β) interpolates between the L1 norm of β and the squared L2 norm of β.
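A minimal sketch of these limiting cases, assuming the earlier X and y (the Lambda value 0.1 is arbitrary): setting Alpha to 1 reproduces lasso exactly, and a small Alpha approximates ridge shrinkage. lasso requires Alpha strictly greater than 0; for pure ridge regression, use the ridge function.

    Blasso = lasso(X,y,'Lambda',0.1);               % Alpha defaults to 1 (lasso)
    Bnet   = lasso(X,y,'Lambda',0.1,'Alpha',1);     % explicit Alpha = 1
    isequal(Blasso,Bnet)                            % same problem, same estimate
    Bnear  = lasso(X,y,'Lambda',0.1,'Alpha',1e-3);  % near-ridge shrinkage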
[1] Tibshirani, R. "Regression shrinkage and selection via the lasso." Journal of the Royal Statistical Society, Series B, Vol. 58, No. 1, pp. 267–288, 1996.
[2] Zou, H. and T. Hastie. "Regularization and variable selection via the elastic net." Journal of the Royal Statistical Society, Series B, Vol. 67, No. 2, pp. 301–320, 2005.
[3] Friedman, J., R. Tibshirani, and T. Hastie. "Regularization paths for generalized linear models via coordinate descent." Journal of Statistical Software, Vol. 33, No. 1, 2010. https://www.jstatsoft.org/v33/i01
[4] Hastie, T., R. Tibshirani, and J. Friedman. The Elements of Statistical Learning, 2nd edition. Springer, New York, 2008.
See Also: fitrlinear | lasso | lassoglm | lassoPlot | ridge