Lasso or elastic net regularization for generalized linear model regression
B = lassoglm(X,Y)
B = lassoglm(X,Y,distr)
B = lassoglm(X,Y,distr,Name,Value)
[B,FitInfo] = lassoglm(___)
B = lassoglm(X,Y) returns penalized maximum-likelihood fitted coefficients for a generalized linear model of the response Y to the data matrix X. The values in Y are assumed to have a Gaussian probability distribution.

B = lassoglm(X,Y,distr) performs the fit using the probability distribution distr for the values in Y.

X: Numeric matrix with one row per observation and one column per predictor variable.
distr: Distributional family for the nonsystematic variation in the responses. Choices: 'normal', 'binomial', 'poisson', 'gamma', or 'inverse gaussian'.
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
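As an illustration of this syntax (the data and parameter values below are arbitrary choices, not taken from this page):

```matlab
% Illustrative only: synthetic data and arbitrary parameter values.
rng(1)                                  % reproducibility
X = randn(50,10);                       % 50 observations, 10 predictors
y = poissrnd(exp(0.5*X(:,1) + 1));      % Poisson responses from one predictor
% Name-value pairs may appear in any order:
B = lassoglm(X,y,'poisson','Alpha',0.75,'NumLambda',25,'Standardize',true);
```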
'Alpha': Scalar value from 0 to 1 (excluding 0) representing the weight of lasso (L1) optimization versus ridge (L2) optimization. Alpha = 1 is lasso; Alpha near 0 approaches ridge regression. Default: 1.
'DFmax': Maximum number of nonzero coefficients in the model. Default: Inf.
'Lambda': Vector of nonnegative regularization values. Default: geometric sequence of NumLambda values, with the largest value just sufficient to produce all-zero coefficients B.
'LambdaRatio': Positive scalar, the ratio of the smallest to the largest value in the Lambda sequence when you do not supply Lambda. If you set LambdaRatio = 0, lassoglm generates the default Lambda sequence and replaces its smallest value with 0. Default: 1e-4.
'Link': Mapping between the mean μ of the response and the linear predictor Xb.
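For example, a binomial model can be fit with a probit link instead of the default logit; a sketch on synthetic data:

```matlab
% Synthetic illustration of overriding the default link for a distribution.
rng(2)
X = randn(100,5);
y = binornd(1, 1./(1+exp(-X(:,1))));        % 0/1 responses
Bprobit = lassoglm(X,y,'binomial','Link','probit');
```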
'MaxIter': Maximum number of iterations allowed, specified as a positive integer. If the algorithm executes MaxIter iterations before reaching the convergence tolerance, the function stops iterating and returns a warning message.
'MCReps': Positive integer, the number of Monte Carlo repetitions for cross-validation. Default: 1.
'NumLambda': Positive integer, the number of Lambda values to use when you do not supply Lambda. Default: 100.
'Offset': Numeric vector with the same number of rows as X. lassoglm uses Offset as an additional predictor variable whose coefficient is fixed at 1.
'Options': Structure that specifies whether to cross-validate in parallel, and that specifies the random stream or streams. Create the Options structure with statset.
'PredictorNames': Cell array of character vectors representing names of the predictor variables, in the order in which they appear in X. Default: {}.
'RelTol': Convergence threshold for the coordinate descent algorithm (see Friedman, Tibshirani, and Hastie [3]). The algorithm terminates when successive estimates of the coefficient vector differ in the L2 norm by a relative amount less than RelTol. Default: 1e-4.
'Standardize': Boolean value specifying whether lassoglm scales X before fitting the models. Default: true.
'Weights': Observation weights, a nonnegative vector of length n, where n is the number of rows of X. Default: 1/n * ones(n,1).
FitInfo: Structure containing information about the model fits. If you set the 'CV' name-value pair to cross-validate, FitInfo contains additional cross-validation fields.
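For instance, after a cross-validated fit, fields such as Lambda, DF, Intercept, IndexMinDeviance, and Index1SE are available; a sketch on synthetic data:

```matlab
% Synthetic illustration of inspecting FitInfo after a cross-validated fit.
rng(3)
X = randn(100,8);
y = poissrnd(exp(0.3*X(:,2) + 1));
[B,FitInfo] = lassoglm(X,y,'poisson','CV',5);
FitInfo.Lambda            % regularization values used in the fit
FitInfo.DF                % number of nonzero coefficients at each Lambda
FitInfo.Intercept         % fitted intercepts, one per Lambda
FitInfo.IndexMinDeviance  % index of the Lambda with minimal CV deviance
FitInfo.Index1SE          % index of the largest Lambda within one SE of the minimum
```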
Construct data from a Poisson model, and identify the important predictors using
Create data with 20 predictors, and Poisson responses using just three of the predictors plus a constant.
rng default % For reproducibility
X = randn(100,20);
mu = exp(X(:,[5 10 15])*[.4;.2;.3] + 1);
y = poissrnd(mu);
Construct a cross-validated lasso regularization of a Poisson regression model of the data.
[B, FitInfo] = lassoglm(X,y,'poisson','CV',10);
Examine the cross-validation plot to see the effect of the Lambda regularization parameter.
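The plot referred to here can be generated with lassoPlot, using B and FitInfo from the fit above:

```matlab
% Assumes B and FitInfo come from the cross-validated lassoglm fit above.
lassoPlot(B,FitInfo,'PlotType','CV');   % cross-validated deviance vs. Lambda
legend('show')                          % label the marked Lambda values
```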
The green circle and dashed line locate the
Lambda with minimum cross-validation error. The blue circle and dashed line locate the point with minimum cross-validation error plus one standard deviation.
Find the nonzero model coefficients corresponding to the two identified points.
minpts = find(B(:,FitInfo.IndexMinDeviance))

minpts =
     3
     5
     6
    10
    11
    15
    16
min1pts = find(B(:,FitInfo.Index1SE))

min1pts =
     5
    10
    15
The coefficients from the minimum-plus-one standard error point are exactly those coefficients used to create the data.
A link function f(μ) maps a distribution with mean μ to a linear model with data X and coefficient vector b using the formula
f(μ) = Xb.
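For a Poisson model with the default log link, fitted means can be recovered from the coefficients with glmval; a sketch assuming B and FitInfo come from a lassoglm(X,y,'poisson',...) fit such as the example above:

```matlab
% Assumes B, FitInfo, and X from a Poisson lassoglm fit (log link).
idx   = FitInfo.Index1SE;                 % pick one column of B
coef  = [FitInfo.Intercept(idx); B(:,idx)];
muhat = glmval(coef,X,'log');             % inverts the link: mu = exp([ones(n,1) X]*coef)
```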
Find the formulas for the link functions in the 'Link' name-value pair description. The following table lists the link functions that are typically used for each distribution.
| Distributional Family | Default Link Function | Other Typical Link Functions |
| 'normal'              | 'identity'            |                              |
| 'binomial'            | 'logit'               | 'comploglog', 'loglog', 'probit' |
| 'poisson'             | 'log'                 |                              |
| 'gamma'               | -1 (reciprocal)       |                              |
| 'inverse gaussian'    | -2                    |                              |
For a nonnegative value of λ, lassoglm solves the problem

    min over (β0, β) of ( (1/N) Deviance(β0, β) + λ Σ_{j=1}^{p} |β_j| ).
The function Deviance in this equation is the deviance
of the model fit to the responses using intercept β0 and
predictor coefficients β. The formula for
Deviance depends on the distr parameter you supply to lassoglm. Minimizing the λ-penalized deviance is equivalent to maximizing the λ-penalized loglikelihood.
N is the number of observations.
λ is a nonnegative regularization parameter corresponding to one value of Lambda.
Parameters β0 and β are a scalar and a vector of length p, respectively.
As λ increases, the number of nonzero components of β decreases.
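This shrinkage effect can be observed directly in the DF field of FitInfo; a synthetic-data sketch:

```matlab
% Synthetic illustration: sparsity of the fit as a function of lambda.
rng(4)
X = randn(100,10);
y = poissrnd(exp(0.4*X(:,3) + 1));
[B,FitInfo] = lassoglm(X,y,'poisson');
plot(FitInfo.Lambda, FitInfo.DF)     % nonzero-coefficient count falls as lambda grows
xlabel('\lambda'); ylabel('Nonzero coefficients')
```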
The lasso problem involves the L1 norm of β, as contrasted with the elastic net algorithm.
For an α strictly between 0 and 1, and a nonnegative λ, elastic net solves the problem

    min over (β0, β) of ( (1/N) Deviance(β0, β) + λ Pα(β) ),

where

    Pα(β) = (1 − α)/2 ‖β‖₂² + α ‖β‖₁ = Σ_{j=1}^{p} ( (1 − α)/2 β_j² + α |β_j| ).
Elastic net is the same as lasso when α = 1. For other values of α,
the penalty term Pα(β)
interpolates between the L1 norm
of β and the squared L2 norm
of β. As α shrinks toward 0, elastic net approaches ridge regression.
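A sketch comparing a pure lasso fit (Alpha = 1) with an elastic net mix (Alpha = 0.5, an arbitrary choice) on synthetic data:

```matlab
% Synthetic illustration of the Alpha name-value pair.
rng(5)
X = randn(100,10);
y = poissrnd(exp(0.4*X(:,1) + 1));
Blasso = lassoglm(X,y,'poisson','Alpha',1);    % pure lasso (L1) penalty
Bnet   = lassoglm(X,y,'poisson','Alpha',0.5);  % mixed L1/L2 penalty
% Smaller Alpha tends to spread weight across correlated predictors
% instead of selecting a single one.
```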
[1] Tibshirani, R. “Regression Shrinkage and Selection via the Lasso.” Journal of the Royal Statistical Society. Series B, Vol. 58, No. 1, 1996, pp. 267–288.
[2] Zou, H., and T. Hastie. “Regularization and Variable Selection via the Elastic Net.” Journal of the Royal Statistical Society. Series B, Vol. 67, No. 2, 2005, pp. 301–320.
[3] Friedman, J., R. Tibshirani, and T. Hastie. “Regularization Paths for Generalized Linear Models via Coordinate Descent.” Journal of Statistical Software. Vol. 33, No. 1, 2010.
[4] Hastie, T., R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. 2nd edition. New York: Springer, 2008.
[5] Dobson, A. J. An Introduction to Generalized Linear Models. 2nd edition. New York: Chapman & Hall/CRC Press, 2002.
[6] McCullagh, P., and J. A. Nelder. Generalized Linear Models. 2nd edition. New York: Chapman & Hall/CRC Press, 1989.
[7] Collett, D. Modelling Binary Data. 2nd edition. New York: Chapman & Hall/CRC Press, 2003.