
Regression is the process of fitting models to data. The models must have numerical responses. For models with categorical responses, see Parametric Classification or Supervised Learning Workflow and Algorithms. The regression process depends on the model. If a model is parametric, regression estimates the parameters from the data. If a model is linear in the parameters, estimation is based on methods from linear algebra that minimize the norm of a residual vector. If a model is nonlinear in the parameters, estimation is based on search methods from optimization that minimize the norm of a residual vector.

This table describes which function to use depending on the type of regression problem.

Model Components | Result of Regression | Function to Use |
---|---|---|
Continuous or categorical predictors, continuous response, linear model | Fitted model coefficients | `fitlm`. See Linear Regression. |
Continuous or categorical predictors, continuous response, linear model of unknown complexity | Fitted model and fitted coefficients | `stepwiselm`. See Stepwise Regression. |
Continuous or categorical predictors, response possibly with restrictions such as nonnegative or integer-valued, generalized linear model | Fitted generalized linear model coefficients | `fitglm` or `stepwiseglm`. See Generalized Linear Models. |
Continuous predictors with a continuous nonlinear response, parametrized nonlinear model | Fitted nonlinear model coefficients | `fitnlm`. See Nonlinear Regression. |
Continuous predictors, continuous response, linear model | Set of models from ridge, lasso, or elastic net regression | `lasso` or `ridge`. See Lasso and Elastic Net or Ridge Regression. |
Correlated continuous predictors, continuous response, linear model | Fitted model and fitted coefficients | `plsregress`. See Partial Least Squares. |
Continuous or categorical predictors, continuous response, unknown model | Nonparametric model | `fitrtree` or `fitrensemble`. |
Categorical predictors only | ANOVA | `anova`, `anova1`, `anova2`, `anovan`. |
Continuous predictors, multivariable response, linear model | Fitted multivariate regression model coefficients | `mvregress` |
Continuous predictors, continuous response, mixed-effects model | Fitted mixed-effects model coefficients | `nlmefit` or `nlmefitsa`. See Mixed-Effects Models. |

Statistics and Machine Learning Toolbox™ provides several functions for performing regression. The following sections describe how to replace calls to older functions with calls to their newer counterparts:

`regress` into `fitlm`

Previous Syntax:

[b,bint,r,rint,stats] = regress(y,X)

where `X` contains a column of ones.

Current Syntax:

mdl = fitlm(X,y)

where you do not add a column of ones to `X`.

Equivalent values of the previous outputs:

- `b`: `mdl.Coefficients.Estimate`
- `bint`: `coefCI(mdl)`
- `r`: `mdl.Residuals.Raw`
- `rint`: There is no exact equivalent. Try examining `mdl.Residuals.Studentized` to find outliers.
- `stats`: `mdl` contains various properties that replace components of `stats`.
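As a rough illustration of the mapping above, the following sketch fits the same synthetic data with both functions and collects the corresponding outputs. The data, the seed, and the variable names `bNew`, `bintNew`, and `rNew` are assumptions made for this example only.

```matlab
% Hypothetical synthetic data: 100 observations, 2 predictors.
rng(0);                                  % reproducible random numbers
X = randn(100,2);
y = 1 + 2*X(:,1) - 3*X(:,2) + 0.5*randn(100,1);

% Previous workflow: regress needs an explicit column of ones.
[b,bint,r] = regress(y,[ones(100,1) X]);

% Current workflow: fitlm adds the intercept term itself.
mdl = fitlm(X,y);

bNew    = mdl.Coefficients.Estimate;     % corresponds to b
bintNew = coefCI(mdl);                   % corresponds to bint (95% by default)
rNew    = mdl.Residuals.Raw;             % corresponds to r

% Largest absolute difference between old and new coefficient estimates.
disp(max(abs(b - bNew)))
```

Both calls solve the same least-squares problem, so the differences should be at the level of floating-point round-off.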

`regstats` into `fitlm`

Previous Syntax:

stats = regstats(y,X,model,whichstats)

Current Syntax:

mdl = fitlm(X,y,model)

Obtain statistics from the properties and methods of the `LinearModel` object (`mdl`). For example, see the `mdl.Diagnostics` and `mdl.Residuals` properties.
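A minimal sketch of where a few common `regstats` statistics can be found on the `LinearModel` object. The synthetic data and the choice of statistics (`'cookd'`, `'leverage'`, `'rsquare'`) are assumptions picked for illustration; other statistics have their own counterparts.

```matlab
% Hypothetical synthetic data with a mild quadratic effect.
rng(0);
X = randn(80,2);
y = 1 + X(:,1) + 0.5*X(:,2).^2 + 0.3*randn(80,1);

% Previous workflow: request specific statistics by name.
stats = regstats(y,X,'quadratic',{'cookd','leverage','rsquare'});

% Current workflow: fit once, then read properties of the model object.
mdl = fitlm(X,y,'quadratic');

cookd    = mdl.Diagnostics.CooksDistance;   % compare with stats.cookd
leverage = mdl.Diagnostics.Leverage;        % compare with stats.leverage
rsq      = mdl.Rsquared.Ordinary;           % compare with stats.rsquare
studres  = mdl.Residuals.Studentized;       % studentized residuals
```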

`robustfit` into `fitlm`

Previous Syntax:

[b,stats] = robustfit(X,y,wfun,tune,const)

Current Syntax:

mdl = fitlm(X,y,'robust','on') % bisquare

Or, to use the `wfun` weight function and the `tune` tuning parameter:

opt.RobustWgtFun = 'wfun';

opt.Tune = tune; % optional

mdl = fitlm(X,y,'robust',opt)

Obtain statistics from the properties and methods of the `LinearModel` object (`mdl`). For example, see the `mdl.Diagnostics` and `mdl.Residuals` properties.
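The robust options can be sketched as follows. This example spells the option name out in full as `'RobustOpts'`, which is the documented name that the abbreviated `'robust'` refers to. The data, the injected outliers, and the Huber settings are assumptions chosen for illustration.

```matlab
% Hypothetical data: a straight line with three gross outliers.
rng(1);
X = randn(50,1);
y = 2 + 3*X + 0.2*randn(50,1);
y(1:3) = y(1:3) + 10;

% Previous workflow: bisquare weights by default.
bOld = robustfit(X,y);

% Current workflow with the default (bisquare) weight function.
mdl = fitlm(X,y,'RobustOpts','on');

% Current workflow with an explicit weight function and tuning constant.
opt = struct('RobustWgtFun','huber','Tune',1.345);
mdlHuber = fitlm(X,y,'RobustOpts',opt);

% Side-by-side coefficient estimates: robustfit, bisquare fitlm, Huber fitlm.
disp([bOld mdl.Coefficients.Estimate mdlHuber.Coefficients.Estimate])
```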

`stepwisefit` into `stepwiselm`

Previous Syntax:

[b,se,pval,inmodel,stats,nextstep,history] = stepwisefit(X,y,Name,Value)

Current Syntax:

mdl = stepwiselm(ds,modelspec,Name,Value)

or

mdl = stepwiselm(X,y,modelspec,Name,Value)

Obtain statistics from the properties and methods of the `LinearModel` object (`mdl`). For example, see the `mdl.Diagnostics` and `mdl.Residuals` properties.

`glmfit` into `fitglm`

Previous Syntax:

[b,dev,stats] = glmfit(X,y,distr,param1,val1,...)

Current Syntax:

mdl = fitglm(X,y,'Distribution',distr,...)

Obtain statistics from the properties and methods of the `GeneralizedLinearModel` object (`mdl`). For example, the deviance is `mdl.Deviance`, and to compare `mdl` against a constant model, use `devianceTest(mdl)`.
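A short sketch of the same migration for a Poisson model. It passes the distribution through the `'Distribution'` name-value argument of `fitglm`. The simulated data and the coefficient values used to generate it are assumptions for this example.

```matlab
% Hypothetical count data generated from a log-linear Poisson model.
rng(2);
X = randn(200,1);
y = poissrnd(exp(0.5 + 0.8*X));

% Previous workflow.
[b,dev] = glmfit(X,y,'poisson');

% Current workflow.
mdl = fitglm(X,y,'Distribution','poisson');

bNew   = mdl.Coefficients.Estimate;   % corresponds to b
devNew = mdl.Deviance;                % corresponds to dev

% Test the fitted model against a constant (intercept-only) model.
tbl = devianceTest(mdl);
disp(tbl)
```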

`nlinfit` into `fitnlm`

Previous Syntax:

[beta,r,J,COVB,mse] = nlinfit(X,y,fun,beta0,options)

Current Syntax:

mdl = fitnlm(X,y,fun,beta0,'Options',options)

Equivalent values of the previous outputs:

- `beta`: `mdl.Coefficients.Estimate`
- `r`: `mdl.Residuals.Raw`
- `COVB`: `mdl.CoefficientCovariance`
- `mse`: `mdl.MSE`

`mdl` does not provide the Jacobian (`J`) output. The primary purpose of `J` was to pass it into `nlparci` or `nlpredci` to obtain confidence intervals for the estimated coefficients (parameters) or predictions. Obtain those confidence intervals as:

parci = coefCI(mdl)

[pred,predci] = predict(mdl)
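The following sketch puts the two nonlinear workflows side by side for an exponential-decay model. The model function, the true parameters, the starting point, and the noise level are assumptions for illustration. The call to `predict` here passes the predictor values explicitly, which is one way to request predictions at chosen points.

```matlab
% Hypothetical exponential-decay data: y = b1*exp(-b2*x) + noise.
rng(3);
x = linspace(0,5,40)';
fun = @(b,x) b(1)*exp(-b(2)*x);
y = fun([2;0.7],x) + 0.05*randn(40,1);
beta0 = [1;1];                              % starting guess

% Previous workflow: the Jacobian feeds the confidence-interval helper.
[beta,r,J,COVB,mse] = nlinfit(x,y,fun,beta0);
ciOld = nlparci(beta,r,'Jacobian',J);

% Current workflow: intervals come from methods of the model object.
mdl = fitnlm(x,y,fun,beta0);
betaNew = mdl.Coefficients.Estimate;        % corresponds to beta
mseNew  = mdl.MSE;                          % corresponds to mse
ciNew   = coefCI(mdl);                      % coefficient confidence intervals
[pred,predci] = predict(mdl,x);             % predictions with intervals
```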