Regression models describe the relationship between a *dependent
variable*, *y*, and one or more *independent
variables*, *X*. The dependent
variable is also called the *response variable*.
Independent variables are also called *explanatory* or *predictor
variables*. Continuous predictor variables
might be called *covariates*, whereas categorical
predictor variables might also be referred to as *factors*.
The matrix, *X*, of observations on predictor variables
is usually called the *design matrix*.

A multiple linear regression model is

$${y}_{i}={\beta}_{0}+{\beta}_{1}{X}_{i1}+{\beta}_{2}{X}_{i2}+\cdots +{\beta}_{p}{X}_{ip}+{\epsilon}_{i},\text{\hspace{1em}}i=1,\cdots ,n,$$

where

- *y*_{i} is the *i*th response.
- *β*_{k} is the *k*th coefficient, where *β*_{0} is the constant term in the model. Sometimes, design matrices might include information about the constant term. However, `fitlm` or `stepwiselm` by default includes a constant term in the model, so you must not enter a column of 1s into your design matrix *X*.
- *X*_{ij} is the *i*th observation on the *j*th predictor variable, *j* = 1, ..., *p*.
- *ε*_{i} is the *i*th noise term, that is, random error.
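As a minimal sketch of how the model equation is evaluated, the following plain Python (not MATLAB, and not a call to `fitlm`) computes the noise-free part of each response from a design matrix that holds only predictor columns. The helper names `add_constant_column` and `mean_response` and the coefficient values are hypothetical, chosen only for illustration. The column of 1s is added by the helper rather than stored in *X*, which mirrors the point that the constant term is handled separately from the design matrix.

```python
def add_constant_column(X):
    """Prepend a column of 1s so that beta_0 multiplies a constant."""
    return [[1.0] + list(row) for row in X]


def mean_response(beta, X):
    """Evaluate beta_0 + beta_1*X_i1 + ... + beta_p*X_ip for every row i."""
    return [sum(b * x for b, x in zip(beta, row))
            for row in add_constant_column(X)]


# n = 3 observations on p = 2 predictors; no column of 1s is stored in X.
X = [[2.0, 1.0], [0.0, 3.0], [4.0, 5.0]]
beta = [1.0, 0.5, -2.0]  # hypothetical beta_0, beta_1, beta_2

print(mean_response(beta, X))  # [0.0, -5.0, -7.0]
```

If *X* already contained a column of 1s, adding the constant column again would duplicate it, which is one way to see why the text warns against entering it yourself.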

In general, a linear regression model can be a model of the form

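$${y}_{i}={\beta}_{0}+{\displaystyle \sum _{k=1}^{K}{\beta}_{k}{f}_{k}\left({X}_{i1},{X}_{i2},\cdots ,{X}_{ip}\right)}+{\epsilon}_{i},\text{\hspace{1em}}i=1,\cdots ,n,$$

where each *f*_{k} is a scalar-valued function of the predictor variables. The functions *f*_{k} may themselves be nonlinear; "linear" refers to the coefficients *β*_{k}. With the convention *f*_{0} ≡ 1, the constant term can be written inside the sum, starting at *k* = 0.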

Some examples of linear models are:

$$\begin{array}{l}{y}_{i}={\beta}_{0}+{\beta}_{1}{X}_{1i}+{\beta}_{2}{X}_{2i}+{\beta}_{3}{X}_{3i}+{\epsilon}_{i}\\ {y}_{i}={\beta}_{0}+{\beta}_{1}{X}_{1i}+{\beta}_{2}{X}_{2i}+{\beta}_{3}{X}_{1i}^{3}+{\beta}_{4}{X}_{2i}^{2}+{\epsilon}_{i}\\ {y}_{i}={\beta}_{0}+{\beta}_{1}{X}_{1i}+{\beta}_{2}{X}_{2i}+{\beta}_{3}{X}_{1i}{X}_{2i}+{\beta}_{4}\mathrm{log}{X}_{3i}+{\epsilon}_{i}\end{array}$$
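The third example above contains a product term and a logarithm, yet it is still linear in the coefficients. The following Python sketch illustrates this under stated assumptions: the data and the coefficient values are made up, the responses are generated without noise, and the solver is a small hand-written normal-equations routine, not the algorithm used by `fitlm`. Each basis function becomes one column of a matrix, and an ordinary linear solve recovers the coefficients.

```python
import math


def solve_least_squares(F, y):
    """Solve the normal equations (F'F) b = F'y by Gauss-Jordan elimination."""
    n, m = len(F), len(F[0])
    # Augmented matrix [F'F | F'y], one row per coefficient.
    M = [[sum(F[r][i] * F[r][j] for r in range(n)) for j in range(m)]
         + [sum(F[r][i] * y[r] for r in range(n))]
         for i in range(m)]
    for c in range(m):
        # Partial pivoting: bring the largest remaining entry of column c up.
        p = max(range(c, m), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(m):
            if r != c:
                factor = M[r][c] / M[c][c]
                M[r] = [a - factor * b for a, b in zip(M[r], M[c])]
    return [M[i][m] / M[i][i] for i in range(m)]


def basis(x1, x2, x3):
    """f_0 = 1, f_1 = X1, f_2 = X2, f_3 = X1*X2, f_4 = log(X3)."""
    return [1.0, x1, x2, x1 * x2, math.log(x3)]


true_beta = [2.0, 0.5, -1.0, 0.25, 3.0]  # hypothetical coefficients
observations = [(1.0, 2.0, 1.0), (2.0, 1.0, 2.0), (3.0, 5.0, 4.0),
                (4.0, 3.0, 8.0), (5.0, 7.0, 3.0), (6.0, 2.0, 6.0),
                (7.0, 4.0, 5.0)]  # made-up (X1, X2, X3) values

F = [basis(*obs) for obs in observations]
# Noise-free responses, so the fit should reproduce true_beta up to rounding.
y = [sum(b * f for b, f in zip(true_beta, row)) for row in F]

b = solve_least_squares(F, y)
print([round(v, 6) for v in b])
```

The same approach would not work for a term such as exp(*β*_{3}*X*_{1}*X*_{2}), because no fixed column of numbers can stand in for a function that depends on the unknown coefficient.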

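The following, however, are not linear models for *y*_{i}, because
the response is not a linear function of the unknown coefficients, *β*_{k}.
(The first becomes linear only if log *y*_{i}, rather than *y*_{i}, is treated as the response.)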

$$\begin{array}{l}\mathrm{log}{y}_{i}={\beta}_{0}+{\beta}_{1}{X}_{1i}+{\beta}_{2}{X}_{2i}+{\epsilon}_{i}\\ {y}_{i}={\beta}_{0}+{\beta}_{1}{X}_{1i}+\frac{1}{{\beta}_{2}{X}_{2i}}+{e}^{{\beta}_{3}{X}_{1i}{X}_{2i}}+{\epsilon}_{i}\end{array}$$

The usual assumptions for linear regression models are:

- The noise terms, *ε*_{i}, are uncorrelated.
- The noise terms, *ε*_{i}, have independent and identical normal distributions with mean zero and constant variance, σ^{2}. Thus

  $$\begin{array}{rl}E\left({y}_{i}\right)&=E\left({\displaystyle \sum _{k=0}^{K}{\beta}_{k}{f}_{k}\left({X}_{i1},{X}_{i2},\cdots ,{X}_{ip}\right)}+{\epsilon}_{i}\right)\\ &={\displaystyle \sum _{k=0}^{K}{\beta}_{k}{f}_{k}\left({X}_{i1},{X}_{i2},\cdots ,{X}_{ip}\right)}+E\left({\epsilon}_{i}\right)\\ &={\displaystyle \sum _{k=0}^{K}{\beta}_{k}{f}_{k}\left({X}_{i1},{X}_{i2},\cdots ,{X}_{ip}\right)}\end{array}$$

  and

  $$V\left({y}_{i}\right)=V\left({\displaystyle \sum _{k=0}^{K}{\beta}_{k}{f}_{k}\left({X}_{i1},{X}_{i2},\cdots ,{X}_{ip}\right)}+{\epsilon}_{i}\right)=V\left({\epsilon}_{i}\right)={\sigma}^{2}$$

  So the variance of *y*_{i} is the same for all levels of *X*_{ij}.
- The responses *y*_{i} are uncorrelated.

The fitted linear function is

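$${\widehat{y}}_{i}={\displaystyle \sum _{k=0}^{K}{b}_{k}{f}_{k}\left({X}_{i1},{X}_{i2},\cdots ,{X}_{ip}\right)},\text{\hspace{1em}}i=1,\cdots ,n,$$

where *ŷ*_{i} is the estimated response and the *b*_{k} are the fitted coefficients, that is, the estimates of the *β*_{k}.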
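As a sketch of how fitted coefficients such as the *b*_{k} are commonly obtained, assume ordinary least squares: the coefficients are chosen to minimize the sum of squared differences between the observed and fitted responses. The matrix *F* below is notation introduced here for illustration only. It is the *n*-by-(*K*+1) matrix with entries *F*_{ik} = *f*_{k}(*X*_{i1}, ..., *X*_{ip}) and *f*_{0} ≡ 1, and the closed form assumes that *F*^{T}*F* is invertible.

$$\begin{array}{l}b=\underset{\beta }{\mathrm{arg\,min}}\;{\displaystyle \sum _{i=1}^{n}{\left({y}_{i}-{\displaystyle \sum _{k=0}^{K}{\beta}_{k}{f}_{k}\left({X}_{i1},{X}_{i2},\cdots ,{X}_{ip}\right)}\right)}^{2}}\\ {F}^{T}F\,b={F}^{T}y\text{\hspace{1em}}\Rightarrow \text{\hspace{1em}}b={\left({F}^{T}F\right)}^{-1}{F}^{T}y\end{array}$$

Under the normality assumption on the noise terms stated earlier, the least-squares coefficients coincide with the maximum likelihood estimates.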

In a linear regression model of the form *y* = *β*_{1}*X*_{1} + *β*_{2}*X*_{2} +
... + *β*_{p}*X*_{p},
the coefficient *β*_{k} expresses
the impact of a one-unit change in the predictor variable *X*_{k}
on the mean of the response, E(*y*), provided that all other variables are held constant.
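To make this interpretation concrete, the following Python sketch uses a hypothetical fitted mean, E(*y*) = 1.8 − 2.35*X*_{1} + *X*_{2}, with numbers chosen purely for illustration. It checks that raising *X*_{1} by one unit while holding *X*_{2} fixed shifts the mean response by the coefficient of *X*_{1}.

```python
def expected_y(x1, x2):
    """Hypothetical fitted mean: E(y) = 1.8 - 2.35*x1 + 1.0*x2."""
    return 1.8 - 2.35 * x1 + 1.0 * x2


# Increase x1 by one unit and keep x2 fixed at an arbitrary value.
change = expected_y(3.0, 4.0) - expected_y(2.0, 4.0)

# The shift equals the coefficient of x1, up to floating-point rounding.
print(round(change, 2))  # -2.35
```

The sign of the coefficient gives the direction of the effect: here the mean response decreases by 2.35 units for each one-unit increase in *X*_{1}.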


See also: `LinearModel` | `fitlm` | `stepwiselm`
