Linear regression model class
An object comprising training data, model description, diagnostic information, and fitted coefficients for a linear regression. Predict model responses with the predict or feval methods.
mdl = fitlm(tbl) or mdl = fitlm(X,y) creates a linear model of a table or dataset array tbl, or of the responses y to a data matrix X. For details, see fitlm.
mdl = stepwiselm(tbl) or mdl = stepwiselm(X,y) creates a linear model of a table or dataset array tbl, or of the responses y to a data matrix X, with unimportant predictors excluded. For details, see stepwiselm.
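At its core, fitting a linear model means solving an ordinary least-squares problem with an intercept term. A minimal NumPy sketch (Python rather than MATLAB, for illustration only) of what such a fit computes, using the QR decomposition mentioned in the Algorithms section below:

```python
import numpy as np

# Toy data: one predictor with an exactly linear response, y = 1 + 2*x.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])

# fitlm includes an intercept term by default; mimic that with a column of ones.
Xd = np.column_stack([np.ones(len(X)), X])   # design matrix

# Solve the least-squares problem via QR decomposition.
Q, R = np.linalg.qr(Xd)
beta = np.linalg.solve(R, Q.T @ y)           # [intercept, slope]
```

For this noise-free data, beta recovers the intercept 1 and slope 2 exactly.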

Covariance matrix of coefficient estimates.  

Cell array of strings containing a label for each coefficient.  

Coefficient values stored as a table.
To obtain any of these columns as a vector, index into the property
using dot notation. For example, obtain the vector of coefficient estimates with beta = mdl.Coefficients.Estimate.

Degrees of freedom for error (residuals), equal to the number of observations minus the number of estimated coefficients.  

Table of diagnostic values, with the same number of rows as the input data. Rows not used in the fit because of missing values or because of excluded values contain NaN values.

Predicted response values obtained by applying the model to the input data. Use predict to compute predictions for other predictor values.  

Object containing information about the model.  

Log likelihood of the model distribution at the response values, with mean fitted from the model, and other parameters estimated as part of the model fit.  

To obtain any of these values as a scalar, index into the property
using dot notation. For example, obtain the AIC value of a model with aic = mdl.ModelCriterion.AIC  

Mean squared error (residuals), MSE = SSE/DFE.  

Number of coefficients in the model, a positive integer.  

Number of estimated coefficients in the model, a positive integer.  

Number of observations the fitting function used in fitting.
This is the number of observations supplied in the original table,
dataset, or matrix, minus any excluded rows or rows with missing values.  

Number of predictor variables used to fit the model.  

Number of variables in the data.  

Table of observation information, with the same number of rows as the input data.
 

Cell array of strings containing the names of the observations used in the fit.
 

Cell array of strings, the names of the predictors used in fitting the model.  

Table of residuals, with one row for each observation.
To obtain any of these columns as a vector, index into the property
using dot notation. For example, obtain the raw residual vector of a model with r = mdl.Residuals.Raw. Rows not used in the fit because of missing values or because of excluded values contain NaN values.
 

String giving the name of the response variable.  

Root mean squared error (residuals), RMSE = sqrt(MSE).  

Structure that is empty unless the model was fit with the robust fitting option.
 

Proportion of total sum of squares explained by the model. The
ordinary R-squared value relates to the SSR and SSE properties:
Rsquared = SSR/SST = 1 – SSE/SST.
To obtain any of these values as a scalar, index into the property
using dot notation. For example, obtain the adjusted R-squared value of a model with r2 = mdl.Rsquared.Adjusted  

Sum of squared errors (residuals). The Pythagorean theorem implies SST = SSE + SSR.
 

Regression sum of squares, the sum of squared deviations of the fitted values from their mean. The Pythagorean theorem implies SST = SSE + SSR.
 

Total sum of squares, the sum of squared deviations of the response values from their mean. The Pythagorean theorem implies SST = SSE + SSR.
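The SSE, SSR, and SST decomposition, and its link to the Rsquared property, can be checked numerically. A NumPy sketch (Python rather than MATLAB, for illustration only) of the identity SST = SSE + SSR for a least-squares fit that includes an intercept:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
# Intercept column plus one random predictor, noisy linear response.
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = 2.0 + 3.0 * X[:, 1] + rng.normal(size=n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta

SSE = np.sum((y - fitted) ** 2)              # sum of squared errors
SSR = np.sum((fitted - fitted.mean()) ** 2)  # regression sum of squares
SST = np.sum((y - y.mean()) ** 2)            # total sum of squares

# With an intercept in the model, SST = SSE + SSR,
# so R^2 = SSR/SST = 1 - SSE/SST.
r2 = 1.0 - SSE / SST
```

The decomposition holds exactly (up to floating-point error) only when the model contains an intercept term, which makes the mean of the fitted values equal the mean of the responses.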
 

Structure that is empty unless a stepwise fitting function (stepwiselm or step) constructed the model.
 

Table containing metadata about the variables used in the fit.
 

Cell array of strings containing names of the variables in the fit.
 

Table containing the data, both observations and responses,
that the fitting function used to construct the fit. If the fit is
based on a table or dataset array, this property contains all the data from that table or dataset array.
addTerms  Add terms to linear regression model 
anova  Analysis of variance for linear model 
coefCI  Confidence intervals of coefficient estimates of linear model 
coefTest  Linear hypothesis test on linear regression model coefficients 
disp  Display linear regression model 
dwtest  Durbin-Watson test of linear model 
feval  Evaluate linear regression model prediction 
fit  Create linear regression model 
plot  Scatter plot or added variable plot of linear model 
plotAdded  Added variable plot or leverage plot for linear model 
plotAdjustedResponse  Adjusted response plot for linear regression model 
plotDiagnostics  Plot diagnostics of linear regression model 
plotEffects  Plot main effects of each predictor in linear regression model 
plotInteraction  Plot interaction effects of two predictors in linear regression model 
plotResiduals  Plot residuals of linear regression model 
plotSlice  Plot of slices through fitted linear regression surface 
predict  Predict response of linear regression model 
random  Simulate responses for linear regression model 
removeTerms  Remove terms from linear model 
step  Improve linear regression model by adding or removing terms 
stepwise  Create linear regression model by stepwise regression 
Value. To learn how value classes affect copy operations, see Copying Objects in the MATLAB® documentation.
The hat matrix H is defined in terms of the data matrix X:
H = X(X^{T}X)^{–1}X^{T}.
The diagonal elements H_{ii} satisfy
$$\begin{array}{l}0\le {h}_{ii}\le 1\\ {\displaystyle \sum _{i=1}^{n}{h}_{ii}}=p,\end{array}$$
where n is the number of observations (rows of X), and p is the number of coefficients in the regression model.
The leverage of observation i is the value of the ith diagonal term, h_{ii}, of the hat matrix H. Because the sum of the leverage values is p (the number of coefficients in the regression model), an observation i can be considered to be an outlier if its leverage substantially exceeds p/n, where n is the number of observations.
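The two properties of the leverages stated above can be verified numerically. A NumPy sketch (Python rather than MATLAB, for illustration only) computing the hat matrix and checking that each diagonal element lies in [0, 1] and that the diagonal sums to p:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 30, 3                     # n observations, p coefficients (incl. intercept)
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])

# Hat matrix H = X (X'X)^{-1} X'; its diagonal holds the leverages h_ii.
H = X @ np.linalg.solve(X.T @ X, X.T)
leverage = np.diag(H)

# Each leverage lies in [0, 1], and the leverages sum to p,
# so p/n is the natural threshold for flagging high-leverage observations.
```

Forming H explicitly costs O(n²) memory; it is shown here only to make the definition concrete.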
Cook's distance is the scaled change in fitted values.
Each element in CooksDistance
is the normalized
change in the vector of coefficients due to the deletion of an observation.
The Cook's distance, D_{i},
of observation i is
$${D}_{i}=\frac{{\displaystyle \sum _{j=1}^{n}{\left({\widehat{y}}_{j}-{\widehat{y}}_{j(i)}\right)}^{2}}}{p\text{\hspace{0.17em}}MSE},$$
where
$${\widehat{y}}_{j}$$ is the jth fitted response value.
$${\widehat{y}}_{j(i)}$$ is the jth fitted response value, where the fit does not include observation i.
MSE is the mean squared error.
p is the number of coefficients in the regression model.
Cook's distance is algebraically equivalent to the following expression:
$${D}_{i}=\frac{{r}_{i}^{2}}{p\text{\hspace{0.17em}}MSE}\left(\frac{{h}_{ii}}{{\left(1-{h}_{ii}\right)}^{2}}\right),$$
where r_{i} is the ith residual, and h_{ii} is the ith leverage value.
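The algebraic equivalence of the deletion definition and the leverage-based formula can be checked numerically. A NumPy sketch (Python rather than MATLAB, for illustration only) that computes Cook's distance both ways and confirms they agree:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 25, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = 1.0 + 2.0 * X[:, 1] + rng.normal(size=n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
H = X @ np.linalg.solve(X.T @ X, X.T)
h = np.diag(H)                               # leverages h_ii
MSE = resid @ resid / (n - p)                # SSE / DFE

# Closed form: D_i = r_i^2 / (p * MSE) * h_ii / (1 - h_ii)^2.
D_formula = resid**2 / (p * MSE) * h / (1.0 - h) ** 2

# Definition: refit without observation i, compare all n fitted values.
D_deletion = np.empty(n)
for i in range(n):
    keep = np.arange(n) != i
    beta_i, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
    D_deletion[i] = np.sum((X @ beta - X @ beta_i) ** 2) / (p * MSE)
```

The closed form avoids the n separate refits, which is how diagnostic tables compute Cook's distance in practice.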
CooksDistance is an n-by-1 column vector in the Diagnostics table of the LinearModel object.
The main fitting algorithm is QR decomposition. For robust fitting, the algorithm is robustfit.
To remove redundant predictors in linear regression using lasso or elastic net, use the lasso function.
To regularize a regression with correlated terms using ridge regression, use the ridge or lasso functions.
To regularize a regression with correlated terms using partial least squares, use the plsregress function.
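To make the correlated-terms scenario concrete, here is a NumPy sketch (Python rather than MATLAB, for illustration only) of the ridge estimator, beta = (X'X + λI)⁻¹X'y, applied to two nearly collinear predictors; the penalty value λ = 0.5 is an arbitrary choice for the example:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 40
z = rng.normal(size=n)
# Two highly correlated predictors: the situation ridge regression targets.
X = np.column_stack([z, z + 0.01 * rng.normal(size=n)])
y = X @ np.array([1.0, 1.0]) + 0.1 * rng.normal(size=n)

lam = 0.5                                    # ridge penalty (hypothetical value)
p = X.shape[1]

# Ordinary least squares, for comparison.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# Ridge estimator: beta = (X'X + lam*I)^{-1} X'y.
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
```

The penalty shrinks the coefficient vector toward zero, stabilizing the estimate that collinearity would otherwise make erratic.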