Improve linear regression model by adding or removing terms
mdl1 = step(mdl)
mdl1 = step(mdl,Name,Value)
You can use step only if mdl.Robust is empty ([]). This holds when
you create mdl using fitlm with the 'RobustOpts' name-value
pair set to the default, 'off'.
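The condition above can be checked before calling step. The following is a minimal sketch, assuming the carsmall sample data set that ships with Statistics and Machine Learning Toolbox; the variable names mdlPlain and mdlRobust are illustrative only.

```matlab
% Fit the same model with and without robust regression.
load carsmall
tbl = table(MPG,Weight);
mdlPlain  = fitlm(tbl,'MPG ~ Weight');                    % 'RobustOpts' left at its default
mdlRobust = fitlm(tbl,'MPG ~ Weight','RobustOpts','on');  % robust fit

% step applies only to a model whose Robust property is empty.
if isempty(mdlPlain.Robust)
    mdlStepped = step(mdlPlain,'Upper','quadratic');
end

% For the robust fit, Robust holds fit information, so step does not apply.
canStepRobust = isempty(mdlRobust.Robust);
```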
mdl1 = step(mdl,Name,Value) returns an improved linear model using
additional options specified by one or more Name,Value pair arguments.
For example, you can specify the criterion to use to add or remove terms.
Specify optional comma-separated pairs of Name,Value arguments.
Name is the argument name and Value is the corresponding value.
Name must appear inside single quotes (' '). You can
specify several name and value pair arguments in any order as
Name1,Value1,...,NameN,ValueN.
'Criterion'— Criterion for selecting terms to add or remove
Criterion for selecting terms to add or remove, specified as
the comma-separated pair consisting of 'Criterion' and
one of the following.
|'sse'|p-value for F test|
|'aic'|Change in AIC|
|'bic'|Change in BIC|
|'rsquared'|Increase in R-squared|
|'adjrsquared'|Increase in adjusted R-squared|
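As an illustration of the 'Criterion' argument, the following sketch takes one step from the same starting model under two different criteria. It assumes the carsmall sample data set; the names mdlAIC and mdlBIC are illustrative, and whether the two criteria choose the same term depends on the data.

```matlab
% Build a starting model from the carsmall sample data.
load carsmall
tbl = table(MPG,Weight);
tbl.Year = categorical(Model_Year);
mdl = fitlm(tbl,'MPG ~ Year + Weight');

% Take one step using the change in AIC instead of the default F test.
mdlAIC = step(mdl,'Criterion','aic','Upper','quadratic');

% Take one step using the change in BIC.
mdlBIC = step(mdl,'Criterion','bic','Upper','quadratic');

% List the coefficient names of each resulting model for comparison.
disp(mdlAIC.CoefficientNames)
disp(mdlBIC.CoefficientNames)
```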
Fit a linear model to car data. Use
step to evaluate whether a quadratic model improves the fit quality.
Load the carsmall data, and create a table using weight and model year predictors with MPG response.
load carsmall
tbl = table(MPG,Weight);
tbl.Year = categorical(Model_Year);
Make a linear model of MPG as a function of Weight and Year.
mdl = fitlm(tbl,'MPG ~ Year + Weight')
mdl = 
Linear regression model:
    MPG ~ 1 + Weight + Year

Estimated Coefficients:
                    Estimate          SE         tStat       pValue  
                   __________    __________    _______    __________
    (Intercept)         40.11        1.5418     26.016    1.2024e-43
    Weight         -0.0066475    0.00042802    -15.531    3.3639e-27
    Year_76            1.9291       0.74761     2.5804      0.011488
    Year_82            7.9093       0.84975     9.3078    7.8681e-15

Number of observations: 94, Error degrees of freedom: 90
Root Mean Squared Error: 2.92
R-squared: 0.873, Adjusted R-Squared 0.868
F-statistic vs. constant model: 206, p-value = 3.83e-40
Use step to adjust the model to potentially include full quadratic terms.
mdl1 = step(mdl,'upper','quadratic')
1. Adding Weight^2, FStat = 9.9164, pValue = 0.0022303

mdl1 = 
Linear regression model:
    MPG ~ 1 + Weight + Year + Weight^2

Estimated Coefficients:
                    Estimate          SE         tStat       pValue  
                   __________    __________    _______    __________
    (Intercept)        54.206        4.7117     11.505    2.6648e-19
    Weight          -0.016404     0.0031249    -5.2493    1.0283e-06
    Year_76            2.0887       0.71491     2.9215     0.0044137
    Year_82            8.1864       0.81531     10.041    2.6364e-16
    Weight^2       1.5573e-06    4.9454e-07      3.149     0.0022303

Number of observations: 94, Error degrees of freedom: 89
Root Mean Squared Error: 2.78
R-squared: 0.885, Adjusted R-Squared 0.88
F-statistic vs. constant model: 172, p-value = 5.52e-41
Stepwise regression is a systematic method
for adding and removing terms from a linear or generalized linear
model based on their statistical significance in explaining the response
variable. The method begins with an initial model, specified using
modelspec, and then compares the explanatory power of incrementally
larger and smaller models.
MATLAB® uses forward and backward stepwise regression to
determine a final model. At each step, the method searches for terms
to add to or remove from the model based on the value of the
'Criterion' argument. The default value of 'Criterion' is 'sse',
and in this case, stepwiselm uses the p-value
of an F-statistic to test models with and without
a potential term at each step. If a term is not currently in the model,
the null hypothesis is that the term would have a zero coefficient
if added to the model. If there is sufficient evidence to reject the
null hypothesis, the term is added to the model. Conversely, if a
term is currently in the model, the null hypothesis is that the term
has a zero coefficient. If there is insufficient evidence to reject
the null hypothesis, the term is removed from the model.
Here is how stepwise proceeds when 'Criterion' is 'sse':
1. Fit the initial model.
2. Examine a set of available terms not in the model. If any of these terms have p-values less than an entrance tolerance (that is, if it is unlikely that they would have zero coefficient if added to the model), add the one with the smallest p-value and repeat this step; otherwise, go to step 3.
3. If any of the available terms in the model have p-values greater than an exit tolerance (that is, the hypothesis of a zero coefficient cannot be rejected), remove the one with the largest p-value and go to step 2; otherwise, end.
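The entrance and exit tolerances in the procedure above correspond to the 'PEnter' and 'PRemove' name-value pairs, and 'NSteps' limits how many add or remove steps are taken. The following is a sketch only, assuming the carsmall sample data set; the tolerance values and the step limit are chosen for illustration.

```matlab
% Starting model: MPG as a function of Weight and categorical Year.
load carsmall
tbl = table(MPG,Weight);
tbl.Year = categorical(Model_Year);
mdl = fitlm(tbl,'MPG ~ Year + Weight');

% Tolerances on the p-value of the F test (illustrative values).
pEnter  = 0.05;   % add a term only if its p-value is below this
pRemove = 0.10;   % remove a term only if its p-value is above this

% Allow up to five add/remove steps, with a full quadratic model
% as the largest model the search may reach.
mdl2 = step(mdl,'Criterion','sse', ...
    'PEnter',pEnter,'PRemove',pRemove, ...
    'Upper','quadratic','NSteps',5);
```

Keeping pRemove larger than pEnter avoids a term being added and then immediately removed in the next pass.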
At any stage, the function will not add a higher-order term
if the model does not also include all lower-order terms that are
subsets of it. For example, it will not try to add the term
X1:X2^2 unless both X1 and X2^2 are already
in the model. Similarly, the function will not remove lower-order
terms that are subsets of higher-order terms that remain in the model.
For example, it will not consider removing X1 or X2^2 if X1:X2^2 stays
in the model.
The default for stepwiseglm is 'Deviance', and
it follows a similar procedure for adding or removing terms.
There are several other criteria available, which you can specify
using the 'Criterion' argument. You can use the
change in the value of the Akaike information criterion, Bayesian
information criterion, R-squared, or adjusted R-squared as a criterion
to add or remove terms.
Depending on the terms included in the initial model and the order in which terms are moved in and out, the method might build different models from the same set of potential terms. The method terminates when no single step improves the model. There is no guarantee, however, that a different initial model or a different sequence of steps will not lead to a better fit. In this sense, stepwise models are locally optimal, but might not be globally optimal.
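To see how the starting point can matter, the following sketch runs stepwiselm from two different initial models over the same set of candidate terms. It assumes the carsmall sample data set; the names mdlUp and mdlDown are illustrative, and the two searches may or may not arrive at the same model for this data.

```matlab
load carsmall
tbl = table(MPG,Weight);
tbl.Year = categorical(Model_Year);

% Search upward from a constant model.
mdlUp = stepwiselm(tbl,'constant','ResponseVar','MPG', ...
    'Upper','quadratic','Verbose',0);

% Search downward from a full quadratic model.
mdlDown = stepwiselm(tbl,'quadratic','ResponseVar','MPG', ...
    'Upper','quadratic','Verbose',0);

% Compare the terms each search ended with.
disp(mdlUp.CoefficientNames)
disp(mdlDown.CoefficientNames)
```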
Use stepwiselm to select
a model from a starting model, continuing until no single step is