b = stepwisefit(X,y)
[b,se,pval,inmodel,stats,nextstep,history] = stepwisefit(...)
[...] = stepwisefit(X,y,param1,val1,param2,val2,...)
b = stepwisefit(X,y) uses a stepwise method to perform a multilinear regression of the response values in the n-by-1 vector y on the p predictive terms in the n-by-p matrix X. Distinct predictive terms should appear in different columns of X.
b is a p-by-1 vector of estimated coefficients for all of the terms in X. The stepwisefit function calculates the coefficient estimate values in b as follows:
If a term is not in the final model, then the corresponding coefficient estimate in b results from adding only that term to the predictors in the final model.
If a term is in the final model, then the coefficient estimate in b for that term is a result of the final model; that is, stepwisefit does not consider the terms it excluded from the model while computing these values.
stepwisefit automatically includes a constant term in all models. Do not enter a column of 1s directly into X.
stepwisefit treats NaN values in either X or y as missing values, and ignores them.
[b,se,pval,inmodel,stats,nextstep,history] = stepwisefit(...) returns
the following additional information:
se — A vector of standard errors for b
pval — A vector of p-values for testing whether elements of b are 0
inmodel — A logical vector, with length equal to the number of columns in X, specifying which terms are in the final model
stats — A structure of additional
statistics with the following fields. All statistics pertain to the
final model except where noted.
source — The character vector 'stepwisefit'
dfe — Degrees of freedom for error
df0 — Degrees of freedom for the regression
SStotal — Total sum of squares
of the response
SSresid — Sum of squares
of the residuals
fstat — F-statistic
for testing the final model vs. no model (mean only)
pval — p value
of the F-statistic
rmse — Root mean square error
xr — Residuals for predictors
not in the final model, after removing the part of them explained
by predictors in the model
yr — Residuals for the response
using predictors in the final model
B — Coefficients for terms
in final model, with values for a term not in the model set to the
value that would be obtained by adding that term to the model
SE — Standard errors for coefficient estimates
TSTAT — t statistics
for coefficient estimates
PVAL — p-values
for coefficient estimates
intercept — Estimated intercept
wasnan — Indicates which rows in the data contained NaN values
nextstep — The recommended
next step—either the index of the next term to move in or out
of the model, or
0 if no further steps are recommended
history — Structure containing
information on steps taken, with the following fields:
B — Matrix of regression
coefficients, where each column is one step, and each row is one coefficient.
rmse — Root mean square
errors for the model at each step.
df0 — Degrees of freedom
for the regression at each step.
in — Logical array indicating
which predictors are in the model at each step, where each row is
one step, and each column is one predictor.
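As a rough illustration of how these outputs fit together, the following sketch uses the hald data set from the example later on this page. The variable names yhat and rmseCheck are illustrative only, and the comparison with stats.rmse is a sanity check under the assumption that the data contain no NaN values.

```matlab
load hald   % provides ingredients (13x4) and heat (13x1)

[b,se,pval,inmodel,stats,nextstep,history] = ...
    stepwisefit(ingredients,heat,'display','off');

% b holds a value for every column of ingredients, so select only the
% terms in the final model and add the separately reported intercept.
yhat = stats.intercept + ingredients(:,inmodel)*b(inmodel);

% Recompute the root mean square error from the fitted values
% (illustrative variable name; compare with stats.rmse).
rmseCheck = sqrt(sum((heat - yhat).^2)/stats.dfe);

% Terms in the model at each recorded step, as described for history.in.
disp(history.in)
```

Indexing b with inmodel matters here: entries of b for excluded terms describe what would happen if that single term were added, so they should not be used together in one prediction.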
[...] = stepwisefit(X,y,param1,val1,param2,val2,...) specifies one or more of the name/value pairs described in the following table.
Parameter    Value
'inmodel'    A logical vector specifying terms to include in the initial fit. The default is to specify no terms.
'penter'     The maximum p value for a term to be added. The default is 0.05.
'premove'    The minimum p value for a term to be removed. The default is the maximum of the value of 'penter' and 0.10.
'maxiter'    The maximum number of steps in the regression. The default is Inf.
'keep'       A logical vector specifying terms to keep in their initial state. The default is to specify no terms.
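The name/value pairs can be combined in one call. The sketch below assumes the parameter names 'inmodel', 'keep', 'maxiter', and 'display', and uses the hald data from the example that follows. The choice of term 4 and the step limit of 5 are arbitrary values picked for illustration.

```matlab
load hald   % provides ingredients (13x4) and heat (13x1)

startTerms = [false false false true];  % begin with term 4 in the model
keepTerms  = [false false false true];  % hold term 4 in its initial state

[b,se,pval,inmodel] = stepwisefit(ingredients,heat, ...
    'inmodel',startTerms, ...   % initial model
    'keep',keepTerms, ...       % terms that may not be moved in or out
    'maxiter',5, ...            % illustrative cap on the number of steps
    'display','off');           % suppress the printed step log

disp(inmodel)                   % term 4 remains in because it was kept
```

Because term 4 starts in the model and is marked in 'keep', the procedure can add or remove only the other three terms.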
Load the data in
hald.mat, which contains
observations of the heat of reaction of various cement mixtures:
load hald
whos
  Name              Size      Bytes  Class     Attributes
  Description      22x58       2552  char
  hald             13x5         520  double
  heat             13x1         104  double
  ingredients      13x4         416  double
The response (heat) depends on the quantities of the four predictors (the columns of ingredients).
Use stepwisefit to carry out the stepwise regression algorithm, beginning with no terms in the model and using entrance/exit tolerances of 0.05/0.10 on the p-values:
stepwisefit(ingredients,heat,...
            'penter',0.05,'premove',0.10);
Initial columns included: none
Step 1, added column 4, p=0.000576232
Step 2, added column 1, p=1.10528e-006
Final columns included:  1 4
    'Coeff'      'Std.Err.'    'Status'    'P'
    [ 1.4400]    [ 0.1384]     'In'        [1.1053e-006]
    [ 0.4161]    [ 0.1856]     'Out'       [     0.0517]
    [-0.4100]    [ 0.1992]     'Out'       [     0.0697]
    [-0.6140]    [ 0.0486]     'In'        [1.8149e-007]
stepwisefit automatically includes an intercept term in the model, so you do not add it explicitly to ingredients as you would for regress. For terms not in the model, coefficient estimates and their standard errors are those that result by adding the corresponding term to the final model.
The inmodel parameter is used to specify terms in an initial model:
initialModel = ...
           [false true false false]; % Force in 2nd term
stepwisefit(ingredients,heat,...
            'inmodel',initialModel,...
            'penter',.05,'premove',0.10);
Initial columns included: 2
Step 1, added column 1, p=2.69221e-007
Final columns included:  1 2
    'Coeff'      'Std.Err.'    'Status'    'P'
    [ 1.4683]    [ 0.1213]     'In'        [2.6922e-007]
    [ 0.6623]    [ 0.0459]     'In'        [5.0290e-008]
    [ 0.2500]    [ 0.1847]     'Out'       [     0.2089]
    [-0.2365]    [ 0.1733]     'Out'       [     0.2054]
The preceding two models, built from different initial models, use different subsets of the predictive terms. Terms 2 and 4, swapped in the two models, are highly correlated:
term2 = ingredients(:,2);
term4 = ingredients(:,4);
R = corrcoef(term2,term4)
R =
    1.0000   -0.9730
   -0.9730    1.0000
To compare the models, use the stats output of stepwisefit:
[betahat1,se1,pval1,inmodel1,stats1] = ...
          stepwisefit(ingredients,heat,...
          'penter',.05,'premove',0.10,...
          'display','off');
[betahat2,se2,pval2,inmodel2,stats2] = ...
          stepwisefit(ingredients,heat,...
                      'inmodel',initialModel,...
                      'penter',.05,'premove',0.10,...
                      'display','off');
RMSE1 = stats1.rmse
RMSE1 =
    2.7343
RMSE2 = stats2.rmse
RMSE2 =
    2.4063
The second model has a lower Root Mean Square Error (RMSE).
Stepwise regression is a systematic method for adding and removing terms from a multilinear model based on their statistical significance in a regression. The method begins with an initial model and then compares the explanatory power of incrementally larger and smaller models. At each step, the p value of an F-statistic is computed to test models with and without a potential term. If a term is not currently in the model, the null hypothesis is that the term would have a zero coefficient if added to the model. If there is sufficient evidence to reject the null hypothesis, the term is added to the model. Conversely, if a term is currently in the model, the null hypothesis is that the term has a zero coefficient. If there is insufficient evidence to reject the null hypothesis, the term is removed from the model. The method proceeds as follows:
1. Fit the initial model.
2. If any terms not in the model have p-values less than an entrance tolerance (that is, if it is unlikely that they would have zero coefficient if added to the model), add the one with the smallest p value and repeat this step; otherwise, go to step 3.
3. If any terms in the model have p-values greater than an exit tolerance (that is, if it is unlikely that the hypothesis of a zero coefficient can be rejected), remove the one with the largest p value and go to step 2; otherwise, end.
Depending on the terms included in the initial model and the order in which terms are moved in and out, the method may build different models from the same set of potential terms. The method terminates when no single step improves the model. There is no guarantee, however, that a different initial model or a different sequence of steps will not lead to a better fit. In this sense, stepwise models are locally optimal, but may not be globally optimal.
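The three steps above can be sketched as a short function. This is a minimal illustration, not the actual stepwisefit implementation: the names stepwiseSketch, termPValues, and rss are hypothetical, and the iteration cap is an added guard. The sketch starts from the constant-only model and assumes X and y contain no NaN values and that n exceeds p + 1. It computes each p value from a partial F-test with one numerator degree of freedom, using fcdf from Statistics and Machine Learning Toolbox.

```matlab
function inmodel = stepwiseSketch(X,y,penter,premove)
% Hypothetical helper: stepwise selection by partial F-tests.
p = size(X,2);
inmodel = false(1,p);              % step 1: start with the constant only
maxSteps = 4*p;                    % guard against cycling (added assumption)
for step = 1:maxSteps
    pvals = termPValues(X,y,inmodel);
    out = find(~inmodel);
    [pmin,k] = min(pvals(out));
    if ~isempty(out) && pmin < penter
        inmodel(out(k)) = true;    % step 2: add the most significant term
        continue
    end
    in = find(inmodel);
    [pmax,k] = max(pvals(in));
    if ~isempty(in) && pmax > premove
        inmodel(in(k)) = false;    % step 3: remove the least significant term
        continue
    end
    break                          % no single step changes the model
end
end

function pvals = termPValues(X,y,inmodel)
% For each term, compare the model that contains it with the model that
% does not, holding all other current terms fixed.
[n,p] = size(X);
pvals = zeros(1,p);
for j = 1:p
    big = inmodel;   big(j) = true;
    small = inmodel; small(j) = false;
    rssBig = rss(X(:,big),y);
    rssSmall = rss(X(:,small),y);
    dfe = n - nnz(big) - 1;        % error degrees of freedom of larger model
    F = (rssSmall - rssBig)/(rssBig/dfe);
    pvals(j) = 1 - fcdf(F,1,dfe);  % upper-tail probability
end
end

function s = rss(Xsub,y)
% Residual sum of squares of a least-squares fit with an intercept.
A = [ones(size(y,1),1) Xsub];
r = y - A*(A\y);
s = r'*r;
end
```

One way to check the sketch is to compare its result with the inmodel output of stepwisefit on the same data and tolerances, for example stepwiseSketch(ingredients,heat,0.05,0.10) after load hald.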
 Draper, N. R., and H. Smith. Applied Regression Analysis. Hoboken, NJ: Wiley-Interscience, 1998. pp. 307–312.