Is stepwisefit the exact implementation of stepwise regression?

Hi all. I looked into the stepwisefit function and suspect that it may not be an exact implementation of stepwise regression.
Each round of stepwise regression consists of a forward check (whether a term should be added) and a backward check (whether a term should be removed). That is, the forward check of a new round is not made before the backward check of the previous round: after a term is added to the model, a backward check follows immediately. The stepwisefit function, however, makes a backward check only when there is no term left to add. After a term is added to the model, it immediately makes another forward check, and a second term is added if it satisfies the entry condition. The algorithm keeps making forward checks until no term can be added, and only then does the backward check start. Although it gives the same result in most cases, I don't think it is the exact implementation of the stepwise regression algorithm.
Can anyone tell me if my understanding is correct or not? Thanks!

Answers (1)

John D'Errico on 13 Sep 2016
Edited: John D'Errico on 13 Sep 2016
You call it "THE stepwise regression algorithm" as if there is only one possible way to implement any algorithm, and one possible sequence of tests and steps, as if it were handed down on a tablet, set in stone.
From the doc for stepwise:
"Stepwise regression is a systematic method for adding and removing terms from a multilinear model based on their statistical significance in a regression. The method begins with an initial model and then compares the explanatory power of incrementally larger and smaller models. At each step, the p value of an F-statistic is computed to test models with and without a potential term. If a term is not currently in the model, the null hypothesis is that the term would have a zero coefficient if added to the model. If there is sufficient evidence to reject the null hypothesis, the term is added to the model. Conversely, if a term is currently in the model, the null hypothesis is that the term has a zero coefficient. If there is insufficient evidence to reject the null hypothesis, the term is removed from the model. The method proceeds as follows:
1. Fit the initial model.
2. If any terms not in the model have p-values less than an entrance tolerance (that is, if it is unlikely that they would have zero coefficient if added to the model), add the one with the smallest p value and repeat this step; otherwise, go to step 3.
3. If any terms in the model have p-values greater than an exit tolerance (that is, if it is unlikely that the hypothesis of a zero coefficient can be rejected), remove the one with the largest p value and go to step 2; otherwise, end."
If you read that doc, it clearly tells you that terms are not deleted until after no terms are deemed appropriate to be added.
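To make the difference between the two orderings concrete, here is a minimal control-flow sketch in Python. It is not the stepwisefit source and does not call MATLAB. The `p_value(model, term)` callback, the `remove_after_each_add` flag, and the toy p-value table are all hypothetical, and the numbers were invented so that the two orderings end up with different models. A real implementation would compute each p-value from an F-test on fitted models rather than look it up.

```python
def _best_to_add(model, terms, p_value, p_enter):
    """Excluded term with the smallest p-value, if below p_enter; else None."""
    outside = [t for t in terms if t not in model]
    if not outside:
        return None
    current = frozenset(model)
    best = min(outside, key=lambda t: p_value(current, t))
    return best if p_value(current, best) < p_enter else None


def _worst_to_remove(model, p_value, p_remove):
    """Included term with the largest p-value, if above p_remove; else None."""
    if not model:
        return None

    def p_in(t):
        # A term already in the model is tested against the model without it.
        return p_value(frozenset(model - {t}), t)

    worst = max(sorted(model), key=p_in)
    return worst if p_in(worst) > p_remove else None


def stepwise(terms, p_value, p_enter=0.05, p_remove=0.10,
             remove_after_each_add=False, max_steps=1000):
    """Return (sorted selected terms, history of ("add"/"remove", term) steps).

    remove_after_each_add=False follows the order in the quoted doc:
    keep adding until nothing qualifies, then try one removal, then add again.
    remove_after_each_add=True is the order described in the question:
    a removal check follows every addition.
    """
    model, history = set(), []
    for _ in range(max_steps):  # guard against cycling with odd p-value inputs
        added = _best_to_add(model, terms, p_value, p_enter)
        if added is not None:
            model.add(added)
            history.append(("add", added))
            if not remove_after_each_add:
                continue  # doc step 2: repeat the forward check
        removed = _worst_to_remove(model, p_value, p_remove)
        if removed is not None:
            model.remove(removed)
            history.append(("remove", removed))
            continue  # doc step 3: go back to the forward check
        if added is None:
            break  # nothing added and nothing removed: done
    return sorted(model), history


# Hypothetical p-values: key is (terms already in the model, term being tested).
# frozenset("ab") is the set {"a", "b"}.
TOY_P = {
    (frozenset(), "a"): 0.01, (frozenset(), "b"): 0.03, (frozenset(), "c"): 0.20,
    (frozenset("a"), "b"): 0.02, (frozenset("a"), "c"): 0.04,
    (frozenset("b"), "a"): 0.30, (frozenset("b"), "c"): 0.08,
    (frozenset("c"), "a"): 0.02, (frozenset("c"), "b"): 0.01,
    (frozenset("ab"), "c"): 0.03, (frozenset("ac"), "b"): 0.01,
    (frozenset("bc"), "a"): 0.50,
}
TERMS = ["a", "b", "c"]


def toy_p_value(model, term):
    return TOY_P[(model, term)]


print(stepwise(TERMS, toy_p_value))                              # model ['b', 'c']
print(stepwise(TERMS, toy_p_value, remove_after_each_add=True))  # model ['b']
```

With this made-up table, the documented order adds a, b and c before it removes a, so c stays in the model because its p-value of 0.08 is below the exit tolerance. The remove-after-each-add order drops a before c is ever considered, and c then fails the stricter entrance tolerance. The two orderings are therefore different but equally legitimate procedures, and they can disagree whenever a term's p-value falls between the two tolerances.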
