mdl1 = removeTerms(mdl,terms)
mdl — Full, fitted linear regression model
terms — Terms to remove from regression model
formula string | matrix
Terms to remove from the regression model, specified as one of the following:
Formula string representing one or more terms to remove. For details, see Wilkinson Notation.
Row or rows in the terms matrix (see the description in the fitting function). For example, if there are three variables A, B, and C:
[0 0 0] represents a constant term or intercept
[0 1 0] represents B; equivalently, A^0 * B^1 * C^0
[1 0 1] represents A*C
[2 0 0] represents A^2
[0 1 2] represents B*(C^2)
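The row-to-term mapping above can be sketched in a few lines of base MATLAB. This is only an illustration: the variable names `vars`, `T`, and `names` are hypothetical, and the loop is not how the toolbox itself builds term names.

```matlab
% Illustrative sketch (hypothetical helper code, not a toolbox function):
% translate each row of a terms matrix into a readable term.
vars = {'A','B','C'};                          % one variable per column
T = [0 0 0; 0 1 0; 1 0 1; 2 0 0; 0 1 2];       % rows from the list above
names = cell(size(T,1),1);
for i = 1:size(T,1)
    parts = {};
    for j = 1:numel(vars)
        if T(i,j) == 1
            parts{end+1} = vars{j};                           % first power
        elseif T(i,j) > 1
            parts{end+1} = sprintf('%s^%d', vars{j}, T(i,j)); % higher power
        end
    end
    if isempty(parts)
        names{i} = '1';                        % all-zero row: intercept
    else
        names{i} = strjoin(parts, '*');        % product of the factors
    end
end
disp(names)
```

Each row holds the exponent of every variable in one term, so an all-zero row is the intercept and a row with several nonzero entries is an interaction.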
Wilkinson notation describes the factors present in models. The notation relates to factors present in models, not to the multipliers (coefficients) of those factors.
|Wilkinson Notation|Factors in Standard Notation|
|1|Constant (intercept) term|
|-B|Do not include B|
Statistics and Machine Learning Toolbox™ notation always includes a constant term unless you explicitly remove the term using -1.
For details, see Wilkinson and Rogers [1].
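As a minimal sketch of the constant-term rule, the following fits the Hald data twice, once with the implicit intercept and once with the intercept removed in the formula. It assumes the default variable names `x1`, `x2`, ... and `y` that `fitlm` assigns to matrix input; the names `mdlWithInt` and `mdlNoInt` are chosen here for illustration.

```matlab
% Sketch: the intercept is implicit unless the formula subtracts it.
load hald
X = ingredients;   % predictor variables
y = heat;          % response

mdlWithInt = fitlm(X, y, 'y ~ x1 + x2');      % constant term included by default
mdlNoInt   = fitlm(X, y, 'y ~ x1 + x2 - 1');  % constant term explicitly removed

disp(mdlWithInt.CoefficientNames)
disp(mdlNoInt.CoefficientNames)
```

Comparing the two `CoefficientNames` lists is a quick way to confirm whether a formula kept or dropped the intercept.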
Construct a default linear model of the Hald data. Remove terms with high p-values.
Load the data.
load hald
X = ingredients; % predictor variables
y = heat; % response
Fit a default linear model to the data.
mdl = fitlm(X,y)
mdl = 
Linear regression model:
    y ~ 1 + x1 + x2 + x3 + x4

Estimated Coefficients:
                   Estimate      SE         tStat       pValue 
                   ________    _______    ________    ________
    (Intercept)      62.405     70.071      0.8906     0.39913
    x1               1.5511    0.74477      2.0827    0.070822
    x2              0.51017    0.72379     0.70486      0.5009
    x3              0.10191    0.75471     0.13503     0.89592
    x4             -0.14406    0.70905    -0.20317     0.84407

Number of observations: 13, Error degrees of freedom: 8
Root Mean Squared Error: 2.45
R-squared: 0.982,  Adjusted R-Squared 0.974
F-statistic vs. constant model: 111, p-value = 4.76e-07

Remove the x3 and x4 terms because their p-values are so high.
terms = 'x3 + x4'; % terms to remove
mdl1 = removeTerms(mdl, terms)
mdl1 = 
Linear regression model:
    y ~ 1 + x1 + x2

Estimated Coefficients:
                   Estimate       SE        tStat       pValue  
                   ________    ________    ______    __________
    (Intercept)      52.577      2.2862    22.998    5.4566e-10
    x1               1.4683      0.1213    12.105    2.6922e-07
    x2              0.66225    0.045855    14.442     5.029e-08

Number of observations: 13, Error degrees of freedom: 10
Root Mean Squared Error: 2.41
R-squared: 0.979,  Adjusted R-Squared 0.974
F-statistic vs. constant model: 230, p-value = 4.41e-09
The new model has the same adjusted R-squared value (0.974) as the previous model, meaning it fits the data about as well with two fewer terms. All the terms in the new model have extremely low p-values.
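The same removal can also be expressed with rows of the terms matrix instead of a formula string. The sketch below takes the rows from the fitted model itself through `mdl.Formula.Terms`, so it does not hard-code the column count. It assumes the rows are ordered as in the coefficient table (intercept, x1, x2, x3, x4) and that `removeTerms` accepts rows in the same layout the model stores. Check both against your release before relying on it.

```matlab
% Sketch under stated assumptions: remove x3 and x4 by terms-matrix rows.
load hald
X = ingredients;   % predictor variables
y = heat;          % response
mdl = fitlm(X, y);

T = mdl.Formula.Terms;   % one row per term in the fitted model
rows = T(4:5, :);        % assumed to be the x3 and x4 rows
mdl2 = removeTerms(mdl, rows);

disp(mdl2.CoefficientNames)
```

If the assumptions hold, `mdl2` should match the model obtained with the formula string 'x3 + x4'.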
[1] Wilkinson, G. N., and C. E. Rogers. Symbolic description of factorial models for analysis of variance. J. Royal Statistical Society 22, pp. 392–399, 1973.
Use stepwiselm to select a model from a starting model, continuing until no single step is beneficial.
Use addTerms to add particular terms.
Use step to optimally improve the model by adding or removing terms.