mdl1 = removeTerms(mdl,terms)
Terms to remove from the mdl regression model. Specify as either a:
Linear model, the same as mdl but with the specified terms removed. To overwrite mdl, assign the output to mdl instead of mdl1.
Wilkinson notation describes the factors present in a model. It specifies which factors appear, not the multipliers (coefficients) of those factors.
|Wilkinson Notation||Factors in Standard Notation|
|1||Constant (intercept) term|
|A^k, where k is a positive integer||A, A^2, ..., A^k|
|A + B||A, B|
|A*B||A, B, A*B|
|-B||Do not include B|
|A*B + C||A, B, C, A*B|
|A + B + C + A:B||A, B, C, A*B|
|A*B*C - A:B:C||A, B, C, A*B, A*C, B*C|
|A*(B + C)||A, B, C, A*B, A*C|
Statistics Toolbox™ notation always includes a constant term unless you explicitly remove the term using -1.
For details, see Wilkinson and Rogers.
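As a rough illustration of the expansion rules in the table, the following sketch fits two models whose formulas should be equivalent, then drops the constant term with -1. It assumes the Hald data and the default variable names (x1, ..., x4, y) that fitlm assigns to matrix input. The names mdlA, mdlB, mdlC, sameTerms, and hasIntercept are illustrative only.

```matlab
load hald
X = ingredients;   % predictor variables
y = heat;          % response

% A*B + C should expand to A, B, C, A*B
mdlA = fitlm(X, y, 'y ~ x1*x2 + x3');
mdlB = fitlm(X, y, 'y ~ x1 + x2 + x3 + x1:x2');

% Compare the coefficient names of the two fits, ignoring order
sameTerms = isequal(sort(mdlA.CoefficientNames), sort(mdlB.CoefficientNames));

% -1 removes the constant term that is otherwise always included
mdlC = fitlm(X, y, 'y ~ x1 + x2 - 1');
hasIntercept = any(strcmp(mdlC.CoefficientNames, '(Intercept)'));
```

If the two formulas expand as the table describes, sameTerms is true, and hasIntercept is false for the model fitted with -1.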
Construct a default linear model of the Hald data. Remove terms with high p-values.
Load the data.
load hald
X = ingredients; % predictor variables
y = heat; % response
Fit a default linear model to the data.
mdl = fitlm(X,y)
mdl =

Linear regression model:
    y ~ 1 + x1 + x2 + x3 + x4

Estimated Coefficients:
                   Estimate    SE         tStat       pValue
    (Intercept)    62.405      70.071     0.8906      0.39913
    x1             1.5511      0.74477    2.0827      0.070822
    x2             0.51017     0.72379    0.70486     0.5009
    x3             0.10191     0.75471    0.13503     0.89592
    x4             -0.14406    0.70905    -0.20317    0.84407

Number of observations: 13, Error degrees of freedom: 8
Root Mean Squared Error: 2.45
R-squared: 0.982, Adjusted R-Squared 0.974
F-statistic vs. constant model: 111, p-value = 4.76e-07
Remove the x3 and x4 terms because their p-values (about 0.90 and 0.84) are far above any conventional significance level.
terms = 'x3 + x4'; % terms to remove
mdl1 = removeTerms(mdl, terms)
mdl1 =

Linear regression model:
    y ~ 1 + x1 + x2

Estimated Coefficients:
                   Estimate    SE          tStat     pValue
    (Intercept)    52.577      2.2862      22.998    5.4566e-10
    x1             1.4683      0.1213      12.105    2.6922e-07
    x2             0.66225     0.045855    14.442    5.029e-08

Number of observations: 13, Error degrees of freedom: 10
Root Mean Squared Error: 2.41
R-squared: 0.979, Adjusted R-Squared 0.974
F-statistic vs. constant model: 230, p-value = 4.41e-09
The new model has the same adjusted R-Squared value (0.974) as the previous model, meaning it is about as good a fit. All the terms in the new model have extremely low p-values.
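The selection of terms can also be done programmatically. The sketch below is one possible approach, not a documented procedure: it collects the coefficients whose p-value exceeds a cutoff, joins their names into a terms string, and passes that string to removeTerms. The cutoff of 0.8 is an arbitrary assumption chosen to fit this example, and using coefficient names as term names works here only because every predictor is continuous and the model has no interactions.

```matlab
load hald
mdl = fitlm(ingredients, heat);

% p-values and names of all coefficients except the intercept
p = mdl.Coefficients.pValue(2:end)';
names = mdl.CoefficientNames(2:end);

% Keep the names whose p-value exceeds the (assumed) cutoff
cutoff = 0.8;
highP = names(p > cutoff);

% Build a Wilkinson-style string such as 'x3 + x4' and remove those terms
terms = strjoin(highP, ' + ');
mdl1 = removeTerms(mdl, terms);

% Compare adjusted R-squared before and after the removal
adjBefore = mdl.Rsquared.Adjusted;
adjAfter = mdl1.Rsquared.Adjusted;
```

Comparing adjBefore with adjAfter gives the same check as reading the adjusted R-squared values from the two model displays.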
Wilkinson, G. N., and C. E. Rogers. Symbolic description of factorial models for analysis of variance. J. Royal Statistical Society 22, pp. 392–399, 1973.
Use stepwiselm to select a model from a starting model, continuing until no single step is beneficial.
Use addTerms to add particular terms.
Use step to optimally improve the model by adding or removing terms.