removeTerms

Class: LinearModel

Remove terms from linear model

Syntax

mdl1 = removeTerms(mdl,terms)

Description

mdl1 = removeTerms(mdl,terms) returns a linear model that is identical to mdl except that the specified terms are removed.

Input Arguments

mdl

Linear model, as constructed by fitlm or stepwiselm.

terms

Terms to remove from the regression model mdl. Specify terms as one of the following:

  • Text string representing one or more terms to remove. For details, see Wilkinson Notation.

  • Row or rows in the terms matrix (see modelspec in fitlm). For example, if there are three variables A, B, and C:

    [0 0 0] represents a constant term or intercept
    [0 1 0] represents B; equivalently, A^0 * B^1 * C^0
    [1 0 1] represents A*C
    [2 0 0] represents A^2
    [0 1 2] represents B*(C^2)
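
To make the row encoding concrete, the following sketch evaluates the term that each row describes at a single sample point. The values assigned to A, B, and C are made up for illustration, and only base MATLAB functions are used; it does not call removeTerms.

```matlab
% Hypothetical sample values for the three variables A, B, and C.
vals = [2 3 4];            % A = 2, B = 3, C = 4

% Each row holds the exponents of A, B, and C for one term.
T = [0 0 0;                % constant term
     0 1 0;                % B
     1 0 1;                % A*C
     2 0 0;                % A^2
     0 1 2];               % B*(C^2)

% Raise each variable to the exponent in its column, then multiply across
% each row. bsxfun keeps the sketch compatible with older MATLAB releases.
termValues = prod(bsxfun(@power, vals, T), 2);
disp(termValues')
```

A zero exponent contributes a factor of 1, which is why the all-zero row corresponds to the constant term.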

Output Arguments

mdl1

Linear model, the same as mdl but with terms removed. To overwrite mdl, assign the output to mdl, as in mdl = removeTerms(mdl,terms).

Definitions

Wilkinson Notation

Wilkinson notation describes the factors present in models. The notation relates to factors present in models, not to the multipliers (coefficients) of those factors.

Wilkinson Notation                    Factors in Standard Notation
1                                     Constant (intercept) term
A^k, where k is a positive integer    A, A^2, ..., A^k
A + B                                 A, B
A*B                                   A, B, A*B
A:B                                   A*B only
-B                                    Do not include B
A*B + C                               A, B, C, A*B
A + B + C + A:B                       A, B, C, A*B
A*B*C - A:B:C                         A, B, C, A*B, A*C, B*C
A*(B + C)                             A, B, C, A*B, A*C

Statistics Toolbox™ notation always includes a constant term unless you explicitly remove the term using -1.

For details, see Wilkinson and Rogers [1].
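
As a rough illustration of how the notation expands, the sketch below fits a model with the formula y ~ A*B + C and then removes the interaction using a Wilkinson string. The table, its variable names, and the random data are hypothetical and exist only to make the example self-contained.

```matlab
% Hypothetical data: three numeric predictors A, B, C and a response y.
rng(0);                                    % fix the seed so the run is repeatable
tbl = table(randn(20,1), randn(20,1), randn(20,1), randn(20,1), ...
    'VariableNames', {'A','B','C','y'});

% A*B expands to A, B, and A:B; the intercept is included by default.
mdlFull = fitlm(tbl, 'y ~ A*B + C');
disp(mdlFull.CoefficientNames)

% Remove only the interaction term, keeping the main effects.
mdlMain = removeTerms(mdlFull, 'A:B');
disp(mdlMain.CoefficientNames)
```

Comparing the two lists of coefficient names shows which factors each formula actually put into the model.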

Examples

Remove Terms from Model

Construct a default linear model of the Hald data. Remove terms with high p-values.

Load the data.

load hald
X = ingredients; % predictor variables
y = heat; % response

Fit a default linear model to the data.

mdl = fitlm(X,y)
mdl = 

Linear regression model:
    y ~ 1 + x1 + x2 + x3 + x4

Estimated Coefficients:
                   Estimate    SE         tStat       pValue  
    (Intercept)      62.405     70.071      0.8906     0.39913
    x1               1.5511    0.74477      2.0827    0.070822
    x2              0.51017    0.72379     0.70486      0.5009
    x3              0.10191    0.75471     0.13503     0.89592
    x4             -0.14406    0.70905    -0.20317     0.84407

Number of observations: 13, Error degrees of freedom: 8
Root Mean Squared Error: 2.45
R-squared: 0.982,  Adjusted R-Squared 0.974
F-statistic vs. constant model: 111, p-value = 4.76e-07

Remove the x3 and x4 terms because their p-values are so high.

terms = 'x3 + x4'; % terms to remove
mdl1 = removeTerms(mdl, terms)
mdl1 = 


Linear regression model:
    y ~ 1 + x1 + x2

Estimated Coefficients:
                   Estimate    SE          tStat     pValue    
    (Intercept)     52.577       2.2862    22.998    5.4566e-10
    x1              1.4683       0.1213    12.105    2.6922e-07
    x2             0.66225     0.045855    14.442     5.029e-08


Number of observations: 13, Error degrees of freedom: 10
Root Mean Squared Error: 2.41
R-squared: 0.979,  Adjusted R-Squared 0.974
F-statistic vs. constant model: 230, p-value = 4.41e-09

The new model has the same adjusted R-squared value (0.974) as the previous model, so it fits the data about as well while using two fewer predictors. All the terms in the new model have extremely low p-values.
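
The same comparison can be made programmatically instead of by reading the displays. This is a minimal sketch that repeats the fit above and reads the values from the Rsquared and Coefficients properties of each model; the printed formatting is an arbitrary choice.

```matlab
load hald                                  % provides ingredients and heat
mdl  = fitlm(ingredients, heat);           % full model: y ~ 1 + x1 + x2 + x3 + x4
mdl1 = removeTerms(mdl, 'x3 + x4');        % reduced model: y ~ 1 + x1 + x2

% Adjusted R-squared of each fit, read from the Rsquared property.
fprintf('Adjusted R-squared, full model:    %.3f\n', mdl.Rsquared.Adjusted);
fprintf('Adjusted R-squared, reduced model: %.3f\n', mdl1.Rsquared.Adjusted);

% Largest coefficient p-value remaining in the reduced model.
fprintf('Largest p-value, reduced model:    %.3g\n', max(mdl1.Coefficients.pValue));
```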

References

[1] Wilkinson, G. N., and C. E. Rogers. Symbolic description of factorial models for analysis of variance. J. Royal Statistical Society, Series C (Applied Statistics) 22, pp. 392–399, 1973.

Alternatives

Use stepwiselm to select a model from a starting model, continuing until no single step is beneficial.

Use addTerms to add particular terms.

Use step to optimally improve the model by adding or removing terms.
