MATLAB Examples

Assess Model Assumptions Using Residuals

This example shows how to assess the model assumptions by examining the residuals of a fitted linear regression model.

Load the sample data and store the independent and response variables in a table.

load imports-85
tbl = table(X(:,7),X(:,8),X(:,9),X(:,15),'VariableNames',...
    {'curb_weight','engine_size','bore','price'});

Fit a linear regression model.

mdl = fitlm(tbl)
mdl = 


Linear regression model:
    price ~ 1 + curb_weight + engine_size + bore

Estimated Coefficients:
                    Estimate        SE         tStat       pValue  
                   __________    _________    _______    __________

    (Intercept)        64.095        3.703     17.309    2.0481e-41
    curb_weight    -0.0086681    0.0011025    -7.8623      2.42e-13
    engine_size     -0.015806     0.013255    -1.1925       0.23452
    bore              -2.6998       1.3489    -2.0015      0.046711


Number of observations: 201, Error degrees of freedom: 197
Root Mean Squared Error: 3.95
R-squared: 0.674,  Adjusted R-Squared 0.669
F-statistic vs. constant model: 136, p-value = 1.14e-47
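
The summary values shown in the display can also be read from the fitted model object, which is convenient if you want to reuse them in later computations. The following is a minimal sketch, assuming the imports-85 sample data set is on the MATLAB path; the variable names nObs, dfe, rmse, r2, and r2adj are illustrative and not part of the original example.

```matlab
% Refit the model from this example so the snippet is self-contained.
load imports-85
tbl = table(X(:,7),X(:,8),X(:,9),X(:,15),'VariableNames',...
    {'curb_weight','engine_size','bore','price'});
mdl = fitlm(tbl);

% Read the summary statistics from the LinearModel object.
nObs  = mdl.NumObservations;     % number of rows used in the fit
dfe   = mdl.DFE;                 % error degrees of freedom
rmse  = mdl.RMSE;                % root mean squared error
r2    = mdl.Rsquared.Ordinary;   % R-squared
r2adj = mdl.Rsquared.Adjusted;   % adjusted R-squared

fprintf('n = %d, DFE = %d, RMSE = %.3g, R^2 = %.3f, adj. R^2 = %.3f\n', ...
    nObs, dfe, rmse, r2, r2adj);
```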

Plot the histogram of raw residuals.

plotResiduals(mdl)

The histogram shows that the residuals are slightly right-skewed.

Plot the box plot of all four types of residuals (raw, Pearson, studentized, and standardized).

Res = table2array(mdl.Residuals);
boxplot(Res)

You can see the right-skewed structure of the residuals in the box plot as well.

Plot the normal probability plot of the raw residuals.

plotResiduals(mdl,'probability')

This normal probability plot also shows the deviation from normality and the skewness on the right tail of the distribution of residuals.
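
If you want a numeric complement to the visual impression of skewness and non-normality, one option is to compute the sample skewness of the raw residuals and run a normality test on them. The sketch below uses skewness and lillietest from Statistics and Machine Learning Toolbox. The choice of the Lilliefors test is an assumption made for illustration; the original example relies on the plots only.

```matlab
% Refit the model from this example so the snippet is self-contained.
load imports-85
tbl = table(X(:,7),X(:,8),X(:,9),X(:,15),'VariableNames',...
    {'curb_weight','engine_size','bore','price'});
mdl = fitlm(tbl);

% Raw residuals; rows excluded from the fit (missing data) appear as NaN.
r = mdl.Residuals.Raw;
r = r(~isnan(r));

% Sample skewness: positive values indicate a longer right tail.
sk = skewness(r);

% Lilliefors test of the null hypothesis that the residuals are normal.
% h = 1 means the null is rejected at the default 5% significance level.
[h,p] = lillietest(r);

fprintf('skewness = %.3f, Lilliefors h = %d, p = %.3g\n', sk, h, p);
```

A rejection here is consistent with the plots but does not by itself say how serious the departure from normality is; the plots remain the better guide to its shape.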

Plot the residuals versus lagged residuals.

plotResiduals(mdl,'lagged')

This graph shows a trend, which indicates a possible correlation among the residuals. You can further check this using dwtest(mdl). Serial correlation among residuals usually means that the model can be improved.
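
The check mentioned above can be sketched as follows. dwtest applied to the fitted model returns the p-value and the Durbin-Watson statistic. The lag-1 correlation computed afterward is an extra illustration that is not part of the original example; it drops the NaN residuals first, so a few pairs may not be adjacent rows in the original data.

```matlab
% Refit the model from this example so the snippet is self-contained.
load imports-85
tbl = table(X(:,7),X(:,8),X(:,9),X(:,15),'VariableNames',...
    {'curb_weight','engine_size','bore','price'});
mdl = fitlm(tbl);

% Durbin-Watson test of the null hypothesis that the residuals are
% uncorrelated. A statistic near 2 suggests no first-order correlation.
[p,dw] = dwtest(mdl);

% Rough lag-1 sample correlation of the residuals in row order.
r = mdl.Residuals.Raw;
r = r(~isnan(r));
rho1 = corr(r(1:end-1), r(2:end));

fprintf('Durbin-Watson = %.3f (p = %.3g), lag-1 correlation = %.3f\n', ...
    dw, p, rho1);
```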

Plot the symmetry plot of residuals.

plotResiduals(mdl,'symmetry')

This plot also suggests that the residuals are not distributed symmetrically around their median, as they would be under a normal distribution.
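
To make the idea behind a symmetry plot concrete, the sketch below pairs each residual below the median with its counterpart above the median and compares their distances from the median. It illustrates the concept under that assumption and is not necessarily how plotResiduals computes its plot; the names lower, upper, and meanExcess are illustrative.

```matlab
% Refit the model from this example so the snippet is self-contained.
load imports-85
tbl = table(X(:,7),X(:,8),X(:,9),X(:,15),'VariableNames',...
    {'curb_weight','engine_size','bore','price'});
mdl = fitlm(tbl);

% Sorted raw residuals without the NaN entries of excluded rows.
r = sort(mdl.Residuals.Raw(~isnan(mdl.Residuals.Raw)));
med = median(r);
k = floor(numel(r)/2);

% Distances from the median: i-th smallest versus i-th largest residual.
lower = med - r(1:k);
upper = r(end:-1:end-k+1) - med;

% For a symmetric distribution the points fall near the line upper = lower.
% A positive mean excess means the upper tail is longer (right skew).
meanExcess = mean(upper - lower);

plot(lower, upper, 'o')
hold on
lim = max([lower; upper]);
plot([0 lim], [0 lim], '-')      % reference line for perfect symmetry
hold off
xlabel('Distance below median')
ylabel('Distance above median')
```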

Plot the residuals versus the fitted values.

plotResiduals(mdl,'fitted')

The increase in the variance as the fitted values increase suggests possible heteroscedasticity.
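
One informal way to back up this visual impression is to check whether the size of the residuals tends to grow with the fitted values. The sketch below uses the Spearman rank correlation between the fitted values and the absolute raw residuals. This is a heuristic chosen for illustration, not a formal heteroscedasticity test, and it is not part of the original example.

```matlab
% Refit the model from this example so the snippet is self-contained.
load imports-85
tbl = table(X(:,7),X(:,8),X(:,9),X(:,15),'VariableNames',...
    {'curb_weight','engine_size','bore','price'});
mdl = fitlm(tbl);

% Fitted values and raw residuals; keep only rows where both are available.
yhat = mdl.Fitted;
r = mdl.Residuals.Raw;
keep = ~isnan(yhat) & ~isnan(r);

% A clearly positive rank correlation means the residual spread tends to
% increase with the fitted values.
[rho,pval] = corr(yhat(keep), abs(r(keep)), 'Type','Spearman');

fprintf('Spearman rho = %.3f (p = %.3g)\n', rho, pval);
```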