MATLAB Examples

Cross Validate a Regression Tree

This example shows how to examine the resubstitution and cross-validation accuracy of a regression tree for predicting mileage based on the carsmall data.

Load the carsmall data set. Consider acceleration, displacement, horsepower, and weight as predictors of MPG.

```
load carsmall
X = [Acceleration Displacement Horsepower Weight];
```

Grow a regression tree using all of the observations.

```rtree = fitrtree(X,MPG); ```

Compute the in-sample error.

```resuberror = resubLoss(rtree) ```
```resuberror = 4.7188 ```

The resubstitution loss for a regression tree is the mean-squared error. The resulting value indicates that a typical predictive error for the tree is about the square root of 4.7, or a bit over 2.
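As a rough check on that interpretation, the sketch below recomputes the in-sample mean-squared error from the tree's own training predictions and takes its square root. The variable names `yhat`, `manualMSE`, and `rmse` are illustrative, and the comparison assumes the default equal observation weights.

```matlab
load carsmall
X = [Acceleration Displacement Horsepower Weight];
rtree = fitrtree(X,MPG);

resuberror = resubLoss(rtree);      % in-sample MSE reported by the tree

% Recompute the same quantity by hand from the stored training responses.
yhat = resubPredict(rtree);         % predictions on the training data
manualMSE = mean((rtree.Y - yhat).^2,'omitnan');   % omitnan guards against missing responses

% The square root puts the error back in the units of MPG.
rmse = sqrt(resuberror);
```

If the assumptions hold, `manualMSE` should agree with `resuberror` up to rounding, and `rmse` is the "typical error" quoted above.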

Estimate the cross-validation MSE.

```
rng 'default';
cvrtree = crossval(rtree);
cvloss = kfoldLoss(cvrtree)
```
```cvloss = 23.8065 ```

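The cross-validated loss is about 24, meaning a typical predictive error for the tree on new data is about 5 (the square root of 23.8 is roughly 4.9). This illustrates that cross-validated loss is usually higher than simple resubstitution loss, because resubstitution scores the tree on the same observations used to grow it.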
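To make the cross-validation step more concrete, here is a minimal sketch of a 10-fold estimate written as an explicit loop with `cvpartition`. It is an illustration under stated assumptions, not a description of how `crossval` is implemented internally: rows with a missing `MPG` are dropped up front, the held-out squared errors are pooled across folds, and the names `keep`, `y`, `sse`, `foldTree`, and `manualLoss` are made up for this example. The fold assignment will not necessarily match the one used above, so the result may differ somewhat from `cvloss`.

```matlab
load carsmall
X = [Acceleration Displacement Horsepower Weight];

% Drop rows with a missing response so every held-out error is defined.
keep = ~isnan(MPG);
X = X(keep,:);
y = MPG(keep);

rng('default')                            % reproducible fold assignment
c = cvpartition(numel(y),'KFold',10);     % 10 folds, the crossval default

sse = 0;                                  % running sum of squared held-out errors
for k = 1:c.NumTestSets
    trainIdx = training(c,k);             % logical index of training rows
    testIdx = test(c,k);                  % logical index of held-out rows
    foldTree = fitrtree(X(trainIdx,:),y(trainIdx));
    yhat = predict(foldTree,X(testIdx,:));
    sse = sse + sum((y(testIdx) - yhat).^2);
end

manualLoss = sse/numel(y);                % pooled held-out MSE
manualRMSE = sqrt(manualLoss);            % typical error in units of MPG
```

Each tree in the loop is scored only on observations it never saw during training, which is why this kind of estimate tends to be larger than the resubstitution loss.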