oobError

Class: TreeBagger

Out-of-bag error

Syntax

```
err = oobError(B)
err = oobError(B,'param1',val1,'param2',val2,...)
```

Description

`err = oobError(B)` computes the misclassification probability (for classification trees) or mean squared error (for regression trees) for out-of-bag observations in the training data, using the trained bagger `B`. `err` is a vector of length `NTrees`, where `NTrees` is the number of trees in the ensemble.
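The following is a minimal usage sketch rather than an official example. It assumes the Statistics and Machine Learning Toolbox and its bundled `fisheriris` sample data, and it assumes the ensemble was trained with `'OOBPrediction','on'` so that out-of-bag information is stored in `B` (this page does not state that requirement). The tree count of 50 and the random seed are arbitrary choices.

```matlab
% Sketch: out-of-bag classification error as the ensemble grows.
load fisheriris                      % provides meas (150-by-4) and species (150-by-1 cell)
rng(1)                               % fix the seed so the bootstrap samples are reproducible

% Assumption: out-of-bag predictions must be enabled at training time.
B = TreeBagger(50, meas, species, 'OOBPrediction', 'on');

err = oobError(B);                   % default mode: one cumulative error per tree count

plot(err)
xlabel('Number of grown trees')
ylabel('Out-of-bag classification error')
```

Because the default mode is cumulative, `err(k)` is the error of the ensemble made of the first `k` trees, so the plot shows how the error changes as trees are added.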

`err = oobError(B,'param1',val1,'param2',val2,...)` specifies optional parameter name/value pairs:

• `'Mode'`: Character vector indicating how `oobError` computes errors. If set to `'cumulative'` (default), the method computes cumulative errors and `err` is a vector of length `NTrees`, where the first element gives error from `trees(1)`, second element gives error from `trees(1:2)` etc., up to `trees(1:NTrees)`. If set to `'individual'`, `err` is a vector of length `NTrees`, where each element is an error from each tree in the ensemble. If set to `'ensemble'`, `err` is a scalar showing the cumulative error for the entire ensemble.

• `'Trees'`: Vector of indices indicating what trees to include in this calculation. By default, this argument is set to `'all'` and the method uses all trees. If `'Trees'` is a numeric vector, the method returns a vector of length `NTrees` for `'cumulative'` and `'individual'` modes, where `NTrees` is the number of elements in the input vector, and a scalar for `'ensemble'` mode. For example, in the `'cumulative'` mode, the first element gives error from `trees(1)`, the second element gives error from `trees(1:2)` etc.

• `'TreeWeights'`: Vector of tree weights. This vector must have the same length as the `'Trees'` vector. `oobError` uses these weights to combine output from the specified trees by taking a weighted average instead of the simple nonweighted majority vote. You cannot use this argument in the `'individual'` mode.
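As a hedged illustration of the `'Mode'` and `'Trees'` arguments, the sketch below trains a small ensemble and requests each kind of output. It again assumes the Statistics and Machine Learning Toolbox, the `fisheriris` sample data, and training with `'OOBPrediction','on'`. The tree count and the subset `1:10` are arbitrary.

```matlab
% Sketch: output sizes for the different modes and for a subset of trees.
load fisheriris
rng(1)
B = TreeBagger(30, meas, species, 'OOBPrediction', 'on');

errCum = oobError(B);                          % 30 elements: trees(1), trees(1:2), ...
errInd = oobError(B, 'Mode', 'individual');    % 30 elements: one error per tree
errEns = oobError(B, 'Mode', 'ensemble');      % scalar: error of the whole ensemble

% Restrict the calculation to the first 10 trees.
errSub = oobError(B, 'Mode', 'cumulative', 'Trees', 1:10);   % 10 elements
```

Cumulative output is convenient for choosing the ensemble size, individual output for spotting unusually weak trees, and ensemble output when only a single summary number is needed.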

Algorithms

`oobError` estimates the weighted ensemble error for out-of-bag observations. That is, `oobError` applies `error` to the training data stored in the input `TreeBagger` model `B`, and selects the out-of-bag observations for each tree to compose the ensemble error.

• `B.X` and `B.Y` are the training data predictors and responses, respectively.

• `B.OOBIndices` specifies which observations are out-of-bag for each tree in the ensemble.

• `B.W` specifies the observation weights.

• Optionally:

• Using the `'Mode'` name-value pair argument, you can have `oobError` return the individual, weighted error for each tree or the weighted error for the entire ensemble. By default, `oobError` returns the cumulative, weighted ensemble error.

• Using the `'Trees'` name-value pair argument, you can choose which trees to use in the ensemble error calculations.

• Using the `'TreeWeights'` name-value pair argument, you can assign a weight to each tree.

`oobError` applies the algorithms described below. For more details, see `error` and `predict`.

For regression problems, `oobError` returns the weighted MSE.

1. `oobError` predicts responses for all out-of-bag observations.

2. The MSE estimate depends on the value of `'Mode'`.

• If you specify `'Mode','Individual'`, then `oobError` sets the predicted response of any observation that is in bag for a selected tree to the weighted sample average of the observed training data responses. Then, `oobError` computes the weighted MSE for each selected tree.

• If you specify `'Mode','Cumulative'`, then `oobError` returns a vector of cumulative, weighted MSEs, where $\mathrm{MSE}_t$ is the cumulative, weighted MSE for selected tree $t$. To compute $\mathrm{MSE}_t$, for each observation that is out of bag for at least one tree through tree $t$, `oobError` computes the cumulative, weighted mean of the predicted responses through tree $t$. For observations that are in bag for all selected trees through tree $t$, `oobError` sets the predicted response to the weighted sample average of the observed training data responses. Then, `oobError` computes $\mathrm{MSE}_t$.

• If you specify `'Mode','Ensemble'`, then, for each observation that is out of bag for at least one tree, `oobError` computes the weighted mean of the predicted responses over all selected trees. For observations that are in bag for all selected trees, `oobError` sets the predicted response to the weighted sample average of the observed training data responses. Then, `oobError` computes the weighted MSE, which is the same as the final, cumulative, weighted MSE.
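The three regression modes can be sketched with plain arrays, without a trained model. This is an illustration of the description above, not the toolbox's internal code. All variable names (`P`, `oob`, `w`, `tw`) and the toy numbers are hypothetical, and applying the tree weights inside the weighted mean is an assumption.

```matlab
% Toy data (hypothetical): 4 observations, 3 trees.
y   = [1; 2; 3; 4];                  % observed responses
w   = [1; 1; 1; 1];                  % observation weights
P   = [1.5 1.0 0.5;                  % P(i,t): prediction of tree t for observation i
       2.0 2.5 2.0;
       2.0 3.0 4.0;
       5.0 4.0 3.0];
oob = logical([1 0 1;                % oob(i,t): observation i is out of bag for tree t
               0 1 1;
               1 1 0;
               0 0 0]);              % observation 4 is in bag for every tree
tw  = [1 1 1];                       % tree weights

ybar = sum(w .* y) / sum(w);         % weighted sample average of the training responses
T    = size(P, 2);
mseInd = zeros(1, T);
mseCum = zeros(1, T);
for t = 1:T
    % 'individual': in-bag observations fall back to ybar.
    yhat = P(:, t);
    yhat(~oob(:, t)) = ybar;
    mseInd(t) = sum(w .* (y - yhat).^2) / sum(w);

    % 'cumulative': weighted mean over the trees 1..t that hold the observation out of bag.
    M    = double(oob(:, 1:t));
    num  = (P(:, 1:t) .* M) * tw(1:t)';
    den  = M * tw(1:t)';
    yhat = repmat(ybar, size(y));    % default for observations never out of bag so far
    seen = den > 0;
    yhat(seen) = num(seen) ./ den(seen);
    mseCum(t) = sum(w .* (y - yhat).^2) / sum(w);
end
mseEns = mseCum(end);                % 'ensemble': same as the final cumulative value
```

For this toy data, `mseInd` is `[0.9375 1.1875 0.6875]` and `mseCum` is `[0.9375 0.75 0.640625]`. The first elements agree because the cumulative ensemble through tree 1 is just tree 1.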

In classification problems, `oobError` returns the weighted misclassification rate.

1. `oobError` predicts classes for all out-of-bag observations.

2. The weighted misclassification rate estimate depends on the value of `'Mode'`.

• If you specify `'Mode','Individual'`, then `oobError` sets the predicted class of any observation that is in bag for a selected tree to the weighted, most popular class over all training responses. If there are multiple most popular classes, `error` considers the one listed first in the `ClassNames` property of the `TreeBagger` model the most popular. Then, `oobError` computes the weighted misclassification rate for each selected tree.

• If you specify `'Mode','Cumulative'`, then `oobError` returns a vector of cumulative, weighted misclassification rates, where $e_t^*$ is the cumulative, weighted misclassification rate for selected tree $t$. To compute $e_t^*$, for each observation that is out of bag for at least one tree through tree $t$, `oobError` finds the predicted, cumulative, weighted most popular class through tree $t$. For observations that are in bag for all selected trees through tree $t$, `oobError` sets the predicted class to the weighted, most popular class over all training responses. If there are multiple most popular classes, `error` considers the one listed first in the `ClassNames` property of the `TreeBagger` model the most popular. Then, `oobError` computes $e_t^*$.

• If you specify `'Mode','Ensemble'`, then, for each observation that is out of bag for at least one tree, `oobError` computes the weighted, most popular class over all selected trees. For observations that are in bag for all selected trees, `oobError` sets the predicted class to the weighted, most popular class over all training responses. If there are multiple most popular classes, `error` considers the one listed first in the `ClassNames` property of the `TreeBagger` model the most popular. Then, `oobError` computes the weighted misclassification rate, which is the same as the final, cumulative, weighted misclassification rate.
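The classification modes can be sketched the same way with plain arrays. This is an illustration of the description above under stated assumptions, not the toolbox's internal code. Classes are encoded as indices into a hypothetical `ClassNames` list, all variable names and toy values are made up, and the toy data is chosen so that no tie occurs among tree votes (how such vote ties are resolved is not described here).

```matlab
% Toy data (hypothetical): 4 observations, 3 trees, 2 classes.
y   = [1; 2; 2; 1];                  % true class indices
w   = [1; 1; 1; 1];                  % observation weights
C   = [1 2 2;                        % C(i,t): class predicted by tree t for observation i
       1 2 2;
       2 2 1;
       2 2 2];
oob = logical([1 0 1;                % oob(i,t): observation i is out of bag for tree t
               0 1 1;
               1 1 0;
               0 0 0]);              % observation 4 is in bag for every tree
tw  = [1 1 2];                       % tree weights
K   = 2;                             % number of classes

% Weighted most popular training class. Both classes have weight 2 here, and
% max returns the first index on ties, mirroring the ClassNames ordering rule.
classW = accumarray(y, w, [K 1]);
[~, defaultClass] = max(classW);

T = size(C, 2);
errInd = zeros(1, T);
errCum = zeros(1, T);
for t = 1:T
    % 'individual': in-bag observations get the default class.
    yhat = C(:, t);
    yhat(~oob(:, t)) = defaultClass;
    errInd(t) = sum(w .* (yhat ~= y)) / sum(w);

    % 'cumulative': weighted vote over the trees 1..t that hold the observation out of bag.
    votes = zeros(numel(y), K);
    for k = 1:K
        votes(:, k) = double(oob(:, 1:t) & (C(:, 1:t) == k)) * tw(1:t)';
    end
    [best, yhat] = max(votes, [], 2);
    yhat(best == 0) = defaultClass;  % never out of bag so far
    errCum(t) = sum(w .* (yhat ~= y)) / sum(w);
end
errEns = errCum(end);                % 'ensemble': same as the final cumulative value
```

For this toy data, `errInd` is `[0.25 0 0.5]` and `errCum` is `[0.25 0 0.25]`. The cumulative error rises again at tree 3 because that tree carries weight 2 and outvotes tree 1 on observation 1.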