Predict out-of-bag response of ensemble
[label,score] = oobPredict(ens,Name,Value)
A classification bagged ensemble, constructed with fitcensemble.
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Indices of weak learners in the ensemble ranging from
Classification labels of the same data type as the training data
Find the out-of-bag predictions and scores for the Fisher iris data. Find the scores with notable uncertainty in the resulting classifications.
Load the sample data set.
load fisheriris
Train an ensemble of bagged classification trees.
ens = fitcensemble(meas,species,'Method','Bag');
Find the out-of-bag predictions and scores.
[label,score] = oobPredict(ens);
Find the scores in the range (0.2,0.8). These scores have notable uncertainty in the resulting classifications.
unsure = ((score > .2) & (score < .8));
sum(sum(unsure)) % Number of uncertain predictions
ans = 10
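The count alone does not show which observations are affected. A minimal follow-up sketch, assuming the same Fisher iris setup as above, lists the rows that have at least one uncertain score. The variable name rows and the fixed random seed are illustrative choices, not part of the original example.

```matlab
load fisheriris
rng(1)  % assumption: fix the seed so the bootstrap replicas are repeatable
ens = fitcensemble(meas,species,'Method','Bag');
[label,score] = oobPredict(ens);   % score has one column per class in ens.ClassNames

unsure = (score > 0.2) & (score < 0.8);
rows = find(any(unsure,2));        % observations with at least one uncertain score

% Show the true class, the out-of-bag label, and the per-class scores together
table(species(rows),label(rows),score(rows,:), ...
    'VariableNames',{'TrueClass','OOBLabel','Scores'})
```

The exact rows listed depend on the random bootstrap replicas, so they can differ between runs and seeds.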
Bagging, which stands for “bootstrap aggregation”, is a type of ensemble learning. To bag a weak learner such as a decision tree on a dataset, fitcensemble generates many bootstrap replicas of the dataset and grows decision trees on these replicas. fitcensemble obtains each bootstrap replica by randomly selecting N observations out of N with replacement, where N is the dataset size. To find the predicted response of a trained ensemble, predict takes an average over predictions from individual trees.
Drawing N out of N observations with replacement omits on average 37% (1/e) of observations for each decision tree. These are "out-of-bag" observations.
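The 37% figure follows from the sampling scheme: the probability that a given observation is never picked in N draws with replacement is (1 - 1/N)^N, which approaches 1/e as N grows. The simulation below is a rough sketch that checks this empirically. The values of N and numReplicas are arbitrary choices for illustration.

```matlab
N = 150;               % dataset size (hypothetical; matches the Fisher iris data)
numReplicas = 1000;    % number of simulated bootstrap replicas (arbitrary)
rng(0)                 % fixed seed for repeatability

fracOOB = zeros(numReplicas,1);
for k = 1:numReplicas
    idx = randi(N,N,1);                      % draw N indices with replacement
    fracOOB(k) = 1 - numel(unique(idx))/N;   % share of observations never drawn
end

mean(fracOOB)     % empirical out-of-bag fraction
(1 - 1/N)^N       % expected fraction for this N
exp(-1)           % limiting value as N grows
```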
For each observation,
oobLoss estimates the out-of-bag
prediction by averaging over predictions from all trees in the ensemble
for which this observation is out of bag. It then compares the computed
prediction against the true response for this observation. It calculates
the out-of-bag error by comparing the out-of-bag predicted responses
against the true responses for all observations used for training.
This out-of-bag average is an unbiased estimator of the true ensemble error.
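As a rough illustration of the comparison described above, the sketch below computes the fraction of misclassified out-of-bag predictions by hand and places it next to the value returned by oobLoss. It assumes the default loss is the classification error and that observations are equally weighted, in which case the two numbers should be close. The name manualErr is introduced here for illustration.

```matlab
load fisheriris
rng(1)  % assumption: fixed seed for repeatability
ens = fitcensemble(meas,species,'Method','Bag');

label = oobPredict(ens);                    % out-of-bag predicted labels
manualErr = mean(~strcmp(label,species));   % fraction of misclassified observations
oobErr = oobLoss(ens);                      % built-in out-of-bag error estimate

[manualErr oobErr]
```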
For ensembles, a classification score represents the confidence of a classification into a class. The higher the score, the higher the confidence.
Different ensemble algorithms have different definitions for their scores. Furthermore, the range of scores depends on ensemble type. For example:
AdaBoostM1 scores range from –∞ to ∞.
Bag scores range from 0 to 1.