What fraction of input data is used for out-of-bag observations when creating a TREEBAGGER object using Statistics Toolbox 7.1 (R2009a)?

I have created a TREEBAGGER object setting 'oobvarimp' to 'on'. I want to determine what fraction of observations are used as out-of-bag observations.

Accepted Answer

MathWorks Support Team on 27 Aug 2009
For every tree, the bagger randomly selects N*bagger.FBoot out of N observations for training, with replacement by default. Observations that were not selected for a given tree are that tree's out-of-bag observations. If bagger.FBoot=1 (the default), on average roughly 2/3 of the input data (about 63% of the distinct observations for large N) is selected for training for every tree, and the remaining 1/3 (about 37%) is used as out-of-bag observations. This fraction fluctuates from one tree to another, and the out-of-bag observations for one tree are not identical to those for another tree.
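The "roughly 1/3" figure can be checked with a short calculation. The sketch below assumes each of the n = f*N draws (f standing for bagger.FBoot, a symbol introduced here for brevity) picks one of the N observations uniformly and independently, and it ignores any rounding of f*N to an integer. It is a large-N approximation, not a statement about TreeBagger's internals.

```latex
% Assumption: n = fN independent uniform draws with replacement from N observations.
% A given observation is out-of-bag for a tree if none of the n draws selects it.
\begin{align*}
P(\text{observation is out-of-bag}) &= \left(1 - \frac{1}{N}\right)^{fN}
  \;\xrightarrow{\;N \to \infty\;}\; e^{-f} \\[4pt]
f = 1:   \quad & e^{-1}   \approx 0.368 \\
f = 0.8: \quad & e^{-0.8} \approx 0.449 \\
f = 0.5: \quad & e^{-0.5} \approx 0.607
\end{align*}
```

Under these assumptions, the expected out-of-bag fraction grows as FBoot shrinks, which is the trend to look for when comparing fractions measured across different FBoot values.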
You can use the following code as an example to determine the fraction of out-of-bag observations per tree.
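load imports-85;                 % loads the predictor matrix X
Y = X(:,1);                      % first column is the response
X = X(:,2:end);                  % remaining columns are the predictors
ntrees = 50;
for j = [0.5 0.8 1]              % values of FBoot to compare
    % 'oobvarimp','on' makes the bagger store out-of-bag information
    b = TreeBagger(ntrees,X,Y,'oobvarimp','on','FBoot',j);
    obs = size(b.X,1);           % number of observations
    % OOBIndices is obs-by-ntrees; true marks an out-of-bag observation
    num_oob_per_tree = sum(b.OOBIndices(:))/ntrees;
    fprintf(['\n\nFor ' num2str(ntrees) ' trees and FBoot = ' num2str(j) ':\n'])
    % no semicolon, so the average out-of-bag fraction is displayed
    frac_oob_observations = num_oob_per_tree/obs
end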

Release

R2009a