Hello,
I'm currently working on a classification problem with random forests, using Matlab's TreeBagger. To estimate the discriminative power of my features, I would like to visualize the prediction ratio for each class. So far I have used a train and a test set, and since each forest gives slightly different results due to its random nature, I build 100 forests and average the ratios.
However, on Breiman's site (<http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#ooberr>) it is stated:
"In random forests, there is no need for cross-validation or a separate test set to get an unbiased estimate of the test set error. It is estimated internally, during the run, as follows: Each tree is constructed using a different bootstrap sample from the original data. About one-third of the cases are left out of the bootstrap sample and not used in the construction of the kth tree."
I have seen papers using random forests for classification where the authors still use train/test sets and cross-validation. I am confused: with random forests, how should the classification ratio be computed? With the "classic" method (train/test sets and cross-validation), or with the out-of-bag (OOB) estimates (following what Breiman says)?
So I wanted to try the out-of-bag estimates. In the TreeBagger doc, I have seen that one can use the 'OOBPred' option and plot(oobError(b)) to visualize the classification error. My questions about this function are:
- How can I visualize the OOB error for EACH class, rather than only the overall error?
- As far as I understand, OOB estimation requires bagging ("About one-third of the cases are left out"). How does TreeBagger behave when I turn on the 'OOBPred' option while the 'FBoot' option is set to 1 (its default value)? Does FBoot=1 mean that there is no bagging? ("Fraction of input data to sample with replacement from the input data for growing each new tree")
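For reference, here is a minimal sketch of what I am trying to do for the first question: compute the OOB error separately for each class from the OOB predictions, rather than the aggregate error from oobError. I am assuming X (predictors) and Y (class labels as a cell array of strings) are already in the workspace; those variable names are just illustrative.

```matlab
% Grow a forest with OOB predictions enabled.
% 100 trees is an arbitrary choice for this sketch.
b = TreeBagger(100, X, Y, 'Method', 'classification', 'OOBPred', 'on');

% OOB-predicted label for each training case (each case is predicted
% only by the trees that did NOT see it in their bootstrap sample).
oobLabels = oobPredict(b);

% Misclassification rate restricted to each true class.
classes = b.ClassNames;
for k = 1:numel(classes)
    idx  = strcmp(Y, classes{k});                       % cases truly in class k
    errK = mean(~strcmp(oobLabels(idx), classes{k}));   % fraction misclassified
    fprintf('OOB error for class %s: %.3f\n', classes{k}, errK);
end
```

Is this per-class computation from oobPredict the right approach, or is there a built-in way to get it?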