On 8/15/2012 6:58 PM, Evan Ruzanski wrote:
> Hello,
>
> I'm building an ensemble of regression trees using "TreeBagger" followed
> by "predict" for time series forecasting using a set of predictor
> variable time series. I have two related questions:
>
> 1. Is it possible, and if so, what is the method, to extract the outputs
> from the individual ensemble members of the bag when using TreeBagger? I
> would like to see the performance of each member and ascertain the
> spread (and other statistics) of the ensemble, for both training and
> forecasting if possible.
Did you try typing 'help TreeBagger'? It gives you a lot of useful info.
The TreeBagger object has a Trees property, a cell array of classregtree
objects. These are the individual trees.
Alternatively, you can pass the 'trees' parameter to predict, oobPredict
and other methods. If you pass a scalar index, you get predictions from
that individual tree. Be careful though: if you pass 'trees'=5 (say) to
oobPredict, the predictions are only valid for observations that are
out-of-bag for tree number 5, that is, observations with true values in
OOBIndices(:,5). You can set DefaultYfit to NaN to detect in-bag
observations easily in the output from oobPredict.
Help for TreeBagger/predict tells you how to get the standard deviation
over the trees.
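
A rough sketch of the above on made-up data. The synthetic X and Y, the
variable names, and the tree count are only for illustration. 'OOBPred' is
the option spelling assumed here; newer releases document it as
'OOBPrediction', so check 'help TreeBagger' for your version.

```matlab
% Synthetic regression data, for illustration only.
rng(1);
X = randn(200, 4);
Y = X(:,1) + 0.5*X(:,2).^2 + 0.1*randn(200, 1);

% Grow the ensemble and keep the out-of-bag bookkeeping.
% ('OOBPred' is an assumption; newer releases call it 'OOBPrediction'.)
nTrees = 50;
B = TreeBagger(nTrees, X, Y, 'Method', 'regression', 'OOBPred', 'on');

% Predictions from each individual tree on new data.
Xnew = randn(20, 4);
perTree = zeros(size(Xnew, 1), nTrees);
for t = 1:nTrees
    perTree(:, t) = predict(B, Xnew, 'trees', t);
end

% Spread of the ensemble members for every new observation.
memberSpread = std(perTree, 0, 2);

% Ensemble prediction plus the standard deviation over the trees,
% as returned by predict itself for regression.
[yfit, stdevs] = predict(B, Xnew);

% Out-of-bag predictions from tree 5 only. With DefaultYfit set to NaN,
% rows that were in-bag for tree 5 should come back as NaN.
B.DefaultYfit = NaN;
yoob5 = oobPredict(B, 'trees', 5);
isOob5 = B.OOBIndices(:, 5);      % true where the row is out-of-bag for tree 5
mseOob5 = mean((yoob5(isOob5) - Y(isOob5)).^2);
```

Looping over 'trees' this way gives you one column per ensemble member, so
any statistic you want (quantiles, range, per-tree error) is a one-liner on
perTree.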
>
> 2. Is it possible, and if so, what is the method, to ascertain which
> predictors are being randomly sampled to create each tree?
>
> I see the "OOB..." parameter settings on the TreeBagger documentation
> page but I'm unclear how to potentially use these for my intended purposes.
>
> Many thanks!
>
Predictors are randomly sampled for every split, not once per tree.
After you get an individual tree from the Trees array, you can run the
varimportance method on it. It tells you the change in the split
criterion (MSE for regression) summed over all splits on a specific
predictor in this tree. If it is zero, the predictor has not been used.
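
A minimal sketch of that per-tree check, again on made-up data. The
fallback to predictorImportance is an assumption for newer releases, where
the Trees cell array may hold compact tree objects instead of classregtree
objects.

```matlab
% Synthetic regression data, for illustration only.
rng(1);
X = randn(200, 4);
Y = X(:,1) + 0.5*X(:,2).^2 + 0.1*randn(200, 1);

nTrees = 50;
B = TreeBagger(nTrees, X, Y, 'Method', 'regression');

% One row per tree, one column per predictor.
imp = zeros(nTrees, size(X, 2));
for t = 1:nTrees
    tree = B.Trees{t};
    if ismethod(tree, 'varimportance')
        % classregtree objects, as described in this thread.
        imp(t, :) = varimportance(tree);
    else
        % Assumed equivalent for compact tree objects in newer releases.
        imp(t, :) = predictorImportance(tree);
    end
end

% A predictor with zero importance was never split on in that tree.
usedInTree = imp > 0;

% How many trees ended up splitting on each predictor.
nTreesUsing = sum(usedInTree, 1);
```

Keep in mind that this shows which predictors each tree ended up using, not
which ones were drawn as candidates at each split.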
-Ilya