CompactTreeBagger class

Compact ensemble of decision trees grown by bootstrap aggregation

Description

CompactTreeBagger is a lightweight class that contains the trees grown using TreeBagger. It does not preserve any information about how TreeBagger grew the decision trees: it contains neither the input data used for growing the trees nor training parameters such as the minimal leaf size or the number of variables sampled at random for each decision split. You can use CompactTreeBagger only for predicting the response of the trained ensemble given new data X, and for other related functions.

CompactTreeBagger lets you save the trained ensemble to disk, or use it in any other way, while discarding the training data and those parameters of the training configuration that are irrelevant for predicting the response of the fully grown ensemble. This reduces storage and memory requirements, especially for ensembles trained on large data sets.
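
The workflow described above might look like the following minimal sketch. It assumes the fisheriris sample data set (variables meas and species) is available and that the compact method of TreeBagger is used to obtain the compact ensemble; the number of trees and the file name compactEnsemble.mat are hypothetical choices made only for illustration.

```matlab
% Sketch only: grow a small ensemble, compact it, predict, and save it.
load fisheriris                       % sample data: meas (150-by-4), species (150-by-1 cell)

b = TreeBagger(50, meas, species);    % full ensemble; keeps training data and parameters
c = compact(b);                       % compact ensemble; training data is discarded

labels = predict(c, meas(1:5, :));    % predicted class names for five observations

save('compactEnsemble.mat', 'c');     % hypothetical file name
```

Loading the saved file later with load restores c, which can then be passed to predict without access to the original training data.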

Construction

CompactTreeBagger    Create CompactTreeBagger object

Methods

combine           Combine two ensembles
error             Error (misclassification probability or MSE)
margin            Classification margin
mdsProx           Multidimensional scaling of proximity matrix
meanMargin        Mean classification margin
outlierMeasure    Outlier measure for data
predict           Predict response
proximity         Proximity matrix for data
setDefaultYfit    Set default value for predict
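
As an illustration of two of the methods listed above, the following sketch combines two compact ensembles and then computes the classification error of the result. It assumes the fisheriris sample data and the compact method of TreeBagger; the tree counts are arbitrary. Because the error is computed on the same data used for training, the values are optimistic and serve only to show the calls.

```matlab
% Sketch only: combine two compact ensembles and evaluate the result.
load fisheriris

c1 = compact(TreeBagger(20, meas, species));   % first ensemble, 20 trees
c2 = compact(TreeBagger(30, meas, species));   % second ensemble, 30 trees

c = combine(c1, c2);                 % trees of c2 appended to those of c1
nTrees = numel(c.Trees);             % total number of trees after combining

err = error(c, meas, species);       % misclassification error on the supplied data
```

Both ensembles here are grown on the same predictors and classes, which keeps the class and variable names consistent between them.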

Properties

ClassNames

The ClassNames property is a cell array containing the class names for the response variable Y supplied to TreeBagger. This property is empty for regression trees.

DeltaCritDecisionSplit

The DeltaCritDecisionSplit property is a numeric array of size 1-by-Nvars of changes in the split criterion summed over splits on each variable, averaged across the entire ensemble of grown trees.

See also TreeBagger.DeltaCritDecisionSplit, ClassificationTree.predictorImportance, RegressionTree.predictorImportance.

DefaultYfit

The DefaultYfit property controls what predicted value CompactTreeBagger returns when no prediction is possible, for example, when the predict method needs to predict for an observation that has only false values in the matrix supplied through the 'useifort' argument.

For classification, you can set this property to either '' or 'MostPopular'. If you choose 'MostPopular' (default), the property value becomes the name of the most probable class in the training data.

For regression, you can set this property to any numeric scalar. The default is the mean of the response for the training data.

See also predict, setDefaultYfit, TreeBagger.DefaultYfit.
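
The following sketch shows one way the default value could come into play. It assumes the fisheriris sample data and the 'useifort' argument name as given on this page (other releases may spell this argument differently); the logical mask useTree is a hypothetical variable that excludes every tree for the second observation, so no prediction is possible for it.

```matlab
% Sketch only: force the "no prediction possible" case for one observation.
load fisheriris
c = compact(TreeBagger(20, meas, species));

c = setDefaultYfit(c, '');              % return '' when no prediction is possible

useTree = true(3, numel(c.Trees));      % observations-by-trees logical mask
useTree(2, :) = false;                  % no tree may be used for observation 2

labels = predict(c, meas(1:3, :), 'useifort', useTree);
% labels{2} is expected to hold the default value set above.
```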

Method

The Method property is 'classification' for classification ensembles and 'regression' for regression ensembles.

NTrees

The NTrees property is a scalar equal to the number of decision trees in the ensemble.

NVarSplit

The NVarSplit property is a numeric array of size 1-by-Nvars, where each element gives the number of splits on the corresponding predictor, summed over all trees.

Trees

The Trees property is a cell array of size NTrees-by-1 containing the trees in the ensemble.
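
A short sketch of how the Trees property might be inspected, assuming the fisheriris sample data and an ensemble of 10 trees chosen arbitrarily. The class of the individual tree objects depends on the release, so the sketch only displays it.

```matlab
% Sketch only: access individual trees stored in the compact ensemble.
load fisheriris
c = compact(TreeBagger(10, meas, species));

nTrees = numel(c.Trees);        % one cell per tree in the ensemble
firstTree = c.Trees{1};         % a single decision tree object
disp(class(firstTree))          % tree class name varies by release
```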

VarAssoc

The VarAssoc property is a matrix of size Nvars-by-Nvars with predictive measures of variable association, averaged across the entire ensemble of grown trees. If you grew the ensemble setting 'surrogate' to 'on', this matrix for each tree is filled with predictive measures of association averaged over the surrogate splits. If you grew the ensemble setting 'surrogate' to 'off' (default), VarAssoc is diagonal.

See also ClassificationTree.surrogateAssociation, RegressionTree.surrogateAssociation.

VarNames

The VarNames property is a cell array containing the names of the predictor variables (features). These names are taken from the optional 'names' parameter supplied to TreeBagger. The default names are 'x1', 'x2', and so on.

Copy Semantics

Value. To learn how this affects your use of the class, see Comparing Handle and Value Classes in the MATLAB® Object-Oriented Programming documentation.
