imp = predictorImportance(tree) computes estimates of predictor importance for tree by summing changes in the mean squared error due to splits on every predictor and dividing the sum by the number of branch nodes.

imp is a row vector with one element for each predictor (column) in the data used to train tree.
predictorImportance computes estimates of predictor importance for tree by summing changes in the mean squared error (MSE) due to splits on every predictor and dividing the sum by the number of branch nodes. If the tree is grown without surrogate splits, this sum is taken over the best splits found at each branch node. If the tree is grown with surrogate splits, this sum is taken over all splits at each branch node, including surrogate splits. imp has one element for each input predictor in the data used to train this tree. At each node, MSE is estimated as the node error weighted by the node probability. The variable importance associated with a split is computed as the difference between the MSE for the parent node and the total MSE for the two children.
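The description above can be written compactly as follows. This is a sketch for a tree grown without surrogate splits only, because the text does not spell out how surrogate splits contribute to the sum. The symbols are notation introduced here, not taken from the documentation: R(t) is the weighted error of node t, P(t) is its node probability, t_L and t_R are its two children, B_j is the set of branch nodes whose best split uses predictor j, and N_branch is the total number of branch nodes.

```latex
\[
R(t) = P(t)\,\mathrm{MSE}(t), \qquad
\Delta R(t) = R(t) - R(t_L) - R(t_R)
\]
\[
\mathrm{imp}_j = \frac{1}{N_{\mathrm{branch}}} \sum_{t \in B_j} \Delta R(t)
\]
```

Under this reading, a predictor that is never chosen for a best split has an empty set B_j and therefore an importance of 0.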
Estimates of predictor importance do not depend on the order of predictors if you use surrogate splits, but do depend on the order if you do not use surrogate splits.
If you use surrogate splits, predictorImportance computes estimates before the tree is reduced by pruning or merging leaves. If you do not use surrogate splits, predictorImportance computes estimates after the tree is reduced by pruning or merging leaves. Therefore, reducing the tree by pruning affects the predictor importance for a tree grown without surrogate splits, and does not affect the predictor importance for a tree grown with surrogate splits.
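One way to observe the pruning behavior described above is to compare importance estimates before and after pruning for both kinds of tree. The following is a minimal sketch, not a definitive test: the pruning level of 2 and the variable names are arbitrary choices made for illustration, and the code assumes the default pruning sequence is available in tree.PruneList.

```matlab
load carsmall
X = [Acceleration Cylinders Displacement ...
    Horsepower Model_Year Weight];

tree  = fitrtree(X,MPG);                   % grown without surrogate splits
treeS = fitrtree(X,MPG,'Surrogate','on');  % grown with surrogate splits

% Illustrative pruning level (an assumption), capped at the deepest level
level  = min(2,max(tree.PruneList));
levelS = min(2,max(treeS.PruneList));

impFull    = predictorImportance(tree);
impPruned  = predictorImportance(prune(tree,'Level',level));
impSFull   = predictorImportance(treeS);
impSPruned = predictorImportance(prune(treeS,'Level',levelS));

% Largest change in any estimate caused by pruning, for each tree
changeNoSurrogates = max(abs(impFull - impPruned))
changeSurrogates   = max(abs(impSFull - impSPruned))
```

According to the description above, one would expect changeSurrogates to be zero and changeNoSurrogates to be nonzero whenever pruning removes at least one split.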
Find predictor importance for the carsmall data. Use just the numeric predictors:

    load carsmall
    X = [Acceleration Cylinders Displacement ...
        Horsepower Model_Year Weight];
    tree = fitrtree(X,MPG);
    imp = predictorImportance(tree)

    imp =

        0.0315         0    0.1082    0.0686    0.1629    1.2924

Weight (the last predictor) has the most impact on mileage (MPG). The second predictor has importance 0, which means that the number of cylinders has no impact on predictions made with tree.
Estimate the predictor importance for all variables in the carsmall data for a tree grown with surrogate splits:

    load carsmall
    X = [Acceleration Cylinders Displacement ...
        Horsepower Model_Year Weight];
    tree2 = fitrtree(X,MPG,...
        'Surrogate','on');
    imp2 = predictorImportance(tree2)

    imp2 =

        0.5287    1.1977    1.2400    0.7059    1.0677    1.4106

While weight (last predictor) still has the most impact on mileage (MPG), this estimate has the second predictor (number of cylinders) as the third most important predictor.
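To make the two rankings easier to compare, the estimates can be sorted and paired with predictor names. The sketch below hard-codes the values printed in the two examples so that it runs without refitting the trees. The names cell array is written out by hand to match the column order of X, and the variable names are illustrative.

```matlab
names = {'Acceleration','Cylinders','Displacement', ...
         'Horsepower','Model_Year','Weight'};

% Estimates copied from the two examples
imp  = [0.0315 0      0.1082 0.0686 0.1629 1.2924];  % no surrogate splits
imp2 = [0.5287 1.1977 1.2400 0.7059 1.0677 1.4106];  % surrogate splits

% Sort from most to least important
[~,order]  = sort(imp,'descend');
[~,order2] = sort(imp2,'descend');

% Side-by-side ranking of predictor names
ranking = table(names(order)',names(order2)', ...
    'VariableNames',{'NoSurrogates','Surrogates'});
disp(ranking)
```

Weight ranks first in both columns. Cylinders ranks last without surrogate splits and third with them.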