`predictorImportance` computes importance measures of the predictors in a tree by
summing changes in the node risk due to splits on every predictor, and then dividing the sum
by the total number of branch nodes. The change in the node risk is the difference between
the risk for the parent node and the total risk for the two children. For example, if a tree
splits a parent node (for example, node 1) into two child nodes (for example, nodes 2 and
3), then `predictorImportance` increases the importance of the split predictor by

(*R*_{1} – *R*_{2} – *R*_{3})/*N*_{branch},

where *R*_{i} is the node risk of
node *i*, and *N*_{branch} is the
total number of branch nodes. A *node risk* is defined as a node error
or node impurity weighted by the node probability:

*R*_{i} = *P*_{i}*E*_{i},

where *P*_{i} is the node
probability of node *i*, and *E*_{i}
is either the node error (for a tree grown by minimizing the twoing criterion) or node
impurity (for a tree grown by minimizing an impurity criterion, such as the Gini index or
deviance) of node *i*.
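
As a rough illustration of the formula above, the following plain-Python sketch accumulates (*R*_{parent} – *R*_{left} – *R*_{right})/*N*_{branch} for the split predictor of each branch node. It is not the implementation behind `predictorImportance`: the dictionary layout of `TREE`, the helper names, and all numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Hypothetical five-node tree. Each node stores its node probability P_i,
# its node error or impurity E_i, the index of the predictor it splits on,
# and its two children (None for a leaf).
TREE = {
    1: {"prob": 1.0, "err": 0.5, "predictor": 0, "children": (2, 3)},
    2: {"prob": 0.4, "err": 0.0, "predictor": None, "children": None},
    3: {"prob": 0.6, "err": 0.4, "predictor": 1, "children": (4, 5)},
    4: {"prob": 0.3, "err": 0.0, "predictor": None, "children": None},
    5: {"prob": 0.3, "err": 0.2, "predictor": None, "children": None},
}


def node_risk(node):
    """Node risk R_i = P_i * E_i."""
    return node["prob"] * node["err"]


def importance_from_best_splits(tree, n_predictors):
    """Sum the risk reductions per split predictor, divided by N_branch."""
    branch_ids = [i for i, n in tree.items() if n["children"] is not None]
    importance = [0.0] * n_predictors
    for i in branch_ids:
        parent = tree[i]
        left, right = parent["children"]
        # Change in node risk: parent risk minus the total risk of the children.
        delta = node_risk(parent) - node_risk(tree[left]) - node_risk(tree[right])
        importance[parent["predictor"]] += delta / len(branch_ids)
    return importance


if __name__ == "__main__":
    # Three predictors; the third is never used for a split.
    print(importance_from_best_splits(TREE, 3))
```

In this toy tree there are two branch nodes (nodes 1 and 3). Predictor 0 receives (0.5 – 0 – 0.24)/2 and predictor 1 receives (0.24 – 0 – 0.06)/2, while a predictor that is never used for a split keeps an importance of 0.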

The estimates of predictor importance depend on whether you use surrogate splits for training.

If you use surrogate splits, `predictorImportance` sums the changes
in the node risk over all splits at each branch node, including surrogate
splits. If you do not use surrogate splits, then the function takes the sum over
the best splits found at each branch node.

Estimates of predictor importance do not depend on the order of predictors if
you use surrogate splits, but do depend on the order if you do not use surrogate
splits.

If you use surrogate splits, `predictorImportance` computes
estimates before the tree is reduced by pruning (or
merging leaves). If you do not use surrogate splits,
`predictorImportance` computes
estimates after the tree is reduced by pruning.
Therefore, pruning affects the predictor importance
for a tree grown without surrogate splits, and does
not affect the predictor importance for a tree grown
with surrogate splits.
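
The effect of pruning in the no-surrogate case can be sketched in plain Python. This is only an illustration under stated assumptions, not the code behind `predictorImportance`: the tree layout, the `collapse` helper, and the numbers are hypothetical, and the sketch assumes that the number of branch nodes is counted on the pruned tree.

```python
# Hypothetical tree: node probability, node error or impurity, split
# predictor index, and children (None for a leaf).
TREE = {
    1: {"prob": 1.0, "err": 0.5, "predictor": 0, "children": (2, 3)},
    2: {"prob": 0.4, "err": 0.0, "predictor": None, "children": None},
    3: {"prob": 0.6, "err": 0.4, "predictor": 1, "children": (4, 5)},
    4: {"prob": 0.3, "err": 0.0, "predictor": None, "children": None},
    5: {"prob": 0.3, "err": 0.2, "predictor": None, "children": None},
}


def importance(tree, n_predictors):
    """Per-predictor sum of risk reductions over branch nodes, divided by N_branch."""
    risk = {i: n["prob"] * n["err"] for i, n in tree.items()}
    branch_ids = [i for i, n in tree.items() if n["children"] is not None]
    result = [0.0] * n_predictors
    for i in branch_ids:
        left, right = tree[i]["children"]
        delta = risk[i] - risk[left] - risk[right]
        result[tree[i]["predictor"]] += delta / len(branch_ids)
    return result


def collapse(tree, node_id):
    """Return a copy of the tree in which node_id is a leaf and its descendants are gone."""
    pruned = {i: dict(n) for i, n in tree.items()}
    stack = list(pruned[node_id]["children"] or ())
    while stack:
        i = stack.pop()
        stack.extend(pruned[i]["children"] or ())
        del pruned[i]
    pruned[node_id]["children"] = None
    pruned[node_id]["predictor"] = None
    return pruned


if __name__ == "__main__":
    print("full tree:  ", importance(TREE, 2))
    print("pruned tree:", importance(collapse(TREE, 3), 2))
```

Collapsing node 3 into a leaf removes the only split on predictor 1, so its importance drops to 0, and the remaining split on predictor 0 is now divided by one branch node instead of two. Under the assumptions above, this is the sense in which pruning changes the estimates for a tree grown without surrogate splits.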