# predictorImportance

Class: CompactRegressionTree

Estimates of predictor importance

## Syntax

`imp = predictorImportance(tree)`

## Description

`imp = predictorImportance(tree)` computes estimates of predictor importance for `tree` by summing changes in the mean squared error due to splits on every predictor and dividing the sum by the number of branch nodes.

## Input Arguments

 `tree` A regression tree created by `fitrtree`, or by the `compact` method.

## Output Arguments

 `imp` A row vector with the same number of elements as the number of predictors (columns) in `tree.X`. The entries are the estimates of predictor importance, with `0` representing the smallest possible importance.

## Definitions

### Predictor Importance

`predictorImportance` computes estimates of predictor importance for `tree` by summing changes in the mean squared error (MSE) due to splits on every predictor and dividing the sum by the number of branch nodes. If the tree is grown without surrogate splits, this sum is taken over the best splits found at each branch node. If the tree is grown with surrogate splits, this sum is taken over all splits at each branch node, including surrogate splits. `imp` has one element for each input predictor in the data used to train the tree.

At each node, MSE is estimated as the node error weighted by the node probability. The importance associated with a split is the difference between the MSE for the parent node and the total MSE for the two children.
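The definition above can be written compactly as follows. The notation is introduced here for illustration only and is not part of the function's interface: $t$ is a branch node with children $t_L$ and $t_R$, $P(t)$ is the node probability, $\mathrm{MSE}(t)$ is the node error, $S_j$ is the set of splits on predictor $j$ that enter the sum (best splits only, or best and surrogate splits, as described above), and $N_{\text{branch}}$ is the number of branch nodes.

$$
R(t) = P(t)\,\mathrm{MSE}(t), \qquad
\Delta R(t) = R(t) - R(t_L) - R(t_R), \qquad
\mathrm{imp}_j = \frac{1}{N_{\text{branch}}} \sum_{t \in S_j} \Delta R(t)
$$

A predictor that is never used in a split has an empty $S_j$, so its importance is `0`.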

Estimates of predictor importance do not depend on the order of predictors if you use surrogate splits, but do depend on the order if you do not use surrogate splits.

If you use surrogate splits, `predictorImportance` computes estimates before the tree is reduced by pruning or merging leaves. If you do not use surrogate splits, `predictorImportance` computes estimates after the tree is reduced by pruning or merging leaves. Therefore, reducing the tree by pruning affects the predictor importance for a tree grown without surrogate splits, and does not affect the predictor importance for a tree grown with surrogate splits.
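The effect of pruning described above can be checked with a short sketch along these lines. It assumes the `carsmall` data used in the examples on this page and uses `prune` with `'Level',1` as one arbitrary way to reduce the tree; the variable names are illustrative, and no particular numeric output is claimed.

```matlab
load carsmall
X = [Acceleration Cylinders Displacement ...
    Horsepower Model_Year Weight];

% Tree grown without surrogate splits: importance is computed on the
% reduced tree, so pruning is expected to change the estimates.
tree = fitrtree(X,MPG);
impFull   = predictorImportance(tree);
impPruned = predictorImportance(prune(tree,'Level',1));

% Tree grown with surrogate splits: importance is computed before the
% tree is reduced, so pruning is expected to leave the estimates unchanged.
tree2 = fitrtree(X,MPG,'Surrogate','on');
imp2Full   = predictorImportance(tree2);
imp2Pruned = predictorImportance(prune(tree2,'Level',1));

% Largest absolute change in importance caused by pruning, per tree
disp(max(abs(impFull  - impPruned)))
disp(max(abs(imp2Full - imp2Pruned)))
```

If the description above holds, the second displayed value should be zero, while the first is generally nonzero.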

## Examples

Find predictor importance for the `carsmall` data. Use just the numeric predictors:

```
load carsmall
X = [Acceleration Cylinders Displacement ...
    Horsepower Model_Year Weight];
tree = fitrtree(X,MPG);
imp = predictorImportance(tree)
```

```
imp =

    0.0315         0    0.1082    0.0686    0.1629    1.2924
```

The weight (last predictor) has the most impact on mileage (MPG). The second predictor has importance 0; this means the number of cylinders has no impact on predictions made with `tree`.

Estimate the predictor importance for all variables in the `carsmall` data for a tree grown with surrogate splits:

```
load carsmall
X = [Acceleration Cylinders Displacement ...
    Horsepower Model_Year Weight];
tree2 = fitrtree(X,MPG,...
    'Surrogate','on');
imp2 = predictorImportance(tree2)
```

```
imp2 =

    0.5287    1.1977    1.2400    0.7059    1.0677    1.4106
```

While weight (last predictor) still has the most impact on mileage (MPG), this estimate has the second predictor (number of cylinders) as the third most important predictor.