# meanSurrVarAssoc

Mean predictive measure of association for surrogate splits in decision tree

## Syntax

ma = meanSurrVarAssoc(tree)
ma = meanSurrVarAssoc(tree,N)

## Description

ma = meanSurrVarAssoc(tree) returns a matrix of predictive measures of association for the predictors in tree.

ma = meanSurrVarAssoc(tree,N) returns a matrix of predictive measures of association averaged over the nodes in vector N.

## Input Arguments

| Argument | Description |
|---|---|
| tree | A classification tree constructed with fitctree, or a compact classification tree constructed with compact. |
| N | Vector of node numbers in tree. |

## Output Arguments

| Argument | Description |
|---|---|
| ma | ma = meanSurrVarAssoc(tree) returns a P-by-P matrix, where P is the number of predictors in tree. ma(i,j) is the predictive measure of association between the optimal split on variable i and a surrogate split on variable j. See Predictive Measure of Association. ma = meanSurrVarAssoc(tree,N) returns a P-by-P matrix of predictive measures of association between variables, averaged over the nodes in the vector N. N contains node numbers from 1 to max(tree.NumNodes). |

## Definitions

### Predictive Measure of Association

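The predictive measure of association between the optimal split on variable i and a surrogate split on variable j is:

$$\lambda_{i,j} = \frac{\min(P_L, P_R) - \left(1 - P_{L_i L_j} - P_{R_i R_j}\right)}{\min(P_L, P_R)}$$

Here

• $P_L$ and $P_R$ are the node probabilities for the optimal split of node i into Left and Right nodes respectively.

• $P_{L_i L_j}$ is the probability that both the optimal split on variable i and the surrogate split on variable j send an observation to the Left.

• $P_{R_i R_j}$ is the probability that both the optimal split on variable i and the surrogate split on variable j send an observation to the Right.

The value of $\lambda_{i,j}$ ranges from $-\infty$ to 1. It equals 1 when the two splits agree on every observation, so that $P_{L_i L_j} + P_{R_i R_j} = 1$, and it has no lower bound because $\min(P_L, P_R)$ can be arbitrarily small. Variable j is a worthwhile surrogate split for variable i if $\lambda_{i,j} > 0$.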
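As a quick numeric check of this definition, the following sketch evaluates the measure for made-up probabilities. It assumes the measure has the form (min(PL,PR) - (1 - PLL - PRR)) / min(PL,PR), where PLL and PRR are the probabilities that both splits send an observation Left or Right. The helper `lambdaFn` and all numbers are hypothetical and chosen only for illustration.

```matlab
% Hypothetical node probabilities for the optimal split (they sum to 1).
PL = 0.4;
PR = 0.6;

% Assumed form of the predictive measure of association.
% PLL, PRR: probabilities that both splits send an observation Left / Right.
lambdaFn = @(PLL, PRR) (min(PL, PR) - (1 - PLL - PRR)) / min(PL, PR);

lambdaGood = lambdaFn(0.35, 0.55);  % splits agree on 90% of observations
lambdaPoor = lambdaFn(0.20, 0.30);  % splits agree on only 50% of observations

fprintf('good surrogate: %.2f\n', lambdaGood);
fprintf('poor surrogate: %.2f\n', lambdaPoor);
```

With these inputs the first value is 0.75 and the second is -0.25. The negative value means the surrogate disagrees with the optimal split more often than a rule that simply sends every observation to the more probable child node.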

Element ma(i,j) is the predictive measure of association averaged over surrogate splits on predictor j for which predictor i is the optimal split predictor. This average is computed by summing positive values of the predictive measure of association over optimal splits on predictor i and surrogate splits on predictor j and dividing by the total number of optimal splits on predictor i, including splits for which the predictive measure of association between predictors i and j is negative.
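The averaging rule can be sketched with plain arithmetic. The vector `lambda` below is hypothetical: it stands for the association values between predictors i and j, one entry per optimal split on predictor i. It is not output from any real tree.

```matlab
% Hypothetical association values between predictor i (optimal split)
% and predictor j (surrogate split), one entry per optimal split on i.
lambda = [0.50, -0.20, 0.30, 0];

% Sum only the positive values ...
positiveSum = sum(lambda(lambda > 0));

% ... but divide by the total number of optimal splits on predictor i,
% including those whose association with predictor j is not positive.
numOptimalSplits = numel(lambda);
ma_ij = positiveSum / numOptimalSplits;

fprintf('ma(i,j) = %.2f\n', ma_ij);
```

Here the result is 0.8/4 = 0.20, whereas a plain mean of the same values would be 0.15. Negative associations therefore do not reduce the sum, but they still count in the denominator.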

## Examples

Find the mean predictive measure of association between the variables in the Fisher iris data:

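```
% Load Fisher's iris data: meas (measurements) and species (class labels).
load fisheriris
% Grow a classification tree with surrogate splits enabled.
obj = fitctree(meas,species,'surrogate','on');
% Mean predictive measure of association over all nodes.
msva = meanSurrVarAssoc(obj)

msva =
    1.0000         0         0         0
         0    1.0000         0         0
    0.4633    0.2500    1.0000    0.5000
    0.2065    0.1413    0.4022    1.0000
```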

Find the mean predictive measure of association averaged over the odd-numbered nodes in obj:

```
% Odd-numbered nodes: 1, 3, 5, ...
N = 1:2:obj.NumNodes;
msva = meanSurrVarAssoc(obj,N)

msva =
    1.0000         0         0         0
         0    1.0000         0         0
    0.7600    0.5000    1.0000    1.0000
    0.4130    0.2826    0.8043    1.0000
```
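
One possible way to read such a matrix is to look, for each predictor, at the off-diagonal entry with the largest value. The sketch below copies the all-nodes matrix from the first example as a literal so that it runs without the data or the tree. The variable names `offDiag`, `bestAssoc`, and `bestSurrogate` are illustrative and not part of any API.

```matlab
% Matrix copied from the first example's output (all nodes).
msva = [1.0000 0      0      0
        0      1.0000 0      0
        0.4633 0.2500 1.0000 0.5000
        0.2065 0.1413 0.4022 1.0000];

% Zero the diagonal so that a predictor is not matched with itself.
offDiag = msva - diag(diag(msva));

% For each row i, find the predictor j with the largest mean association.
[bestAssoc, bestSurrogate] = max(offDiag, [], 2);

for i = 1:size(msva, 1)
    if bestAssoc(i) > 0
        fprintf('Predictor %d: best surrogate is predictor %d (%.4f)\n', ...
                i, bestSurrogate(i), bestAssoc(i));
    else
        fprintf('Predictor %d: no surrogate with positive association\n', i);
    end
end
```

For this matrix, predictor 3 is best matched by predictor 4 (0.5000) and predictor 4 by predictor 3 (0.4022), while rows 1 and 2 contain no positive off-diagonal entries.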