meanSurrVarAssoc

Class: CompactClassificationTree

Mean predictive measure of association for surrogate splits in decision tree

Syntax

ma = meanSurrVarAssoc(tree)
ma = meanSurrVarAssoc(tree,N)

Description

ma = meanSurrVarAssoc(tree) returns a matrix of predictive measures of association for the predictors in tree.

ma = meanSurrVarAssoc(tree,N) returns a matrix of predictive measures of association averaged over the nodes in vector N.

Input Arguments

tree

A classification tree constructed with fitctree, or a compact classification tree constructed with compact.

N

Vector of node numbers in tree.

Output Arguments

ma

  • ma = meanSurrVarAssoc(tree) returns a P-by-P matrix, where P is the number of predictors in tree. ma(i,j) is the predictive measure of association between the optimal split on variable i and a surrogate split on variable j. See Predictive Measure of Association.

  • ma = meanSurrVarAssoc(tree,N) returns a P-by-P matrix of predictive measures of association between variables, averaged over the nodes in the vector N. N contains node numbers from 1 to tree.NumNodes.

Definitions

Predictive Measure of Association

The predictive measure of association between the optimal split on variable i and a surrogate split on variable j is:

$$\lambda_{i,j} = \frac{\min(P_L, P_R) - \left(1 - P_{L_i L_j} - P_{R_i R_j}\right)}{\min(P_L, P_R)}$$

Here

  • $P_L$ and $P_R$ are the node probabilities for the optimal split on variable i into Left and Right nodes respectively.

  • $P_{L_i L_j}$ is the probability that both the optimal split on variable i and the surrogate split on variable j send an observation to the Left.

  • $P_{R_i R_j}$ is the probability that both the optimal split on variable i and the surrogate split on variable j send an observation to the Right.

The value of $\lambda_{i,j}$ ranges from –∞ to 1. Variable j is a worthwhile surrogate split for variable i if $\lambda_{i,j} > 0$.
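As a quick numeric check of this definition, the following minimal sketch evaluates the measure for a single split. The probability values are made up for illustration, and the variable names (PL, PR, PLL, PRR, baseline, disagree) are hypothetical; they are not properties of the tree object. The sketch assumes the measure has the form (min(PL,PR) - (1 - PLL - PRR)) / min(PL,PR).

```matlab
% Hypothetical probabilities for one node (illustrative values only)
PL  = 0.6;   % optimal split sends an observation to the Left
PR  = 0.4;   % optimal split sends an observation to the Right
PLL = 0.5;   % optimal and surrogate splits both send it to the Left
PRR = 0.3;   % optimal and surrogate splits both send it to the Right

% Baseline: error of the trivial rule that always picks the larger side
baseline = min(PL, PR);

% Probability that the surrogate split disagrees with the optimal split
disagree = 1 - PLL - PRR;

% Predictive measure of association (assumed form, see lead-in)
lambda = (baseline - disagree) / baseline;
fprintf('lambda = %.4f\n', lambda);
```

With these values the surrogate disagrees with the optimal split 20% of the time, against a 40% baseline, so lambda is 0.5 and the surrogate would count as worthwhile.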

Element ma(i,j) is the predictive measure of association averaged over surrogate splits on predictor j for which predictor i is the optimal split predictor. This average is computed by summing positive values of the predictive measure of association over optimal splits on predictor i and surrogate splits on predictor j and dividing by the total number of optimal splits on predictor i, including splits for which the predictive measure of association between predictors i and j is negative.
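The averaging rule described above can be sketched as follows. This is an illustration only, not the function's internal code: the vector of per-node values is invented, and the names lambda, positiveSum, numOptimalSplits, and ma_ij are hypothetical.

```matlab
% Hypothetical measures of association at four nodes where predictor i is
% the optimal split and predictor j is evaluated as a surrogate
lambda = [0.6, -0.2, 0.3, 0.1];

% Sum only the positive values ...
positiveSum = sum(lambda(lambda > 0));

% ... but divide by ALL optimal splits on predictor i, including the one
% whose measure of association is negative
numOptimalSplits = numel(lambda);
ma_ij = positiveSum / numOptimalSplits;

fprintf('ma(i,j) = %.4f (plain mean would be %.4f)\n', ma_ij, mean(lambda));
```

Here the negative value contributes nothing to the numerator but still counts in the denominator, so the result (0.25) differs from the plain mean of the four values (0.20).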

Examples

Find the mean predictive measure of association between the variables in the Fisher iris data:

load fisheriris
obj = fitctree(meas,species,'surrogate','on');
msva = meanSurrVarAssoc(obj)

msva =
    1.0000         0         0         0
         0    1.0000         0         0
    0.4633    0.2500    1.0000    0.5000
    0.2065    0.1413    0.4022    1.0000

Find the mean predictive measure of association averaged over the odd-numbered nodes in obj:

N = 1:2:obj.NumNodes;
msva = meanSurrVarAssoc(obj,N)

msva =
    1.0000         0         0         0
         0    1.0000         0         0
    0.7600    0.5000    1.0000    1.0000
    0.4130    0.2826    0.8043    1.0000
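One way to read such a matrix is to look, for each predictor, for the other predictor with the largest mean association. The sketch below does this for the matrix printed in the first example; the values are copied from that output, and the variable names (offDiag, bestAssoc, bestSurrogate) are illustrative rather than part of the function's interface.

```matlab
% Matrix copied from the first example's printed output
msva = [1.0000      0      0      0
             0 1.0000      0      0
        0.4633 0.2500 1.0000 0.5000
        0.2065 0.1413 0.4022 1.0000];

% Ignore the diagonal (each predictor's association with itself)
offDiag = msva;
offDiag(logical(eye(size(msva)))) = 0;

% For each row i, find the column j with the largest mean association
[bestAssoc, bestSurrogate] = max(offDiag, [], 2);

% Rows with no positive association have no useful surrogate
bestSurrogate(bestAssoc == 0) = NaN;

disp([bestSurrogate, bestAssoc]);
```

In this matrix, rows 1 and 2 have no positive off-diagonal entries, row 3 has its largest entry in column 4, and row 4 has its largest entry in column 3.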
