Construct classification and regression trees
t = classregtree(X,y)
t = classregtree(X,y,param1,val1,param2,val2)
t = classregtree(X,y) creates a decision tree t for predicting the response y as a function of the predictors in the columns of X. X is an n-by-m matrix of predictor values. If y is a vector of n response values, classregtree performs regression. If y is a categorical variable, character array, or cell array of strings, classregtree performs classification. Either way, t is a binary tree where each branching node is split based on the values of a column of X. NaN values in X or y are taken to be missing values. Observations with all missing values for X or missing values for y are not used in the fit. Observations with some missing values for X are used to find splits on variables for which these observations have valid values.
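The missing-value rule above can be illustrated with a small sketch in plain MATLAB. The data and the variable names (allXMissing, yMissing, usedInFit) are hypothetical, and the code only mirrors the rule as stated in the text; it is not taken from the classregtree implementation.

```matlab
% Hypothetical toy data: 5 observations, 2 predictors.
X = [1.0 2.0;
     NaN 3.0;    % one predictor missing: row can still be used
     NaN NaN;    % all predictors missing: row is not used
     4.0 5.0;
     6.0 NaN];   % one predictor missing: row can still be used
y = [10; 20; 30; NaN; 50];   % the fourth response is missing

% Rows excluded by the stated rule: all of X missing, or y missing.
allXMissing = all(isnan(X), 2);
yMissing    = isnan(y);
usedInFit   = ~(allXMissing | yMissing);

% Rows that would contribute to the fit under this rule.
disp(find(usedInFit)')
```

Rows 2 and 5 stay in the fit even though each has one NaN predictor; according to the description, they are used only for splits on the variables where they have valid values.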
t = classregtree(X,y,param1,val1,param2,val2) specifies optional parameter name/value pairs, as follows.
For all trees:
'categorical' — Vector of indices of the columns of X that are to be treated as unordered categorical variables.
'method' — Either 'classification' (default if y is text or a categorical variable) or 'regression' (default if y is numeric).
'names' — A cell array of names for the predictor variables, in the order in which they appear in the X from which the tree was created.
'prune' — 'on' (default) to compute the full tree and the optimal sequence of pruned subtrees, or 'off' for the full tree without pruning.
'minparent' — A number k such that impure nodes must have k or more observations to be split (default is 10).
'minleaf' — The minimum number of observations per tree leaf (default is 1). If you supply both 'minparent' and 'minleaf', classregtree uses the setting that results in larger leaves: minparent = max(minparent,2*minleaf).
'nvartosample' — Number of predictor variables randomly selected for each split. By default all variables are considered for each decision split.
'mergeleaves' — 'on' (default) to merge leaves that originate from the same parent node and whose summed risk values are greater than or equal to the risk associated with the parent node. If 'off', classregtree does not merge leaves.
'weights' — Vector of observation weights. By default the weight of every observation is 1. The length of this vector must be equal to the number of rows in X.
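As a rough illustration of how 'minparent' and 'minleaf' interact, the sketch below applies the rule minparent = max(minparent,2*minleaf) to made-up settings. The values and the name effectiveMinparent are assumptions chosen for the example.

```matlab
% Hypothetical settings, chosen only to show the rule in action.
minparent = 10;   % requested minimum size of a node that may be split
minleaf   = 8;    % requested minimum size of each leaf

% Rule from the text: the setting that gives larger leaves wins.
effectiveMinparent = max(minparent, 2*minleaf);

fprintf('effective minparent = %d\n', effectiveMinparent);
```

The factor of 2 follows from the tree being binary: a node with fewer than 2*minleaf observations cannot be divided into two children that each hold at least minleaf observations. With the values above, a node therefore needs 16 observations before it can be split, not 10.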
For regression trees only:
'qetoler' — Tolerance on the quadratic error per node. Node splitting stops when the quadratic error per node drops below qetoler*qed, where qed is the quadratic error for the entire data set, computed before the decision tree is grown: qed = norm(y-ybar), with ybar estimated as the average of the response vector y. The default value is 1e-6.
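The stopping threshold described for 'qetoler' can be worked through by hand. The sketch below uses a hypothetical response vector and computes only the quantities named in the text (ybar, qed, and the product qetoler*qed); how the per-node quadratic error is evaluated inside classregtree is not shown here.

```matlab
% Hypothetical responses for a small regression problem.
y = [1; 2; 3; 4; 10];
qetoler = 1e-6;              % default value given in the text

ybar = mean(y);              % average of the responses
qed  = norm(y - ybar);       % quadratic error for the entire data
threshold = qetoler * qed;   % splitting stops below this value

fprintf('ybar = %g, qed = %.4f, threshold = %.4e\n', ybar, qed, threshold);
```

Here ybar is 4 and the squared deviations sum to 50, so qed equals sqrt(50). Because the threshold scales with qed, the same relative tolerance applies whether the responses are measured in small or large units.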
For classification trees only:
'cost' — Square matrix C, where C(i,j) is the cost of classifying a point into class j if its true class is i (default has C(i,j)=1 if i~=j, and C(i,j)=0 if i=j). Alternatively, this value can be a structure S having two fields: S.group containing the group names as a categorical variable, character array, or cell array of strings; and S.cost containing the cost matrix C.
'splitcriterion' — Criterion for choosing a split. One of 'gdi' (default) for Gini's diversity index, 'twoing' for the twoing rule, or 'deviance' for maximum deviance reduction.
'priorprob' — Prior probabilities for each class, specified as a vector (one value for each distinct group name) or as a structure S with two fields: S.group containing the group names as a categorical variable, character array, or cell array of strings; and S.prob containing a vector of corresponding probabilities.
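The classification-only options can be made concrete with a short sketch in plain MATLAB. It builds the default cost matrix described under 'cost', evaluates the textbook form of Gini's diversity index (one minus the sum of squared class proportions) for a node with three equally represented classes, and fills in a structure of the shape described under 'priorprob'. The prior values are hypothetical, and the index is computed from its standard definition, which may differ in detail from what classregtree does internally (for example, when weights or priors are in effect).

```matlab
K = 3;                          % number of classes (e.g. the iris species)

% Default cost: 1 for any misclassification, 0 on the diagonal.
C = ones(K) - eye(K);

% Gini's diversity index for a node with 50 observations per class.
counts = [50 50 50];
p   = counts / sum(counts);     % class proportions in the node
gdi = 1 - sum(p.^2);            % 0 for a pure node, larger when mixed

% Structure form of 'priorprob'; the probabilities are made up.
S.group = {'setosa', 'versicolor', 'virginica'};
S.prob  = [0.2 0.3 0.5];

disp(C)
fprintf('gdi = %.4f\n', gdi);
```

A structure such as S could then be passed as the value of 'priorprob' in a call like classregtree(X,y,'priorprob',S), following the field layout given in the parameter description.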
Create a classification tree for Fisher's iris data:
load fisheriris;
t = classregtree(meas,species,...
'names',{'SL' 'SW' 'PL' 'PW'})
t =
Decision tree for classification
1 if PL<2.45 then node 2 else node 3
2 class = setosa
3 if PW<1.75 then node 4 else node 5
4 if PL<4.95 then node 6 else node 7
5 class = virginica
6 if PW<1.65 then node 8 else node 9
7 class = virginica
8 class = versicolor
9 class = virginica
view(t)
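To show how the printed rules are read, the sketch below walks one hypothetical flower through nodes 1 to 9 by hand, using only the thresholds listed in the output above. The measurement values are made up for illustration; on a fitted tree, the class's eval method (for example eval(t,meas)) is the usual way to obtain predictions, assuming a toolbox version that provides it.

```matlab
% Hypothetical measurement: petal length and petal width in cm.
PL = 4.7;
PW = 1.4;

% Follow the printed tree from the root (node 1) down to a leaf.
if PL < 2.45
    label = 'setosa';            % node 2
elseif PW >= 1.75
    label = 'virginica';         % node 5
elseif PL >= 4.95
    label = 'virginica';         % node 7
elseif PW < 1.65
    label = 'versicolor';        % node 8
else
    label = 'virginica';         % node 9
end

disp(label)
```

Only PL and PW appear in the printed rules, so the sepal measurements (SL and SW) play no part in the prediction for this particular tree.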

[1] Breiman, L., J. Friedman, R. Olshen, and C. Stone. Classification and Regression Trees. Boca Raton, FL: CRC Press, 1984.
See also: Regression and Classification by Bagging Decision Trees
© 1984-2009 The MathWorks, Inc.