RegressionTree.fit will be removed in a future
release. Use fitrtree instead.

Syntax

tree = RegressionTree.fit(x,y)
tree = RegressionTree.fit(x,y,Name,Value)

Description

tree = RegressionTree.fit(x,y) returns
a regression tree based on the input variables (also known as predictors,
features, or attributes) x and output (response) y. tree is
a binary tree where each branching node is split based on the values
of a column of x.

tree = RegressionTree.fit(x,y,Name,Value) fits
a tree with additional options specified by one or more Name,Value pair
arguments. You can specify several name-value pair arguments in any
order as Name1,Value1,…,NameN,ValueN.

Note that using the 'CrossVal', 'KFold', 'Holdout', 'Leaveout',
or 'CVPartition' options results in a tree of class RegressionPartitionedModel.
You cannot use a partitioned tree for prediction, so this kind of
tree does not have a predict method.

Otherwise, tree is of class RegressionTree, and
you can use the predict method to make predictions.
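As a minimal sketch of the basic syntax, the following fits a tree to synthetic data and predicts responses for a few rows. The data and variable names are illustrative assumptions, and the code assumes a release in which RegressionTree.fit is still available.

```matlab
% Synthetic data (illustrative only): 200 observations, 3 predictors.
rng(1);                                   % fix the seed for reproducibility
x = rand(200,3);
y = 2*x(:,1) - x(:,2).^2 + 0.1*randn(200,1);

tree = RegressionTree.fit(x,y);           % no cross-validation options given
yhat = predict(tree,x(1:5,:));            % predicted responses for 5 rows
disp(class(tree))                         % class of the returned object
```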

Predictor values, specified as a matrix of scalar values. Each
column of x represents one variable, and each
row represents one observation.

RegressionTree.fit considers NaN values
in x as missing values. RegressionTree.fit does
not use observations with all missing values for x in the
fit. RegressionTree.fit uses observations with some
missing values for x to find splits on variables
for which these observations have valid values.

Response values, specified as a vector of scalar values with
the same number of rows as x. Each entry in y is
the response to the data in the corresponding row of x.

RegressionTree.fit considers NaN values
in y to be missing values. RegressionTree.fit does
not use observations with missing values for y in
the fit.
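The missing-value rules for x and y can be sketched as follows. The data are synthetic and the choice of rows is arbitrary; the comments restate the behavior described above rather than anything verified by the code itself.

```matlab
% Synthetic data (illustrative only).
rng(1);
x = rand(50,2);
y = x(:,1) + 0.05*randn(50,1);

x(3,:) = NaN;    % all predictors missing: row 3 is not used in the fit
x(7,2) = NaN;    % partly missing: row 7 can still inform splits on column 1
y(10)  = NaN;    % missing response: row 10 is not used in the fit

tree = RegressionTree.fit(x,y);
```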

Data Types: single | double

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments.
Name is the argument
name and Value is the corresponding
value. Name must appear
inside single quotes (' ').
You can specify several name and value pair
arguments in any order as Name1,Value1,...,NameN,ValueN.

Categorical predictors list, specified as the comma-separated
pair consisting of 'CategoricalPredictors' and
one of the following.

A numeric vector with indices from 1 to p,
where p is the number of columns of x.

A logical vector of length p, where
a true entry means that the corresponding column
of x is a categorical variable.

A cell array of strings, where each element in the
array is the name of a predictor variable. The names must match entries
in the PredictorNames property.

A character matrix, where each row of the matrix is
a name of a predictor variable. Pad the names with extra blanks so
each row of the character matrix has the same length.

'all', meaning all predictors are
categorical.

Data Types: single | double | logical | char | struct | cell
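The forms of 'CategoricalPredictors' listed above might be used as in the following sketch. The data and the predictor names 'size' and 'group' are hypothetical.

```matlab
% Synthetic data (illustrative only): column 2 holds category codes 1..3.
rng(1);
n = 100;
x = [rand(n,1), randi(3,n,1)];
y = x(:,1) + (x(:,2) == 2) + 0.1*randn(n,1);

% Numeric index form: column 2 is categorical.
tree1 = RegressionTree.fit(x,y,'CategoricalPredictors',2);

% Logical vector form: one entry per column of x.
tree2 = RegressionTree.fit(x,y,'CategoricalPredictors',[false true]);

% Name form: the names must match the entries in PredictorNames.
tree3 = RegressionTree.fit(x,y, ...
    'PredictorNames',{'size','group'}, ...
    'CategoricalPredictors',{'group'});

% 'all' form: every predictor is treated as categorical.
tree4 = RegressionTree.fit(x(:,2),y,'CategoricalPredictors','all');
```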

Cross-validation flag, specified as the comma-separated pair
consisting of 'CrossVal' and either 'on' or 'off'.

If 'on', RegressionTree.fit grows
a cross-validated decision tree with 10 folds. You can override this
cross-validation setting using one of the 'KFold', 'Holdout', 'Leaveout',
or 'CVPartition' name-value pair arguments. Note
that you can only use one of these four options ('KFold', 'Holdout', 'Leaveout',
or 'CVPartition') at a time when creating a cross-validated
tree.

Alternatively, cross-validate tree later
using the crossval method.
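Both routes to a cross-validated tree can be sketched as below, using synthetic data. kfoldLoss is used here on the assumption that it is available for RegressionPartitionedModel objects in your release.

```matlab
% Synthetic data (illustrative only).
rng(1);
x = rand(200,3);
y = 2*x(:,1) - x(:,2).^2 + 0.1*randn(200,1);

% Route 1: cross-validate while fitting (10 folds by default).
cvtree = RegressionTree.fit(x,y,'CrossVal','on');
L1 = kfoldLoss(cvtree);            % cross-validation loss estimate

% Route 2: fit first, then cross-validate the fitted tree.
tree    = RegressionTree.fit(x,y);
cvtree2 = crossval(tree);
L2 = kfoldLoss(cvtree2);
```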

Partition for cross-validated tree, specified as the comma-separated
pair consisting of 'CVPartition' and an object
of the cvpartition class created using cvpartition.

Note that if you use 'CVPartition', you cannot
use any of the 'KFold', 'Holdout',
or 'Leaveout' name-value pair arguments.
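A minimal sketch of passing a predefined partition, assuming synthetic data and an arbitrary choice of 5 folds:

```matlab
% Synthetic data (illustrative only).
rng(1);
x = rand(200,3);
y = 2*x(:,1) - x(:,2).^2 + 0.1*randn(200,1);

% Create a 5-fold partition over the observations, then reuse it for the fit.
c = cvpartition(numel(y),'KFold',5);
cvtree = RegressionTree.fit(x,y,'CVPartition',c);
```

Reusing one cvpartition object across several fits keeps the folds identical, which makes their cross-validation losses directly comparable.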

Fraction of data used for holdout validation, specified as the
comma-separated pair consisting of 'Holdout' and
a scalar value in the range [0,1]. Holdout validation
tests the specified fraction of the data, and uses the rest of the
data for training.

Note that if you use 'Holdout', you cannot
use any of the 'CVPartition', 'KFold',
or 'Leaveout' name-value pair arguments.

Leave-one-out cross-validation flag, specified as the comma-separated
pair consisting of 'Leaveout' and either 'on' or 'off'.
Use leave-one-out cross-validation by setting to 'on'.

Note that if you use 'Leaveout', you cannot
use any of the 'CVPartition', 'Holdout',
or 'KFold' name-value pair arguments.
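Because 'Holdout' and 'Leaveout' are mutually exclusive, each appears in its own call in the following sketch. The data are synthetic, and the holdout fraction 0.25 is an arbitrary choice.

```matlab
% Synthetic data (illustrative only); kept small because leave-one-out
% fits one tree per observation.
rng(1);
x = rand(40,2);
y = x(:,1) + 0.1*randn(40,1);

% Hold out 25% of the data for testing and train on the rest.
cvHoldout = RegressionTree.fit(x,y,'Holdout',0.25);

% Leave-one-out cross-validation.
cvLoo = RegressionTree.fit(x,y,'Leaveout','on');
```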

Leaf merge flag, specified as the comma-separated pair consisting
of 'MergeLeaves' and either 'on' or 'off'.
When 'on', RegressionTree.fit merges
leaves that originate from the same parent node, and that give a sum
of risk values greater than or equal to the risk associated with the parent
node. When 'off', RegressionTree.fit does
not merge leaves.

Minimum number of leaf node observations, specified as the comma-separated
pair consisting of 'MinLeaf' and a positive integer
value. Each leaf has at least MinLeaf observations.
If you supply both MinParent and MinLeaf, RegressionTree.fit uses
the setting that gives larger leaves: MinParent=max(MinParent,2*MinLeaf).

Minimum number of branch node observations, specified as the
comma-separated pair consisting of 'MinParent' and
a positive integer value. Each branch node in the tree has at least MinParent observations.
If you supply both MinParent and MinLeaf, RegressionTree.fit uses
the setting that gives larger leaves: MinParent=max(MinParent,2*MinLeaf).
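The interaction between MinParent and MinLeaf can be worked through with concrete numbers. In this sketch the values 10 and 8 are arbitrary, and effectiveMinParent is a hypothetical variable that only reproduces the formula above.

```matlab
% Synthetic data (illustrative only).
rng(1);
x = rand(200,3);
y = 2*x(:,1) - x(:,2).^2 + 0.1*randn(200,1);

minLeaf   = 8;
minParent = 10;

% Formula from the text: MinParent = max(MinParent,2*MinLeaf) = max(10,16) = 16
effectiveMinParent = max(minParent,2*minLeaf);

tree = RegressionTree.fit(x,y,'MinLeaf',minLeaf,'MinParent',minParent);
```

Here the MinLeaf setting dominates, because a branch node needs at least 16 observations to be split into two leaves of 8 or more.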

Number of predictors to select at random for each split, specified
as the comma-separated pair consisting of 'NVarToSample' and
a positive integer value. You can also specify 'all' to
use all available predictors.

Predictor variable names, specified as the comma-separated pair
consisting of 'PredictorNames' and a cell array
of strings containing the names for the predictor variables, in the
order in which they appear in x.

Pruning flag, specified as the comma-separated pair consisting
of 'Prune' and either 'on' or 'off'.
When 'on', RegressionTree.fit computes
the full tree and the optimal sequence of pruned subtrees. When 'off', RegressionTree.fit computes
the full tree without pruning.
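A sketch of fitting with 'Prune' set to 'on' and then selecting a pruned subtree. The data are synthetic; the prune method with its 'Level' option and the NumNodes property are assumed to be available in your release.

```matlab
% Synthetic data (illustrative only).
rng(1);
x = rand(200,3);
y = 2*x(:,1) - x(:,2).^2 + 0.1*randn(200,1);

% Compute the full tree together with the optimal pruning sequence.
tree = RegressionTree.fit(x,y,'Prune','on');

% Take the subtree one level down the pruning sequence.
pruned = prune(tree,'Level',1);

% Compare tree sizes before and after pruning.
disp([tree.NumNodes pruned.NumNodes])
```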

Quadratic error tolerance per node, specified as the comma-separated
pair consisting of 'QEToler' and a positive scalar
value. Splitting nodes stops when quadratic error per node drops below QEToler*QED,
where QED is the quadratic error for the entire
data computed before the decision tree is grown.

Response variable name, specified as the comma-separated pair
consisting of 'ResponseName' and a string containing
the name of the response variable in y.

Response transform function for transforming the raw response
values, specified as the comma-separated pair consisting of 'ResponseTransform' and
either a function handle or 'none'. The function
handle should accept a matrix of response values and return a matrix
of the same size. The default string 'none' means @(x)x,
or no transformation.

Add or change a ResponseTransform function
using dot notation:
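A minimal sketch of such an assignment, using synthetic data. The log and exp pairing is an illustrative assumption: the tree is fit to log(y), and the transform maps predictions back to the original scale.

```matlab
% Synthetic positive responses (illustrative only).
rng(1);
x = rand(200,2);
y = exp(x(:,1) + 0.1*randn(200,1));

% Fit on the log scale, then set the transform with dot notation.
tree = RegressionTree.fit(x,log(y));
tree.ResponseTransform = @exp;

yhat = predict(tree,x(1:5,:));    % predictions after applying @exp
```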

Surrogate decision splits flag, specified as the comma-separated
pair consisting of 'Surrogate' and 'on', 'off', 'all',
or a positive integer value.

When 'on', RegressionTree.fit finds
at most 10 surrogate splits at each branch node.

When set to a positive integer value, RegressionTree.fit finds
at most the specified number of surrogate splits at each branch node.

When set to 'all', RegressionTree.fit finds
all surrogate splits at each branch node. The 'all' setting
can consume considerable time and memory.

Use surrogate splits to improve the accuracy of predictions
for data with missing values. The setting also enables you to compute
measures of predictive association between predictors.
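The following sketch fits a tree with surrogate splits on data with missing values. The data are synthetic, with roughly 10% of the predictor entries set to NaN, and meanSurrVarAssoc is assumed to be the method your release provides for measures of predictive association.

```matlab
% Synthetic data (illustrative only) with about 10% missing predictor entries.
rng(1);
x = rand(200,3);
y = 2*x(:,1) - x(:,2).^2 + 0.1*randn(200,1);
x(rand(size(x)) < 0.1) = NaN;

% 'on' finds at most 10 surrogate splits at each branch node.
tree = RegressionTree.fit(x,y,'Surrogate','on');

% Predictions can fall back on surrogate splits where a value is missing.
yhat = predict(tree,x(1:5,:));

% Predictive association between predictors (method name is an assumption).
assoc = meanSurrVarAssoc(tree);
```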

Observation weights, specified as the comma-separated pair consisting
of 'Weights' and a vector of scalar values. The
length of Weights is the number of rows in x.
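A brief sketch of supplying observation weights. The data are synthetic, and giving the first 20 observations five times the weight of the rest is an arbitrary choice.

```matlab
% Synthetic data (illustrative only).
rng(1);
n = 100;
x = rand(n,2);
y = x(:,1) + 0.1*randn(n,1);

% One weight per row of x; emphasize the first 20 observations.
w = ones(n,1);
w(1:20) = 5;

tree = RegressionTree.fit(x,y,'Weights',w);
```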

Regression tree, returned as a regression tree object. Note
that using the 'CrossVal', 'KFold', 'Holdout', 'Leaveout',
or 'CVPartition' options results in a tree of class RegressionPartitionedModel.
You cannot use a partitioned tree for prediction, so this kind of
tree does not have a predict method.

Otherwise, tree is of class RegressionTree, and
you can use the predict method to make predictions.