tree = fitrtree(x,y) returns
a regression tree based on the input variables (also known as predictors,
features, or attributes) x and output (response) y.
The returned tree is a binary tree where each branching node is split
based on the values of a column of x.

tree = fitrtree(x,y,Name,Value) fits
a tree with additional options specified by one or more name-value
pair arguments. For example, you can grow a cross-validated tree,
hold out a fraction of data for validation, or specify observation
weights.
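A minimal sketch of both syntaxes. The synthetic data, the seed, and the weighting rule are assumptions made for illustration only.

```matlab
% Synthetic data (assumed): 100 observations, 3 predictors.
rng(1);                                   % fix the seed for reproducibility
x = rand(100,3);
y = 2*x(:,1) - x(:,2) + 0.1*randn(100,1);

% Basic syntax: fit a regression tree with default options.
tree = fitrtree(x,y);

% Name-value syntax: give observations with large responses twice the weight.
w = ones(100,1);
w(y > 1) = 2;
treeW = fitrtree(x,y,'Weights',w);

% Predict the response for the first three observations.
yhat = predict(tree,x(1:3,:));
```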

Predictor values, specified as a matrix of scalar values. Each
column of x represents one variable, and each
row represents one observation.

fitrtree considers NaN values
in x as missing values. fitrtree does
not use observations with all missing values for x in
the fit. fitrtree uses observations with some
missing values for x to find splits on variables
for which these observations have valid values.

Response values, specified as a vector of scalar values with
the same number of rows as x. Each entry in y is
the response to the data in the corresponding row of x.

fitrtree considers NaN values
in y to be missing values. fitrtree does
not use observations with missing values for y in
the fit.

Data Types: single | double

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments.
Name is the argument
name and Value is the corresponding
value. Name must appear
inside single quotes (' ').
You can specify several name and value pair
arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'CrossVal','on','MinParent',30 specifies
a cross-validated regression tree with a minimum of 30 observations
per branch node.

Categorical predictors list, specified as the comma-separated
pair consisting of 'CategoricalPredictors' and
one of the following:

A numeric vector with indices from 1 through p,
where p is the number of columns of x.

A logical vector of length p, where
a true entry means that the corresponding column
of x is a categorical variable.

A cell array of strings, where each element in the
array is the name of a predictor variable. The names must match entries
in the PredictorNames property.

A character matrix, where each row of the matrix is
a name of a predictor variable. Pad the names with extra blanks so
each row of the character matrix has the same length.

'all', meaning all predictors are
categorical.

Data Types: single | double | logical | char | cell
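The numeric-index and logical-vector forms can be sketched as follows. The data are synthetic, and storing the category as the integer codes 1 through 3 is an assumption of this example.

```matlab
rng(3);
n = 120;
xnum = rand(n,1);                  % continuous predictor
xcat = randi(3,n,1);               % category codes 1..3 stored as numbers
x = [xnum xcat];
y = xnum + 2*(xcat == 2) + 0.1*randn(n,1);

% Numeric index form: column 2 of x is categorical.
tree1 = fitrtree(x,y,'CategoricalPredictors',2);

% Logical vector form: one entry per column of x.
tree2 = fitrtree(x,y,'CategoricalPredictors',[false true]);
```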

Cross-validation flag, specified as the comma-separated pair
consisting of 'CrossVal' and either 'on' or 'off'.

If 'on', fitrtree grows
a cross-validated decision tree with 10 folds. You can override this
cross-validation setting using one of the 'KFold', 'Holdout', 'Leaveout',
or 'CVPartition' name-value pair arguments. You
can use only one of these arguments at a time when creating a cross-validated
tree.

Alternatively, cross-validate tree later
using the crossval method.
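Both routes to a cross-validated tree can be sketched as follows, on synthetic data assumed for illustration. The call to kfoldLoss is not described on this page; it is used here only to show one way to read off a cross-validated error.

```matlab
rng(4);
x = rand(100,3);
y = 2*x(:,1) + 0.1*randn(100,1);

% Route 1: cross-validate while fitting (10 folds by default).
cvtree = fitrtree(x,y,'CrossVal','on');

% Route 2: fit first, then cross-validate with the crossval method.
tree = fitrtree(x,y);
cvtree2 = crossval(tree);

% Cross-validated mean squared error of the first model.
mse = kfoldLoss(cvtree);
```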

Fraction of data used for holdout validation, specified as the
comma-separated pair consisting of 'Holdout' and
a scalar value in the range [0,1]. Holdout validation
tests the specified fraction of the data, and uses the rest of the
data for training.

If you use 'Holdout', you cannot use any
of the 'CVPartition', 'KFold',
or 'Leaveout' name-value pair arguments.
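A minimal holdout sketch, assuming synthetic data and a 20% holdout fraction chosen for illustration. The Trained property and kfoldLoss belong to the partitioned model and are not described on this page.

```matlab
rng(5);
x = rand(200,3);
y = 2*x(:,1) + 0.1*randn(200,1);

% Hold out 20% of the data for testing; train on the rest.
cvtree = fitrtree(x,y,'Holdout',0.2);

% Mean squared error on the held-out observations.
heldOutMSE = kfoldLoss(cvtree);

% The single tree trained on the remaining data.
trainedTree = cvtree.Trained{1};
```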

Leave-one-out cross-validation flag, specified as the comma-separated
pair consisting of 'Leaveout' and either 'on' or 'off'.
Specify 'on' to use leave-one-out cross-validation.

If you use 'Leaveout', you cannot use any
of the 'CVPartition', 'Holdout',
or 'KFold' name-value pair arguments.

Leaf merge flag, specified as the comma-separated pair consisting
of 'MergeLeaves' and either 'on' or 'off'.
If you specify 'on', then fitrtree merges
leaves that originate from the same parent node and whose risk values
sum to a value greater than or equal to the risk associated with the parent
node. If you specify 'off', then fitrtree does not merge leaves.

Minimum number of leaf node observations, specified as the comma-separated
pair consisting of 'MinLeaf' and a positive integer
value. Each leaf has at least MinLeaf observations.
If you supply both MinParent and MinLeaf, fitrtree uses the setting that gives larger
leaves: MinParent=max(MinParent,2*MinLeaf).

Minimum number of branch node observations, specified as the
comma-separated pair consisting of 'MinParent' and
a positive integer value. Each branch node in the tree has at least MinParent observations.
If you supply both MinParent and MinLeaf, fitrtree uses the setting that gives larger
leaves: MinParent=max(MinParent,2*MinLeaf).
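The interaction between the two settings can be worked through with concrete numbers. The values 10 and 8 and the synthetic data are assumptions for illustration. The option names are spelled as on this page; newer releases document them as 'MinParentSize' and 'MinLeafSize', so check your release if the names are not recognized.

```matlab
MinParent = 10;
MinLeaf   = 8;

% Rule stated above: the effective branch-node minimum.
% Here 2*MinLeaf = 16 exceeds MinParent = 10, so MinLeaf governs.
effectiveMinParent = max(MinParent, 2*MinLeaf);

% Passing both options to the fit (synthetic data, assumed).
rng(6);
x = rand(100,3);
y = 2*x(:,1) + 0.1*randn(100,1);
tree = fitrtree(x,y,'MinParent',MinParent,'MinLeaf',MinLeaf);
```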

Number of predictors to select at random for each split, specified
as the comma-separated pair consisting of 'NVarToSample' and
a positive integer value. You can also specify 'all' to
use all available predictors.

Predictor variable names, specified as the comma-separated pair
consisting of 'PredictorNames' and a cell array
of strings containing the names for the predictor variables, in the
order in which they appear in x.

Pruning flag, specified as the comma-separated pair consisting
of 'Prune' and either 'on' or 'off'.
When 'on', fitrtree computes
the full tree and the optimal sequence of pruned subtrees. When 'off',
fitrtree computes the full tree without pruning.
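A short sketch of how the pruning sequence might be used after fitting. The data are synthetic, and the prune method with its 'Level' option is not described on this page, so treat the call as an illustration rather than a reference.

```matlab
rng(7);
x = rand(100,3);
y = 2*x(:,1) + 0.1*randn(100,1);

% Fit with the pruning sequence computed.
tree = fitrtree(x,y,'Prune','on');

% Number of pruning levels available in the computed sequence.
numLevels = max(tree.PruneList);

% Cut the tree back by one level of the sequence.
pruned = prune(tree,'Level',1);
```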

Quadratic error tolerance per node, specified as the comma-separated
pair consisting of 'QEToler' and a positive scalar
value. Splitting nodes stops when the quadratic error per node drops
below QEToler*QED, where QED is
the quadratic error for all data computed before the decision tree
is grown.

Response variable name, specified as the comma-separated pair
consisting of 'ResponseName' and a string containing
the name of the response variable in y.

Response transform function for transforming the raw response
values, specified as the comma-separated pair consisting of 'ResponseTransform' and
either a function handle or 'none'. The function
handle should accept a matrix of response values and return a matrix
of the same size. The default string 'none' means @(x)x,
or no transformation.

Add or change a ResponseTransform function
using dot notation:
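A minimal sketch of that assignment, assuming a tree fit to the logarithm of a positive response so that @exp maps predictions back to the original scale. The data and the choice of @exp are illustrative assumptions.

```matlab
rng(8);
x = rand(100,3);
y = exp(x(:,1) + 0.1*randn(100,1));   % positive response (assumed)

% Fit on the log scale; predictions are on the log scale by default.
tree = fitrtree(x,log(y));
yhatLog = predict(tree,x(1:3,:));

% Add a response transform using dot notation.
tree.ResponseTransform = @exp;
yhat = predict(tree,x(1:3,:));        % now on the original scale
```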

Surrogate decision splits flag, specified as the comma-separated
pair consisting of 'Surrogate' and one of 'on', 'off', 'all',
or a positive integer value.

When 'on', fitrtree finds
at most 10 surrogate splits at each branch node.

When set to a positive integer value, fitrtree finds at most the specified number
of surrogate splits at each branch node.

When set to 'all', fitrtree finds all surrogate splits at
each branch node. The 'all' setting can use considerable
time and memory.

Use surrogate splits to improve the accuracy of predictions
for data with missing values. The setting also lets you compute measures
of predictive association between predictors.
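The following sketch fits a tree with surrogate splits to data with missing predictor values. The synthetic data and the 10% missing rate are assumptions for illustration.

```matlab
rng(10);
n = 200;
x = rand(n,3);
x(:,2) = x(:,1) + 0.05*randn(n,1);    % column 2 closely tracks column 1
y = 2*x(:,1) + 0.1*randn(n,1);

% Knock out roughly 10% of the predictor values.
x(rand(n,3) < 0.1) = NaN;

% Find at most 10 surrogate splits at each branch node.
tree = fitrtree(x,y,'Surrogate','on');

% Predict for all rows, including those with missing values.
yhat = predict(tree,x);
```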

Observation weights, specified as the comma-separated pair consisting
of 'Weights' and a vector of scalar values. The
length of Weights is the number of rows in x.

Regression tree, returned as a regression tree object. Using
the 'CrossVal', 'KFold', 'Holdout', 'Leaveout',
or 'CVPartition' options results in a tree of class RegressionPartitionedModel.
You cannot use a partitioned tree for prediction, so this kind of
tree does not have a predict method.

Otherwise, tree is of class RegressionTree, and
you can use the predict method to make predictions.
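The two return classes can be checked directly, as in this sketch on synthetic data assumed for illustration. The kfoldPredict call is not described on this page; it is shown as one way to obtain out-of-fold predictions from a partitioned model.

```matlab
rng(9);
x = rand(100,3);
y = 2*x(:,1) + 0.1*randn(100,1);

tree   = fitrtree(x,y);              % plain fit
cvtree = fitrtree(x,y,'KFold',5);    % cross-validated fit

classTree = class(tree);
classCV   = class(cvtree);

% Predictions from the plain tree.
yhat = predict(tree,x);

% Out-of-fold predictions from the partitioned model.
yhatCV = kfoldPredict(cvtree);
```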