tree = fitrtree(x,y) returns
a regression tree based on the input variables (also known as predictors,
features, or attributes) x and output (response) y.
The returned tree is a binary tree where each branching node is split
based on the values of a column of x.
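As a minimal sketch of this syntax, the following fits a tree to synthetic data and predicts on the training inputs. The data, variable names, and random seed are assumptions made for illustration only. The code requires Statistics and Machine Learning Toolbox.

```matlab
% Synthetic data (illustrative assumption): 100 observations, 2 predictors
rng(1);                                      % fix the seed for reproducibility
x = rand(100,2);
y = 3*x(:,1) - 2*x(:,2) + 0.1*randn(100,1);

tree = fitrtree(x,y);                        % grow a binary regression tree
yhat = predict(tree,x);                      % one predicted response per row of x
```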

tree = fitrtree(x,y,Name,Value) fits
a tree with additional options specified by one or more name-value
pair arguments. For example, you can grow a cross-validated tree,
hold out a fraction of data for validation, or specify observation
weights.

Predictor values, specified as a matrix of scalar values. Each
column of x represents one variable, and each
row represents one observation.

fitrtree considers NaN values
in x as missing values. fitrtree does
not use observations with all missing values for x in
the fit. fitrtree uses observations with some
missing values for x to find splits on variables
for which these observations have valid values.
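The following sketch illustrates the two kinds of missing data described above. The synthetic data and the choice of which entries to set to NaN are assumptions for illustration.

```matlab
rng(1);
x = rand(50,2);
y = x(:,1) + x(:,2);

x(1,:) = NaN;    % every predictor missing: this observation is not used in the fit
x(2,1) = NaN;    % one predictor missing: still usable for splits on column 2

tree = fitrtree(x,y);
```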

Response values, specified as a vector of scalar values with
the same number of rows as x. Each entry in y is
the response to the data in the corresponding row of x.

fitrtree considers NaN values
in y to be missing values. fitrtree does
not use observations with missing values for y in
the fit.

Data Types: single | double

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments.
Name is the argument
name and Value is the corresponding
value. Name must appear
inside single quotes (' ').
You can specify several name and value pair
arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'CrossVal','on','MinParentSize',30 specifies
a cross-validated regression tree with a minimum of 30 observations
per branch node.
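The example above can be sketched in full as follows. The synthetic data is an assumption for illustration. kfoldLoss is used here to summarize the cross-validated fit.

```matlab
rng(1);
x = rand(200,3);
y = x(:,1).^2 + x(:,2) + 0.1*randn(200,1);

% Name-value pairs follow x and y and can appear in any order
cvtree = fitrtree(x,y,'CrossVal','on','MinParentSize',30);
mse = kfoldLoss(cvtree);     % cross-validated mean squared error
```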

Categorical predictors list, specified as the comma-separated
pair consisting of 'CategoricalPredictors' and
one of the following:

A numeric vector with indices from 1 through p,
where p is the number of columns of x.

A logical vector of length p, where
a true entry means that the corresponding column
of x is a categorical variable.

A cell array of strings, where each element in the
array is the name of a predictor variable. The names must match entries
in the PredictorNames property.

A character matrix, where each row of the matrix is
a name of a predictor variable. Pad the names with extra blanks so
each row of the character matrix has the same length.

'all', meaning all predictors are
categorical.

Data Types: single | double | logical | char | cell
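The sketch below marks the first column of x as categorical in three of the ways listed above. The data, the predictor names 'group' and 'z', and the integer category codes are hypothetical choices made for illustration.

```matlab
rng(1);
n = 120;
group = randi(3,n,1);                 % column 1: category codes 1, 2, 3
z = rand(n,1);                        % column 2: continuous predictor
x = [group z];
y = 2*(group == 2) + z + 0.1*randn(n,1);

t1 = fitrtree(x,y,'CategoricalPredictors',1);              % numeric index
t2 = fitrtree(x,y,'CategoricalPredictors',[true false]);   % logical vector
t3 = fitrtree(x,y,'PredictorNames',{'group','z'}, ...
    'CategoricalPredictors',{'group'});                    % predictor name
```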

Cross-validation flag, specified as the comma-separated pair
consisting of 'CrossVal' and either 'on' or 'off'.

If 'on', fitrtree grows
a cross-validated decision tree with 10 folds. You can override this
cross-validation setting using one of the 'KFold', 'Holdout', 'Leaveout',
or 'CVPartition' name-value pair arguments. You
can only use one of these arguments at a time when creating a cross-validated
tree.

Alternatively, cross-validate the tree later by
using the crossval method.
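A minimal sketch of cross-validating after training, assuming synthetic data chosen only for illustration:

```matlab
rng(1);
x = rand(150,2);
y = sin(2*pi*x(:,1)) + 0.1*randn(150,1);

tree   = fitrtree(x,y);       % ordinary tree that supports predict
cvtree = crossval(tree);      % cross-validate the trained tree (10 folds by default)
mse    = kfoldLoss(cvtree);   % cross-validated mean squared error
```

Training first and cross-validating afterward keeps the full-data tree available for prediction.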

Fraction of data used for holdout validation, specified as the
comma-separated pair consisting of 'Holdout' and
a scalar value in the range (0,1). Holdout validation
tests the specified fraction of the data, and uses the rest of the
data for training.

If you use 'Holdout', you cannot use any
of the 'CVPartition', 'KFold',
or 'Leaveout' name-value pair arguments.
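For instance, the following sketch reserves 25% of the observations for testing. The synthetic data and the 0.25 fraction are assumptions for illustration.

```matlab
rng(1);
x = rand(200,2);
y = x(:,1) - x(:,2) + 0.1*randn(200,1);

% Train on about 75% of the rows and test on the held-out 25%
cvtree = fitrtree(x,y,'Holdout',0.25);
mse = kfoldLoss(cvtree);      % mean squared error on the held-out data
```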

Leave-one-out cross-validation flag, specified as the comma-separated
pair consisting of 'Leaveout' and either 'on' or 'off'.
Specify 'on' to use leave-one-out cross-validation.

If you use 'Leaveout', you cannot use any
of the 'CVPartition', 'Holdout',
or 'KFold' name-value pair arguments.

Leaf merge flag, specified as the comma-separated pair consisting
of 'MergeLeaves' and either 'on' or 'off'.

If MergeLeaves is 'on',
then fitrtree merges leaves that originate
from the same parent node and whose risk values sum to a value greater
than or equal to the risk associated with the parent node. Otherwise, fitrtree does
not merge leaves.

Minimum number of leaf node observations, specified as the comma-separated
pair consisting of 'MinLeafSize' and a positive
integer value. Each leaf has at least MinLeafSize observations.
If you supply both MinParentSize and MinLeafSize, fitrtree uses the setting that gives larger
leaves: MinParentSize = max(MinParentSize,2*MinLeafSize).

Minimum number of branch node observations, specified as the
comma-separated pair consisting of 'MinParentSize' and
a positive integer value. Each branch node in the tree has at least MinParentSize observations.
If you supply both MinParentSize and MinLeafSize, fitrtree uses the setting that gives larger
leaves: MinParentSize = max(MinParentSize,2*MinLeafSize).
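The rule above can be worked through with concrete numbers. In this sketch the values 10 and 8, the synthetic data, and the variable name effectiveMinParent are assumptions for illustration.

```matlab
rng(1);
x = rand(200,2);
y = x(:,1) + 0.1*randn(200,1);

minParent = 10;
minLeaf   = 8;

% Setting that gives larger leaves, per the rule stated above
effectiveMinParent = max(minParent, 2*minLeaf);   % max(10,16) = 16

tree = fitrtree(x,y,'MinParentSize',minParent,'MinLeafSize',minLeaf);
```

Here the leaf-size constraint dominates, because a branch node needs at least 2*8 = 16 observations to produce two leaves of 8 observations each.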

Number of predictors to select at random for each split, specified
as the comma-separated pair consisting of 'NumVariablesToSample' and
a positive integer value. You can also specify 'all' to
use all available predictors.

Predictor variable names, specified as the comma-separated pair
consisting of 'PredictorNames' and a cell array
of strings containing the names for the predictor variables, in the
order in which they appear in x.

Flag to estimate the optimal sequence of pruned subtrees, specified
as the comma-separated pair consisting of 'Prune' and
either 'on' or 'off'.

If Prune is 'on', then fitrtree grows
the regression tree and estimates the optimal sequence of pruned subtrees,
but does not prune the regression tree. Otherwise, fitrtree grows
the regression tree without estimating the optimal sequence of pruned
subtrees.

To prune a trained regression tree, pass the regression tree
to prune.
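A minimal sketch of growing a tree and then pruning it explicitly. The synthetic data and the pruning level of 1 are assumptions for illustration.

```matlab
rng(1);
x = rand(150,2);
y = sin(2*pi*x(:,1)) + 0.1*randn(150,1);

tree   = fitrtree(x,y);             % grows the tree and estimates the pruning sequence
pruned = prune(tree,'Level',1);     % returns a copy pruned to level 1
```

The original tree is left unchanged. Only the returned object is pruned.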

Quadratic error tolerance per node, specified as the comma-separated
pair consisting of 'QuadraticErrorTolerance' and
a positive scalar value. Node splitting stops when the quadratic
error per node drops below QuadraticErrorTolerance*QED,
where QED is the quadratic error for all data computed
before the decision tree is grown.

Response variable name, specified as the comma-separated pair
consisting of 'ResponseName' and a string containing
the name of the response variable in y.

Response transform function for transforming the raw response
values, specified as the comma-separated pair consisting of 'ResponseTransform' and
either a function handle or 'none'. The function
handle should accept a matrix of response values and return a matrix
of the same size. The default string 'none' means @(x)x,
or no transformation.

To add or change a ResponseTransform function on a trained tree,
use dot notation, for example tree.ResponseTransform = @function.
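As one possible use of dot notation, the sketch below fits a tree on the log of a positive response and then sets ResponseTransform so that predictions are returned on the original scale. The synthetic data and the choice of @exp are assumptions for illustration.

```matlab
rng(1);
x = rand(100,2);
y = exp(x(:,1) + 0.1*randn(100,1));   % strictly positive response

tree = fitrtree(x,log(y));            % fit on the log scale
rawPred = predict(tree,x);            % predictions on the log scale

tree.ResponseTransform = @exp;        % change the transform with dot notation
pred = predict(tree,x);               % predictions mapped back through exp
```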

Surrogate decision splits flag, specified as the comma-separated
pair consisting of 'Surrogate' and one of 'on', 'off', 'all',
or a positive integer value.

When set to 'on', fitrtree finds
at most 10 surrogate splits at each branch node.

When set to a positive integer value, fitrtree finds at most the specified number
of surrogate splits at each branch node.

When set to 'all', fitrtree finds all surrogate splits at
each branch node. The 'all' setting can use considerable
time and memory.

Use surrogate splits to improve the accuracy of predictions
for data with missing values. The setting also lets you compute measures
of predictive association between predictors.
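The following sketch trains a tree with surrogate splits on data that contains missing predictor values. The synthetic data and the 10% missing rate are assumptions for illustration.

```matlab
rng(1);
x = rand(200,3);
y = x(:,1) + 2*x(:,2) + 0.1*randn(200,1);
x(rand(size(x)) < 0.1) = NaN;             % knock out about 10% of the entries

tree = fitrtree(x,y,'Surrogate','on');    % at most 10 surrogate splits per branch node
yhat = predict(tree,x);                   % prediction can fall back on surrogate splits
```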

Observation weights, specified as the comma-separated pair consisting
of 'Weights' and a vector of scalar values. The
length of Weights is the number of rows in x.
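A minimal sketch of supplying observation weights. The synthetic data and the particular weights are assumptions for illustration.

```matlab
rng(1);
n = 100;
x = rand(n,2);
y = x(:,1) + 0.1*randn(n,1);

w = ones(n,1);        % one weight per row of x
w(1:10) = 5;          % give the first 10 observations more influence

tree = fitrtree(x,y,'Weights',w);
```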

Regression tree, returned as a regression tree object. Note
that using the 'CrossVal', 'KFold', 'Holdout', 'Leaveout',
or 'CVPartition' options results in a tree of class RegressionPartitionedModel.
You cannot use a partitioned tree for prediction, so this kind of
tree does not have a predict method.

Otherwise, tree is of class RegressionTree, and
you can use the predict method to make predictions.
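The difference between the two return types can be sketched as follows. The synthetic data and the choice of 5 folds are assumptions for illustration. kfoldPredict is shown as a way to get out-of-fold responses from a partitioned tree.

```matlab
rng(1);
x = rand(150,2);
y = x(:,1) - x(:,2) + 0.1*randn(150,1);

tree   = fitrtree(x,y);              % RegressionTree: supports predict
yhat   = predict(tree,x);

cvtree = fitrtree(x,y,'KFold',5);    % partitioned model: no predict method
yhatCV = kfoldPredict(cvtree);       % out-of-fold predicted responses instead
```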

By default, Prune is 'on'.
However, this specification does not prune the regression tree. To
prune a trained regression tree, pass the regression tree to prune.