Estimate loss using cross-validation

`err = crossval(criterion,X,y,'Predfun',predfun)` returns a 10-fold cross-validation error estimate for the function `predfun` based on the specified `criterion`, either `'mse'` (mean squared error) or `'mcr'` (misclassification rate). The rows of `X` and `y` correspond to observations, and the columns of `X` correspond to predictor variables.

In this case, `crossval` performs 10-fold cross-validation as follows:

1. Split the observations in the predictor data `X` and the response variable `y` into 10 groups, each of which has approximately the same number of observations.
2. Use the last nine groups of observations to train a model as specified in `predfun`. Use the first group of observations as test data, pass the test predictor data to the trained model, and compute predicted values as specified in `predfun`. Compute the error specified by `criterion`.
3. Use the first group and the last eight groups of observations to train a model as specified in `predfun`. Use the second group of observations as test data, pass the test data to the trained model, and compute predicted values as specified in `predfun`. Compute the error specified by `criterion`.
4. Proceed in a similar manner until each group of observations is used as test data exactly once.
5. Return the mean error estimate as the scalar `err`.
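The steps above can be illustrated with a minimal sketch. The simulated data and the linear-regression `predfun` are assumptions chosen for illustration, not part of the syntax definition:

```matlab
% Sketch: 10-fold cross-validated MSE for a simple linear regression.
% X and y are simulated; any predfun with this signature works.
rng(1)                                    % for reproducibility
X = randn(100,2);
y = X*[1;2] + randn(100,1);

% predfun must accept (Xtrain,ytrain,Xtest) and return predicted values.
predfun = @(Xtrain,ytrain,Xtest) Xtest * regress(ytrain,Xtrain);

err = crossval('mse',X,y,'Predfun',predfun)   % scalar cross-validated MSE
```

Here `regress` fits the coefficients on each training fold, and the returned `err` is the mean of the 10 per-fold mean squared errors.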

`values = crossval(fun,X)` performs 10-fold cross-validation for the function `fun`, applied to the data in `X`. The rows of `X` correspond to observations, and the columns of `X` correspond to variables.

`crossval` typically performs 10-fold cross-validation as follows:

1. Split the data in `X` into 10 groups, each of which has approximately the same number of observations.
2. Use the last nine groups of data to train a model as specified in `fun`. Use the first group of data as a test set, pass the test set to the trained model, and compute some value (for example, loss) as specified in `fun`.
3. Use the first group and the last eight groups of data to train a model as specified in `fun`. Use the second group of data as a test set, pass the test set to the trained model, and compute some value as specified in `fun`.
4. Proceed in a similar manner until each group of data is used as a test set exactly once.
5. Return the 10 computed values as the vector `values`.
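A minimal sketch of this syntax, where `fun` accepts a training set and a test set and returns one value per fold; the particular statistic computed here is an arbitrary illustration:

```matlab
% Sketch: one computed value per fold.
% fun has the signature testval = fun(Xtrain,Xtest); here it simply
% returns the mean of the test fold's first column.
rng(1)
X = randn(100,3);
fun = @(Xtrain,Xtest) mean(Xtest(:,1));
values = crossval(fun,X)        % 10-by-1 vector, one value per fold
```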

`___ = crossval(___,Name,Value)` specifies cross-validation options using one or more name-value pair arguments in addition to any of the input argument combinations and output arguments in previous syntaxes. For example, `'KFold',5` specifies to perform 5-fold cross-validation.
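For example, the `'KFold'` name-value pair changes the number of folds; the simulated regression setup below is an assumption for illustration:

```matlab
% Sketch: 5-fold instead of the default 10-fold cross-validation.
rng(1)
X = randn(100,2);
y = X*[1;2] + randn(100,1);
predfun = @(Xtrain,ytrain,Xtest) Xtest * regress(ytrain,Xtrain);

err5 = crossval('mse',X,y,'Predfun',predfun,'KFold',5)  % mean of 5 fold errors
```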

A good practice is to use stratification (see `Stratify`) when you use cross-validation with classification algorithms. Otherwise, some test sets might not include observations for all classes.
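A sketch of stratified cross-validation for a classifier, passing the class labels to `'Stratify'` so each fold preserves the class proportions; the `fisheriris` data set and discriminant `classify` step are illustrative choices:

```matlab
% Sketch: stratified 10-fold misclassification rate for a classifier.
load fisheriris                              % meas (150-by-4), species
y = grp2idx(species);                        % numeric class labels

% predfun returns predicted class labels for the test fold.
predfun = @(Xtrain,ytrain,Xtest) classify(Xtest,Xtrain,ytrain);

% Stratify on y so every fold contains observations from all classes.
mcr = crossval('mcr',meas,y,'Predfun',predfun,'Stratify',y)
```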

Many classification and regression functions allow you to perform cross-validation directly.

When you use fit functions such as `fitcsvm`, `fitctree`, and `fitrtree`, you can specify cross-validation options by using name-value pair arguments. Alternatively, you can first create models with these fit functions and then create a partitioned object by using the `crossval` object function. Use the `kfoldLoss` and `kfoldPredict` object functions to compute the loss and predicted values for the partitioned object. For more information, see `ClassificationPartitionedModel` and `RegressionPartitionedModel`.

You can also specify cross-validation options when you perform lasso or elastic net regularization using `lasso` and `lassoglm`.
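As a sketch of the direct route, a fit function can cross-validate at training time; the `fisheriris` data and classification tree are illustrative choices:

```matlab
% Sketch: cross-validate via the fit function itself.
load fisheriris
cvModel = fitctree(meas,species,'CrossVal','on');   % 10-fold by default
L = kfoldLoss(cvModel)                              % cross-validated loss
```

The result `cvModel` is a partitioned model object, so `kfoldLoss` and `kfoldPredict` apply directly without a separate call to `crossval`.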

`classify` | `confusionmat` | `cvpartition` | `kmeans` | `pca` | `regress`