vals = crossval(fun,X)
vals = crossval(fun,X,Y,...)
mse = crossval('mse',X,y,'Predfun',predfun)
mcr = crossval('mcr',X,y,'Predfun',predfun)
val = crossval(criterion,X1,X2,...,y,'Predfun',predfun)
vals = crossval(...,param1,val1,param2,val2,...)
vals = crossval(fun,X) performs 10-fold cross-validation for the function fun, applied to the data in X.
fun is a function handle to a function with two inputs, the training subset of X, XTRAIN, and the test subset of X, XTEST, as follows:
testval = fun(XTRAIN,XTEST)
Each time it is called, fun should use XTRAIN to fit a model, then return some criterion testval computed on XTEST using that fitted model.
X can be a column vector or a matrix. Rows of X correspond to observations; columns correspond to variables or features. Each row of vals contains the result of applying fun to one test set. If testval is a non-scalar value, crossval converts it to a row vector using linear indexing and stores it in one row of vals.
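As a minimal sketch of this calling pattern, the following treats the training-set mean as the fitted "model" and scores it on the test rows. The random data and the handle names fun and fun2 are illustrative assumptions, not part of crossval.

```matlab
% Illustrative data: one variable, 100 observations (assumed for this sketch)
X = randn(100,1);

% "Fit" on XTRAIN (the sample mean), then score on XTEST
fun = @(XTRAIN,XTEST) mean((XTEST - mean(XTRAIN)).^2);
vals = crossval(fun,X);       % default 10-fold: one row per test set

% A non-scalar testval (here 2-by-1) is flattened into one row of vals
fun2 = @(XTRAIN,XTEST) [mean(XTEST) - mean(XTRAIN); std(XTEST) - std(XTRAIN)];
vals2 = crossval(fun2,X);     % one row per test set, two columns
```

With the default 10 folds, vals has 10 rows and vals2 has 10 rows and 2 columns.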
vals = crossval(fun,X,Y,...) is used when data are stored in separate variables X, Y, ... . All variables (column vectors, matrices, or arrays) must have the same number of rows. fun is called with the training subsets of X, Y, ... , followed by the test subsets of X, Y, ... , as follows:
testvals = fun(XTRAIN,YTRAIN,...,XTEST,YTEST,...)
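The argument order (all training subsets first, then all test subsets) can be sketched as follows. The least-squares fit via the backslash operator and the variable names are assumptions chosen for illustration; any model fitted on the training subsets would do.

```matlab
load('fisheriris');               % provides meas (150-by-4) and species
Y = meas(:,1);                    % response
X = [ones(size(Y)) meas(:,2:4)];  % predictors with an intercept column

% Fit least squares on the training rows, return the test-set MSE
fun = @(XTRAIN,YTRAIN,XTEST,YTEST) ...
    mean((YTEST - XTEST*(XTRAIN\YTRAIN)).^2);

vals = crossval(fun,X,Y);         % one MSE per test set
```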
mse = crossval('mse',X,y,'Predfun',predfun) returns mse, a scalar containing a 10-fold cross-validation estimate of mean-squared error for the function predfun. X can be a column vector, matrix, or array of predictors. y is a column vector of response values. X and y must have the same number of rows.
predfun is a function handle called with the training subset of X, the training subset of y, and the test subset of X as follows:
yfit = predfun(XTRAIN,ytrain,XTEST)
Each time it is called, predfun should use XTRAIN and ytrain to fit a regression model and then return fitted values in a column vector yfit. Each row of yfit contains the predicted values for the corresponding row of XTEST. crossval computes the squared errors between yfit and the corresponding response test set, and returns the overall mean across all test sets.
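To make the 'mse' computation concrete, here is a hedged sketch that reproduces it by hand on a fixed partition. It assumes the overall mean is taken over all test observations; with 150 observations and 10 folds every test set has 15 rows, so a mean of per-fold means would give the same number. The names cp, sse, and manualMse are illustrative.

```matlab
load('fisheriris');
y = meas(:,1);
X = [ones(size(y)) meas(:,2:4)];
regf = @(XTRAIN,ytrain,XTEST) XTEST*regress(ytrain,XTRAIN);

cp = cvpartition(size(y,1),'kfold',10);   % fixed partition, reused below

% Manual loop: accumulate squared errors over every test set
sse = 0;
for i = 1:cp.NumTestSets
    tr = training(cp,i);                  % logical index of training rows
    te = test(cp,i);                      % logical index of test rows
    yfit = regf(X(tr,:),y(tr),X(te,:));
    sse = sse + sum((y(te) - yfit).^2);
end
manualMse = sse / sum(cp.TestSize);

% Same partition passed to crossval for comparison
cvMse = crossval('mse',X,y,'Predfun',regf,'partition',cp);
```

The two values should agree up to floating-point rounding.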
mcr = crossval('mcr',X,y,'Predfun',predfun) returns mcr, a scalar containing a 10-fold cross-validation estimate of misclassification rate (the proportion of misclassified samples) for the function predfun. The matrix X contains predictor values and the vector y contains class labels. predfun should use XTRAIN and ytrain to fit a classification model and return yfit as the predicted class labels for XTEST. crossval computes the number of misclassifications between yfit and the corresponding response test set, and returns the overall misclassification rate across all test sets.
val = crossval(criterion,X1,X2,...,y,'Predfun',predfun), where criterion is 'mse' or 'mcr', returns a cross-validation estimate of mean-squared error (for a regression model) or misclassification rate (for a classification model) with predictor values in X1, X2, ... and, respectively, response values or class labels in y. X1, X2, ... and y must have the same number of rows. predfun is a function handle called with the training subsets of X1, X2, ..., the training subset of y, and the test subsets of X1, X2, ..., as follows:
yfit = predfun(X1TRAIN,X2TRAIN,...,ytrain,X1TEST,X2TEST,...)
yfit should be a column vector containing the fitted values.
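A short sketch of the multi-predictor form, assuming the predictors are split into two blocks purely for illustration (the split and the names X1 and X2 are not prescribed by crossval):

```matlab
load('fisheriris');
y  = meas(:,1);
X1 = [ones(size(y)) meas(:,2)];   % first predictor block (with intercept)
X2 = meas(:,3:4);                 % second predictor block

% Arguments arrive as: training X1, training X2, training y, test X1, test X2
predfun = @(X1TRAIN,X2TRAIN,ytrain,X1TEST,X2TEST) ...
    [X1TEST X2TEST]*regress(ytrain,[X1TRAIN X2TRAIN]);

val = crossval('mse',X1,X2,y,'Predfun',predfun);
```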
vals = crossval(...,param1,val1,param2,val2,...) specifies optional parameter name/value pairs from the following table:
| Name | Value |
|---|---|
| 'holdout' | A scalar specifying the ratio or the number of observations p for holdout cross-validation. When 0 < p < 1, approximately p*n observations for the test set are randomly selected. When p is an integer, p observations for the test set are randomly selected. |
| 'kfold' | A scalar specifying the number of folds k for k-fold cross-validation. |
| 'leaveout' | Specifies leave-one-out cross-validation. The value must be 1. |
| 'mcreps' | A positive integer specifying the number of Monte-Carlo repetitions for validation. If the first input of crossval is 'mse' or 'mcr', crossval returns the mean of mean-squared error or misclassification rate across all of the Monte-Carlo repetitions. Otherwise, crossval concatenates the values vals from all of the Monte-Carlo repetitions along the first dimension. |
| 'partition' | An object c of the cvpartition class, specifying the cross-validation type and partition. |
| 'stratify' | A column vector group specifying groups for stratification. Both training and test sets have roughly the same class proportions as in group. NaNs or empty strings in group are treated as missing values, and the corresponding rows of the data are ignored. |
| 'options' | A struct that specifies options that govern the computation of crossval. One option requests that crossval conduct multiple function evaluations using multiple processors, if the Parallel Computing Toolbox is available. Two options specify the random number streams to use in constructing randomized cvpartition objects. You can create this argument with a call to statset, and you can retrieve values of the individual fields with a call to statget. See the statset reference for the applicable parameters. |
Only one of 'kfold', 'holdout', 'leaveout', or 'partition' can be specified, and 'partition' cannot be specified with 'stratify'. If both 'partition' and 'mcreps' are specified, the first Monte-Carlo repetition uses the partition information in the cvpartition object, and the repartition method is called to generate new partitions for each of the remaining repetitions. If no cross-validation type is specified, the default is 10-fold cross-validation.
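The effect of the cross-validation type on the shape of vals can be sketched as below. The data and the function handle are illustrative assumptions; the size comments follow from the rule that each test set contributes one row and that 'mcreps' concatenates repetitions along the first dimension.

```matlab
X = randn(60,1);                                   % illustrative data
fun = @(XTRAIN,XTEST) mean((XTEST - mean(XTRAIN)).^2);

v1 = crossval(fun,X,'kfold',5);             % 5 test sets: 5 rows
v2 = crossval(fun,X,'holdout',0.25);        % 1 test set: a single value
v3 = crossval(fun,X,'leaveout',1);          % 60 test sets: 60 rows
v4 = crossval(fun,X,'kfold',5,'mcreps',3);  % 3 repetitions of 5 folds: 15 rows
```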
Note: When using cross-validation with classification algorithms, stratification is preferred. Otherwise, some test sets may not include observations from all classes.
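One way to request stratification without building a cvpartition object is the 'stratify' parameter, sketched here with the class labels themselves as the grouping variable. The use of classify as the classifier is an illustrative choice.

```matlab
load('fisheriris');               % meas: predictors, species: class labels

% Predict class labels for the test rows from a model fit on the training rows
classf = @(XTRAIN,ytrain,XTEST) classify(XTEST,XTRAIN,ytrain);

% Default 10-fold cross-validation, stratified by class label
mcr = crossval('mcr',meas,species,'Predfun',classf,'stratify',species);
```

Because 'partition' cannot be combined with 'stratify', this form is an alternative to passing a stratified cvpartition object, not an addition to it.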
Compute mean-squared error for regression using 10-fold cross-validation:
load('fisheriris');
y = meas(:,1);
X = [ones(size(y,1),1),meas(:,2:4)];
regf=@(XTRAIN,ytrain,XTEST)(XTEST*regress(ytrain,XTRAIN));
cvMse = crossval('mse',X,y,'predfun',regf)
cvMse =
0.1015
Compute misclassification rate using stratified 10-fold cross-validation:
load('fisheriris');
y = species;
X = meas;
cp = cvpartition(y,'k',10); % Stratified cross-validation
classf = @(XTRAIN, ytrain,XTEST)(classify(XTEST,XTRAIN,...
ytrain));
cvMCR = crossval('mcr',X,y,'predfun',classf,'partition',cp)
cvMCR =
0.0200
Compute the confusion matrix using stratified 10-fold cross-validation:
load('fisheriris');
y = species;
X = meas;
order = unique(y); % Order of the group labels
cp = cvpartition(y,'k',10); % Stratified cross-validation
f = @(xtr,ytr,xte,yte)confusionmat(yte,...
classify(xte,xtr,ytr),'order',order);
cfMat = crossval(f,X,y,'partition',cp);
cfMat = reshape(sum(cfMat),3,3)
cfMat =
50 0 0
0 48 2
0 1 49
cfMat is the summation of 10 confusion matrices from 10 test sets.
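As a small worked check, the misclassification rate implied by a summed confusion matrix is the off-diagonal count divided by the total count. The sketch below uses the matrix printed above as a literal; the variable names nWrong and mcrFromCf are illustrative.

```matlab
cfMat = [50 0 0; 0 48 2; 0 1 49];        % summed confusion matrix from above
nWrong = sum(cfMat(:)) - trace(cfMat);   % off-diagonal entries: 2 + 1 = 3
mcrFromCf = nWrong / sum(cfMat(:));      % 3/150 = 0.02
```

Partitions are drawn at random, so a different run may produce a different matrix and a different rate.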
© 1984-2009 The MathWorks, Inc.