cvpartition

Class: cvpartition

Create cross validation partition for data

Syntax

c = cvpartition(n,'KFold',k)
c = cvpartition(group,'KFold',k)
c = cvpartition(n,'HoldOut',p)
c = cvpartition(group,'HoldOut',p)
c = cvpartition(n,'LeaveOut')
c = cvpartition(n,'resubstitution')

Description

c = cvpartition(n,'KFold',k) constructs an object c of the cvpartition class defining a random partition for k-fold cross validation on n observations. The partition divides the observations into k disjoint subsamples (or folds), chosen randomly but with roughly equal size. The default value of k is 10.
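As an illustrative sketch of how such a partition might be used, the following loops over the folds and retrieves the logical index vectors with the training and test methods. The values n = 20 and k = 5 are assumptions chosen only for the example.

```matlab
% Sketch: 5-fold partition of 20 observations (n and k are illustrative)
n = 20;
k = 5;
c = cvpartition(n,'KFold',k);

for i = 1:c.NumTestSets
    trainIdx = training(c,i);   % logical n-by-1 vector, true for training rows
    testIdx  = test(c,i);       % logical n-by-1 vector, true for test rows
    fprintf('Fold %d: %d training, %d test observations\n', ...
        i, sum(trainIdx), sum(testIdx));
end
```

Each observation appears in exactly one test fold, so the test sizes sum to n.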

c = cvpartition(group,'KFold',k) creates a random partition for a stratified k-fold cross validation. group is a numeric vector, categorical array, character array, or cell array of character vectors indicating the class of each observation. Each subsample has roughly equal size and roughly the same class proportions as in group. cvpartition treats NaNs or empty character vectors in group as missing values.
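To see the effect of stratification, one option is to count the classes that land in each test fold. The sketch below assumes the fisheriris sample data set and converts the labels to a categorical array so that countcats can be used; the choice of 5 folds is arbitrary.

```matlab
% Sketch: class counts per test fold in a stratified 5-fold partition
load fisheriris               % provides species, a cell array of class names
g = categorical(species);     % categorical labels, one per observation
c = cvpartition(g,'KFold',5);

for i = 1:c.NumTestSets
    foldCounts = countcats(g(test(c,i)));   % observations per class in fold i
    disp(foldCounts')
end
```

The counts in each row should be roughly proportional to the class frequencies in g.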

c = cvpartition(n,'HoldOut',p) creates a random partition for holdout validation on n observations. This partition divides the observations into a training set and a test (or holdout) set. The parameter p must be a scalar. When 0 < p < 1, cvpartition randomly selects approximately p*n observations for the test set. When p is an integer, cvpartition randomly selects p observations for the test set. The default value of p is 1/10.
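The two ways of specifying p can be compared side by side. This is a minimal sketch; n = 100 and the two values of p are assumptions picked for illustration.

```matlab
% Sketch: holdout partitions specified by fraction and by count
n = 100;
cFrac  = cvpartition(n,'HoldOut',0.25);  % 0 < p < 1: roughly 25% held out
cCount = cvpartition(n,'HoldOut',20);    % integer p: 20 observations held out

trainIdx = training(cFrac);   % logical index of training observations
testIdx  = test(cFrac);       % logical index of test observations

fprintf('Fraction: %d training, %d test\n', cFrac.TrainSize, cFrac.TestSize);
fprintf('Count:    %d training, %d test\n', cCount.TrainSize, cCount.TestSize);
```

Because a holdout partition has a single test set, training and test are called here without a fold index.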

c = cvpartition(group,'HoldOut',p) randomly partitions observations into a training set and a test set with stratification, using the class information in group; that is, both training and test sets have roughly the same class proportions as in group.
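A quick way to check the stratification is to tabulate the classes in each set. The sketch below again assumes the fisheriris sample data and an illustrative holdout fraction of 0.2.

```matlab
% Sketch: compare class counts in the training and test sets
load fisheriris
g = categorical(species);
c = cvpartition(g,'HoldOut',0.2);

trainCounts = countcats(g(training(c)));   % per-class counts, training set
testCounts  = countcats(g(test(c)));       % per-class counts, test set
disp([trainCounts testCounts])             % rows: classes; columns: training, test
```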

c = cvpartition(n,'LeaveOut') creates a random partition for leave-one-out cross validation on n observations. Leave-one-out is a special case of 'KFold', in which the number of folds equals the number of observations.
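The following sketch, with an assumed n = 8, records which observation is held out in each fold to illustrate that every observation serves as the test set exactly once.

```matlab
% Sketch: leave-one-out partition of 8 observations
n = 8;
c = cvpartition(n,'LeaveOut');

heldOut = zeros(n,1);
for i = 1:c.NumTestSets
    heldOut(i) = find(test(c,i));   % index of the single held-out observation
end
disp(heldOut')
```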

c = cvpartition(n,'resubstitution') creates an object c that does not partition the data. Both the training set and the test set contain all of the original n observations.
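As a small illustration (n = 6 is an arbitrary choice), the training and test index vectors of a resubstitution object can be inspected directly:

```matlab
% Sketch: resubstitution uses every observation for both training and testing
c = cvpartition(6,'resubstitution');

trainIdx = training(c);   % logical index of training observations
testIdx  = test(c);       % logical index of test observations
disp([trainIdx testIdx])  % both columns are expected to be all true
```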

Examples

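Use stratified 10-fold cross validation to compute the misclassification rate:

load fisheriris;
y = species;
% Stratified 10-fold partition based on the class labels in y
c = cvpartition(y,'KFold',10);

% Count the misclassified test observations in each fold
fun = @(xT,yT,xt,yt)(sum(~strcmp(yt,classify(xt,xT,yT))));

% Total misclassifications divided by the total number of test observations
rate = sum(crossval(fun,meas,y,'partition',c))...
           /sum(c.TestSize)

rate =
    0.0200

Because the partition is random, the exact rate can vary from run to run.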

Tall Array Support

This function supports tall arrays for out-of-memory data with some limitations.

  • For tall arrays, only stratified holdout partitions are supported.

  • c = cvpartition(group,'HoldOut',p) randomly partitions observations into a training set and a test set with stratification, using the class information in group. For tall arrays, p must be a scalar such that 0 < p < 1.

  • To obtain nonstratified partitions, use a uniform grouping variable, that is, one that assigns every observation to the same group. For example, assuming X is a tall numeric array and P is a scalar such that 0 < P < 1, you can use

    groups = X(:,1).*0;
    C = cvpartition(groups,'HoldOut',P)

For more information, see Tall Arrays.

Introduced in R2008a
