crossval

Class: ClassificationSVM

Cross-validated support vector machine classifier

Syntax

  • CVSVMModel = crossval(SVMModel)
  • CVSVMModel = crossval(SVMModel,Name,Value)

Description

CVSVMModel = crossval(SVMModel) returns a cross-validated (partitioned) support vector machine classifier (CVSVMModel) from a trained SVM classifier (SVMModel).

By default, crossval uses 10-fold cross validation on the training data to create CVSVMModel.

CVSVMModel = crossval(SVMModel,Name,Value) returns a partitioned SVM classifier with additional options specified by one or more Name,Value pair arguments.

For example, you can specify the number of folds or holdout sample proportion.

Tips

Assess the predictive performance of SVMModel on cross-validated data using the "kfold" methods and properties of CVSVMModel, such as kfoldLoss.
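A typical workflow is sketched below, using the ionosphere sample data set that ships with the toolbox (the standardization choice is illustrative):

```matlab
% Train an SVM classifier, cross-validate it, then assess it
% with the kfold methods of the resulting partitioned model.
load ionosphere                    % sample data: X (predictors), Y (labels)
SVMModel   = fitcsvm(X,Y,'Standardize',true);
CVSVMModel = crossval(SVMModel);   % 10-fold cross validation by default

loss   = kfoldLoss(CVSVMModel);    % cross-validated misclassification rate
labels = kfoldPredict(CVSVMModel); % out-of-fold predicted labels
```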

Input Arguments

SVMModel — Full, trained SVM classifier
ClassificationSVM classifier

Full, trained SVM classifier, specified as a ClassificationSVM model trained using fitcsvm.

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

'CVPartition' — Cross-validation partition
[] (default) | cvpartition partition object

Cross-validation partition, specified as the comma-separated pair consisting of 'CVPartition' and a cvpartition partition object as created by cvpartition. The partition object specifies the type of cross-validation, and also the indexing for training and validation sets.

If you specify CVPartition, then you cannot specify any of Holdout, KFold, or Leaveout.
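For example, you can create a stratified partition with cvpartition and reuse it across models (a sketch; the 5-fold choice is arbitrary):

```matlab
load ionosphere
c = cvpartition(Y,'KFold',5);      % stratified 5-fold partition of the labels
SVMModel   = fitcsvm(X,Y,'Standardize',true);
CVSVMModel = crossval(SVMModel,'CVPartition',c);
```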

'Holdout' — Fraction of data for holdout validation
scalar value in the range (0,1)

Fraction of data used for holdout validation, specified as the comma-separated pair consisting of 'Holdout' and a scalar value in the range (0,1). If you specify 'Holdout',p, then the software:

  1. Randomly reserves p*100% of the data as validation data, and trains the model using the rest of the data

  2. Stores the compact, trained model in CVSVMModel.Trained

If you specify Holdout, then you cannot specify any of CVPartition, KFold, or Leaveout.

Example: 'Holdout',0.1

Data Types: double | single

'KFold' — Number of folds
10 (default) | positive integer value

Number of folds to use in a cross-validated classifier, specified as the comma-separated pair consisting of 'KFold' and a positive integer value. If you specify 'KFold',k, then the software:

  1. Randomly partitions the data into k sets

  2. For each set, reserves the set as validation data, and trains the model using the other k – 1 sets

  3. Stores the k compact, trained models in the cells of a k-by-1 cell vector in CVSVMModel.Trained

If you specify KFold, then you cannot specify any of CVPartition, Holdout, or Leaveout.

Example: 'KFold',8

Data Types: double

'Leaveout' — Leave-one-out cross-validation flag
'off' (default) | 'on'

Leave-one-out cross-validation flag, specified as the comma-separated pair consisting of 'Leaveout' and 'on' or 'off'. If you specify 'Leaveout','on', then, for each of the n observations, where n is size(SVMModel.X,1), the software:

  1. Reserves the observation as validation data, and trains the model using the other n – 1 observations

  2. Stores the n compact, trained models in CVSVMModel.Trained

If you specify Leaveout, then you cannot specify CVPartition, Holdout, or KFold.

Example: 'Leaveout','on'

Data Types: char

Output Arguments

CVSVMModel — Cross-validated SVM classifier
ClassificationPartitionedModel classifier

Cross-validated SVM classifier, returned as a ClassificationPartitionedModel classifier.

Examples

Cross Validate an SVM Classifier Using crossval

Load the ionosphere data set.

load ionosphere
rng(1); % For reproducibility

Train an SVM classifier. It is good practice to standardize the predictors and define the class order.

SVMModel = fitcsvm(X,Y,'Standardize',true,'ClassNames',{'b','g'});

SVMModel is a trained ClassificationSVM classifier. 'b' is the negative class and 'g' is the positive class.

Cross validate the classifier using 10-fold cross validation.

CVSVMModel = crossval(SVMModel)
FirstModel = CVSVMModel.Trained{1}
CVSVMModel = 

  classreg.learning.partition.ClassificationPartitionedModel
    CrossValidatedModel: 'SVM'
         PredictorNames: {1x34 cell}
           ResponseName: 'Y'
        NumObservations: 351
                  KFold: 10
              Partition: [1x1 cvpartition]
             ClassNames: {'b'  'g'}
         ScoreTransform: 'none'



FirstModel = 

  classreg.learning.classif.CompactClassificationSVM
         PredictorNames: {1x34 cell}
           ResponseName: 'Y'
             ClassNames: {'b'  'g'}
         ScoreTransform: 'none'
                  Alpha: [78x1 double]
                   Bias: -0.2210
       KernelParameters: [1x1 struct]
                     Mu: [1x34 double]
                  Sigma: [1x34 double]
         SupportVectors: [78x34 double]
    SupportVectorLabels: [78x1 double]


CVSVMModel is a ClassificationPartitionedModel cross-validated classifier. The software:

  1. Randomly partitions the data into 10 equally sized sets.

  2. Trains an SVM classifier on nine of the sets.

  3. Repeats step 2 a total of k = 10 times, leaving out a different set each time and training on the other nine sets.

  4. Combines generalization statistics for each fold.

FirstModel is the first of the 10 trained classifiers. It is a CompactClassificationSVM classifier.

You can estimate the generalization error by passing CVSVMModel to kfoldLoss.
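For instance, continuing from the model trained above:

```matlab
L = kfoldLoss(CVSVMModel)   % average out-of-fold misclassification rate
```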

Specify a Holdout-Sample Proportion for SVM Cross Validation

By default, crossval uses 10-fold cross validation to cross validate an SVM classifier. You have several other options, such as specifying a different number of folds or holdout sample proportion. This example shows how to specify a holdout-sample proportion.

Load the ionosphere data set.

load ionosphere
rng(1); % For reproducibility

Train an SVM classifier. It is good practice to standardize the predictors and define the class order.

SVMModel = fitcsvm(X,Y,'Standardize',true,'ClassNames',{'b','g'});

SVMModel is a trained ClassificationSVM classifier. 'b' is the negative class and 'g' is the positive class.

Cross validate the classifier by specifying a 15% holdout sample.

CVSVMModel = crossval(SVMModel,'Holdout',0.15)
TrainedModel = CVSVMModel.Trained{1}
CVSVMModel = 

  classreg.learning.partition.ClassificationPartitionedModel
    CrossValidatedModel: 'SVM'
         PredictorNames: {1x34 cell}
           ResponseName: 'Y'
        NumObservations: 351
                  KFold: 1
              Partition: [1x1 cvpartition]
             ClassNames: {'b'  'g'}
         ScoreTransform: 'none'



TrainedModel = 

  classreg.learning.classif.CompactClassificationSVM
         PredictorNames: {1x34 cell}
           ResponseName: 'Y'
             ClassNames: {'b'  'g'}
         ScoreTransform: 'none'
                  Alpha: [74x1 double]
                   Bias: -0.2952
       KernelParameters: [1x1 struct]
                     Mu: [1x34 double]
                  Sigma: [1x34 double]
         SupportVectors: [74x34 double]
    SupportVectorLabels: [74x1 double]


CVSVMModel is a ClassificationPartitionedModel. TrainedModel is a CompactClassificationSVM classifier trained using 85% of the data.

Estimate the generalization error.

kfoldLoss(CVSVMModel)
ans =

    0.0769

The out-of-sample misclassification error is approximately 8%.

Alternatives

Instead of training an SVM classifier and then cross-validating it, you can create a cross-validated classifier directly using fitcsvm by specifying any of these name-value pair arguments: 'CrossVal', 'CVPartition', 'Holdout', 'Leaveout', or 'KFold'.
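For example, the following sketch produces a partitioned classifier in one call (the 8-fold choice is illustrative):

```matlab
% One-step equivalent of fitcsvm followed by crossval(...,'KFold',8)
load ionosphere
CVSVMModel = fitcsvm(X,Y,'Standardize',true,'KFold',8);
```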
