svmtrain - Train support vector machine classifier

Syntax

SVMStruct = svmtrain(Training, Group)

SVMStruct = svmtrain(..., 'Kernel_Function', Kernel_FunctionValue, ...)
SVMStruct = svmtrain(..., 'RBF_Sigma', RBFSigmaValue, ...)
SVMStruct = svmtrain(..., 'Polyorder', PolyorderValue, ...)
SVMStruct = svmtrain(..., 'Mlp_Params', Mlp_ParamsValue, ...)
SVMStruct = svmtrain(..., 'Method', MethodValue, ...)
SVMStruct = svmtrain(..., 'QuadProg_Opts', QuadProg_OptsValue, ...)
SVMStruct = svmtrain(..., 'SMO_Opts', SMO_OptsValue, ...)
SVMStruct = svmtrain(..., 'BoxConstraint', BoxConstraintValue, ...)
SVMStruct = svmtrain(..., 'Autoscale', AutoscaleValue, ...)
SVMStruct = svmtrain(..., 'Showplot', ShowplotValue, ...)

Arguments

Training: Matrix of training data, where each row corresponds to an observation or replicate, and each column corresponds to a feature or variable.
Group: Column vector, character array, or cell array of strings for classifying data in Training into two groups. It has the same number of elements as there are rows in Training. Each element specifies the group to which the corresponding row in Training belongs.
Kernel_FunctionValue: String or function handle specifying the kernel function that maps the training data into kernel space. Choices are:
  • linear — Default. Linear kernel or dot product.

  • quadratic — Quadratic kernel.

  • rbf — Gaussian Radial Basis Function kernel with a default scaling factor, sigma, of 1.

  • polynomial — Polynomial kernel with a default order of 3.

  • mlp — Multilayer Perceptron kernel with default scale and bias parameters of [1, -1].

  • @functionname — Handle to a kernel function, specified using @ followed by the function name (for example, @kfun), or an anonymous function.

RBFSigmaValue: Positive number that specifies the scaling factor, sigma, in the radial basis function kernel. Default is 1.
PolyorderValue: Positive number that specifies the order of a polynomial kernel. Default is 3.
Mlp_ParamsValue: Two-element vector, [p1, p2], that specifies the scale and bias parameters of the multilayer perceptron (mlp) kernel, K = tanh(p1*U*V' + p2). p1 must be > 0, and p2 must be < 0. Default is [1, -1].
MethodValue: String specifying the method to find the separating hyperplane. Choices are:
  • QP — Quadratic Programming (requires the Optimization Toolbox™ software). The classifier is a two-norm, soft-margin support vector machine.

  • SMO — Sequential Minimal Optimization. The classifier is a one-norm, soft-margin support vector machine.

  • LS — Least-Squares.

If you installed the Optimization Toolbox software, the QP method is the default. Otherwise, the SMO method is the default.

QuadProg_OptsValue: An options structure created by the optimset function (Optimization Toolbox software). This structure specifies options used by the QP method. For more information on creating this structure, see the optimset and quadprog reference pages.
SMO_OptsValue: An options structure created by the svmsmoset function. This structure specifies options used by the SMO method. For more information on creating this structure, see the svmsmoset function.
BoxConstraintValue: Box constraints for the soft margin. Choices are:
  • Strictly positive numeric scalar.

  • Array of strictly positive values with the number of elements equal to the number of rows in the Training matrix.

If BoxConstraintValue is a scalar, it is automatically rescaled by N/(2*N1) for the data points of group one and by N/(2*N2) for the data points of group two. N1 is the number of elements in group one, N2 is the number of elements in group two, and N = N1 + N2. This rescaling accounts for unbalanced groups, that is, cases where N1 and N2 have very different values.

If BoxConstraintValue is an array, then each array element is taken as a box constraint for the data point with the same index.

Default is a scalar value of 1.
AutoscaleValue: Controls the shifting and scaling of data points before training. When AutoscaleValue is true, the columns of the input data matrix Training are shifted to zero mean and scaled to unit variance. Default is true.
ShowplotValue: Controls the display of a plot of the grouped data, including the separating line for the classifier, when using two-dimensional data. Choices are true or false (default).

Return Values

SVMStruct: Structure containing information about the trained SVM classifier, including the following fields:
  • SupportVectors

  • Alpha

  • Bias

  • KernelFunction

  • KernelFunctionArgs

  • GroupNames

  • SupportVectorIndices

  • ScaleData

  • FigureHandles

Tip: You can pass SVMStruct as input to the svmclassify function to classify new data.

Description

SVMStruct = svmtrain(Training, Group) trains a support vector machine (SVM) classifier using Training, a matrix of training data taken from two groups, specified by Group. svmtrain treats NaNs or empty strings in Group as missing values and ignores the corresponding rows of Training. Information about the trained SVM classifier is returned in SVMStruct, a structure with the fields listed under Return Values.

SVMStruct = svmtrain(Training, Group, ..., 'PropertyName', PropertyValue, ...) calls svmtrain with optional properties, specified as property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotation marks and is case insensitive. These property name/property value pairs are as follows:


SVMStruct = svmtrain(..., 'Kernel_Function', Kernel_FunctionValue, ...) specifies the kernel function (Kernel_FunctionValue) that maps the training data into kernel space. Kernel_FunctionValue can be a function handle or one of the strings linear (default), quadratic, rbf, polynomial, or mlp, described under Arguments.

A kernel function must be of the following form:

function K = kfun(U, V)

Input arguments U and V are matrices with m and n rows respectively. Return value K is an m-by-n matrix. If kfun is parameterized, you can use anonymous functions to capture the problem-dependent parameters. For example, suppose that your kernel function is:

function K = kfun(U,V,P1,P2)
K = tanh(P1*(U*V')+P2);

You can set values for P1 and P2 and then use an anonymous function as follows:

@(U,V) kfun(U,V,P1,P2)

For more information on the types of functions that can be used as kernel functions, see Cristianini and Shawe-Taylor, 2000.
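
As a minimal sketch of the calling convention described above, the following builds the parameterized tanh kernel as an anonymous function and checks that it returns an m-by-n matrix. The values chosen for P1 and P2 and the random input matrices are illustrative assumptions, not recommended settings. The svmtrain call is left as a comment so that the snippet runs without training data.

```matlab
% Parameterized kernel from the text, written as an anonymous function.
kfun = @(U,V,P1,P2) tanh(P1*(U*V') + P2);

% Illustrative parameter values (assumptions): P1 > 0 and P2 < 0.
P1 = 0.5;
P2 = -1;

% Capture P1 and P2 so the handle has the required two-argument form.
kernel = @(U,V) kfun(U,V,P1,P2);

% U has m = 5 rows and V has n = 4 rows; both have 3 columns (features).
U = rand(5,3);
V = rand(4,3);
K = kernel(U,V);
disp(size(K))   % K is m-by-n, here 5-by-4

% Pass the handle to svmtrain (requires Training and Group):
% SVMStruct = svmtrain(Training, Group, 'Kernel_Function', kernel);
```

Because the anonymous function captures P1 and P2 when it is created, changing those variables afterward does not affect the handle; create a new handle to use different values.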

SVMStruct = svmtrain(..., 'RBF_Sigma', RBFSigmaValue, ...) specifies the scaling factor, sigma, in the radial basis function kernel. RBFSigmaValue must be a positive number. Default is 1.

SVMStruct = svmtrain(..., 'Polyorder', PolyorderValue, ...) specifies the order of a polynomial kernel. PolyorderValue must be a positive number. Default is 3.

SVMStruct = svmtrain(..., 'Mlp_Params', Mlp_ParamsValue, ...) specifies the scale and bias parameters of the multilayer perceptron (mlp) kernel, K = tanh(p1*U*V' + p2), as a two-element vector, [p1, p2]. p1 must be > 0, and p2 must be < 0. Default is [1, -1].

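SVMStruct = svmtrain(..., 'Method', MethodValue, ...) specifies the method to find the separating hyperplane. Choices are QP (Quadratic Programming), SMO (Sequential Minimal Optimization), and LS (Least-Squares), described under Arguments.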

If you installed the Optimization Toolbox software, the QP method is the default. Otherwise, the SMO method is the default.

SVMStruct = svmtrain(..., 'QuadProg_Opts', QuadProg_OptsValue, ...) specifies an options structure created by the optimset function (Optimization Toolbox software). This structure specifies options used by the QP method. For more information on creating this structure, see the optimset and quadprog functions.

SVMStruct = svmtrain(..., 'SMO_Opts', SMO_OptsValue, ...) specifies an options structure created by the svmsmoset function. This structure specifies options used by the SMO method. For more information on creating this structure, see the svmsmoset function.

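SVMStruct = svmtrain(..., 'BoxConstraint', BoxConstraintValue, ...) specifies box constraints for the soft margin. BoxConstraintValue can be either a strictly positive numeric scalar or an array of strictly positive values with the number of elements equal to the number of rows in the Training matrix.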

If BoxConstraintValue is a scalar, it is automatically rescaled by N/(2*N1) for the data points of group one and by N/(2*N2) for the data points of group two. N1 is the number of elements in group one, N2 is the number of elements in group two, and N = N1 + N2. This rescaling accounts for unbalanced groups, that is, cases where N1 and N2 have very different values.

If BoxConstraintValue is an array, then each array element is taken as a box constraint for the data point with the same index.

Default is a scalar value of 1.
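
The rescaling of a scalar box constraint can be worked through with a small calculation. The sketch below assumes that "rescaled by" means the scalar is multiplied by the stated factor; the group sizes are hypothetical values chosen to show an unbalanced case.

```matlab
% Illustrative group sizes (assumptions): an unbalanced 50/100 split.
N1 = 50;
N2 = 100;
N  = N1 + N2;

C = 1;                    % scalar BoxConstraintValue (the default)

C1 = C * N / (2*N1);      % effective box constraint for group one
C2 = C * N / (2*N2);      % effective box constraint for group two

fprintf('C1 = %.2f, C2 = %.2f\n', C1, C2);   % C1 = 1.50, C2 = 0.75
```

Under this reading, the smaller group receives the larger box constraint, and when N1 equals N2 both factors are 1, so the scalar is unchanged.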

SVMStruct = svmtrain(..., 'Autoscale', AutoscaleValue, ...) controls the shifting and scaling of data points before training. When AutoscaleValue is true, the columns of the input data matrix Training are shifted to zero mean and scaled to unit variance. Default is true.
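
To illustrate what shifting to zero mean and scaling to unit variance does to a data matrix, here is a minimal sketch of conventional column standardization. The small Training matrix is made up for illustration, and the code shows the general technique rather than the internal implementation of svmtrain, which this page does not describe.

```matlab
% Small illustrative training matrix (assumption): 4 observations, 2 features.
Training = [1 10; 2 20; 3 30; 4 60];

mu    = mean(Training, 1);       % column means
sigma = std(Training, 0, 1);     % column standard deviations

% Shift each column to zero mean, then scale it to unit variance.
Scaled = bsxfun(@rdivide, bsxfun(@minus, Training, mu), sigma);

disp(mean(Scaled, 1))      % approximately [0 0]
disp(var(Scaled, 0, 1))    % approximately [1 1]
```

After standardization, features measured on very different scales contribute comparably to the kernel computation.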

SVMStruct = svmtrain(..., 'Showplot', ShowplotValue, ...) controls the display of a plot of the grouped data, including the separating line for the classifier, when using two-dimensional data. Choices are true or false (default).

Memory Usage and Out of Memory Error

When you set 'Method' to 'QP' and the svmtrain function operates on a data set containing N elements, it creates an (N+1)-by-(N+1) matrix to find the separating hyperplane. This matrix needs at least 8*(N+1)^2 bytes of contiguous memory. If a contiguous block of this size is not available, the software displays an "out of memory" message.
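
The memory requirement above can be estimated before training. The following sketch evaluates the 8*(N+1)^2 lower bound for a hypothetical data set of 10,000 observations; the value of N is an assumption chosen for illustration.

```matlab
% Illustrative data set size (assumption).
N = 10000;

% Lower bound from the text: 8 bytes per element of an (N+1)-by-(N+1) matrix.
bytesNeeded = 8 * (N + 1)^2;

fprintf('%d bytes (about %.2f GB)\n', bytesNeeded, bytesNeeded / 1e9);
```

For this N the bound is 800,160,008 bytes, roughly 0.8 GB, and it grows quadratically: doubling N roughly quadruples the requirement.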

When you set 'Method' to 'SMO', memory consumption is controlled by the SMO option KernelCacheLimit. For more information on the KernelCacheLimit option, see the svmsmoset function. The SMO algorithm stores only a submatrix of the kernel matrix, limited by the size specified by the KernelCacheLimit option. However, if the number of data points exceeds the size specified by the KernelCacheLimit option, the SMO algorithm slows down because it has to recalculate the kernel matrix elements.
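
As a hedged sketch of the SMO configuration described above, the following sets KernelCacheLimit through svmsmoset and passes the resulting structure to svmtrain. The cache size of 10000 is an arbitrary illustrative value, not a recommendation, and the data preparation mirrors the iris example on this page. Running it requires the Bioinformatics Toolbox.

```matlab
% Two-feature, two-group data from the Fisher iris sample data set.
load fisheriris
data   = meas(:,1:2);
groups = ismember(species,'setosa');

% Illustrative cache size (assumption); see svmsmoset for the default.
smoOpts = svmsmoset('KernelCacheLimit', 10000);

% Train with the SMO method using the options structure.
svmStruct = svmtrain(data, groups, 'Method', 'SMO', 'SMO_Opts', smoOpts);
```

A limit at least as large as the number of training points avoids the recalculation of kernel matrix elements mentioned above, at the cost of more memory.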

When using svmtrain on large data sets, and you run out of memory or the optimization step is very time consuming, try either of the following:

Examples

  1. Load the sample data, which includes Fisher's iris data of four measurements on a sample of 150 irises.

    load fisheriris
    
  2. Create data, a two-column matrix containing sepal length and sepal width measurements for 150 irises.

    data = [meas(:,1), meas(:,2)];
    
  3. From the species vector, create a new column vector, groups, to classify data into two groups: Setosa and non-Setosa.

    groups = ismember(species,'setosa');

  4. Randomly select training and test sets.

    [train, test] = crossvalind('holdOut',groups);
    cp = classperf(groups);

  5. Train an SVM classifier using a linear kernel function and plot the grouped data.

    svmStruct = svmtrain(data(train,:),groups(train),'showplot',true);
    

  6. Add a title to the plot, using the KernelFunction field from the svmStruct structure as the title.

    title(sprintf('Kernel Function: %s',...
                  func2str(svmStruct.KernelFunction)),...
                  'interpreter','none');

  7. Use the svmclassify function to classify the test set.

    classes = svmclassify(svmStruct,data(test,:),'showplot',true);

  8. Evaluate the performance of the classifier.

    classperf(cp,classes,test);
    cp.CorrectRate
    
    ans =
    
        0.9867

  9. Use a one-norm, hard-margin support vector machine classifier by setting the boxconstraint property to a large value.

    figure
    svmStruct = svmtrain(data(train,:),groups(train),...
                         'showplot',true,'boxconstraint',1e6);
    

    classes = svmclassify(svmStruct,data(test,:),'showplot',true);

  10. Evaluate the performance of the classifier.

    classperf(cp,classes,test);
    cp.CorrectRate
    
    ans =
    
        0.9867

References

[1] Kecman, V. (2001). Learning and Soft Computing (Cambridge, MA: MIT Press).

[2] Suykens, J.A.K., Van Gestel, T., De Brabanter, J., De Moor, B., and Vandewalle, J. (2002). Least Squares Support Vector Machines (Singapore: World Scientific).

[3] Schölkopf, B., and Smola, A.J. (2002). Learning with Kernels (Cambridge, MA: MIT Press).

[4] Cristianini, N. and Shawe-Taylor, J. (2000). An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, First Edition (Cambridge: Cambridge University Press). http://www.support-vector.net/

See Also

Bioinformatics Toolbox™ functions: knnclassify, svmclassify, svmsmoset

Statistics Toolbox™ function: classify

Optimization Toolbox function: quadprog

MATLAB® function: optimset

  


© 1984-2008 The MathWorks, Inc.