loss

Class: CompactClassificationSVM

Classification error for support vector machine classifiers

Syntax

  • L = loss(SVMModel,X,Y)
  • L = loss(SVMModel,X,Y,Name,Value)

Description

L = loss(SVMModel,X,Y) returns the classification error (L), a scalar representing how well the trained support vector machine (SVM) classifier SVMModel classifies the predictor data (X) compared to the true class labels (Y).

loss normalizes the class probabilities in Y to the prior class probabilities fitcsvm used for training, stored in the Prior property of SVMModel.

L = loss(SVMModel,X,Y,Name,Value) returns the classification error with additional options specified by one or more Name,Value pair arguments.

Input Arguments

SVMModel — SVM classifier
ClassificationSVM classifier | CompactClassificationSVM classifier

SVM classifier, specified as a ClassificationSVM classifier or CompactClassificationSVM classifier returned by fitcsvm or compact, respectively.

X — Predictor data
numeric matrix

Predictor data, specified as a numeric matrix.

Each row of X corresponds to one observation (also known as an instance or example), and each column corresponds to one variable (also known as a feature). The variables in the columns of X must be the same as the variables used to train SVMModel.

The length of Y and the number of rows of X must be equal.

If you set 'Standardize',true in fitcsvm to train SVMModel, then the software standardizes the columns of X using the corresponding means in SVMModel.Mu and standard deviations in SVMModel.Sigma.

Data Types: double | single

Y — Class labels
categorical array | character array | logical vector | vector of numeric values | cell array of strings

Class labels, specified as a categorical or character array, logical or numeric vector, or cell array of strings. The data type of Y must be the same as the data type of SVMModel.ClassNames.

The length of Y and the number of rows of X must be equal.

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

'LossFun' — Loss function
'classiferror' (default) | 'binodeviance' | 'exponential' | 'hinge' | function handle

Loss function, specified as the comma-separated pair consisting of 'LossFun' and a function handle or string.

  • The following table lists the available loss functions. Specify one using its corresponding string.

    Value             Loss Function
    'binodeviance'    Binomial deviance
    'classiferror'    Classification error
    'exponential'     Exponential loss
    'hinge'           Hinge loss

  • Specify your own function using function handle notation.

    Suppose that n = size(X,1) is the sample size and k = size(SVMModel.ClassNames,1) is the number of classes. Your function must have the signature lossvalue = lossfun(C,S,W,Cost), where:

    • The output argument lossvalue is a scalar.

    • You choose the function name (lossfun).

    • C is an n-by-k logical matrix with rows indicating the class to which the corresponding observation belongs. The column order corresponds to the class order in SVMModel.ClassNames.

      Construct C by setting C(p,q) = 1 if observation p is in class q, for each row. Set all other elements of row p to 0.

    • S is an n-by-k numeric matrix of classification scores, similar to the output of predict. The column order corresponds to the class order in SVMModel.ClassNames.

    • W is an n-by-1 numeric vector of observation weights. If you pass W, the software normalizes them to sum to 1.

    • Cost is a k-by-k numeric matrix of misclassification costs. For example, Cost = ones(k) - eye(k) specifies a cost of 0 for correct classification and 1 for misclassification.

    Specify your function using 'LossFun',@lossfun.
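
    For illustration, the following is a minimal sketch of a custom loss function that reproduces the weighted classification error. The function name lossfun is a placeholder; save it in a file on the MATLAB path.

    function lossvalue = lossfun(C,S,W,Cost)
    % Weighted classification error. Cost is accepted to match the required
    % signature, but this particular loss does not use it.
    [~,predClass] = max(S,[],2);    % index of the highest-scoring class
    [~,trueClass] = max(C,[],2);    % index of the true class in each row of C
    isWrong = predClass ~= trueClass;
    lossvalue = sum(W(isWrong));    % W is prenormalized to sum to 1
    end

    Then pass it to loss, for example, L = loss(SVMModel,X,Y,'LossFun',@lossfun).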

'Weights' — Observation weightsones(size(X,1),1) (default) | numeric vector

Observation weights, specified as the comma-separated pair consisting of 'Weights' and a numeric vector.

The length of Weights must equal the number of rows of X. The software weights the observations in each row of X with the corresponding element of Weights.

If you do not specify your own loss function, then the software normalizes the weights in each class to sum to the prior probability of that class.

Data Types: double
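
For example, assuming the CompactSVMModel, XTest, and YTest variables created in the Examples section below, you can down-weight part of the test sample (the weight values here are purely illustrative):

w = ones(size(XTest,1),1);     % start from the default equal weights
w(1:floor(end/2)) = 0.5;       % hypothetical: halve the weight of the first half
L = loss(CompactSVMModel,XTest,YTest,'Weights',w);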

Output Arguments

L — Classification loss
scalar

Classification loss, returned as a scalar. L is a generalization or resubstitution quality measure. Its interpretation depends on the loss function and weighting scheme, but, in general, better classifiers yield smaller loss values.

Definitions

Binomial Deviance

The binomial deviance is a binary classification error measure that has the form

L = \frac{\sum_{j=1}^{n} w_j \log\left[1 + \exp\left(-2 y_j f(X_j)\right)\right]}{\sum_{j=1}^{n} w_j},

where:

  • wj is the weight for observation j. The software renormalizes the weights to sum to 1.

  • yj ∈ {-1,1}.

  • f(Xj) is the score for observation j.

The binomial deviance has connections to the maximization of the binomial likelihood function. For details on binomial deviance, see [1].
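
As a check on this definition, you can compute the equally weighted binomial deviance directly from the scores that predict returns. This sketch reuses CompactSVMModel, XTest, and YTest from the Examples section; it assumes the second class in ClassNames ('g' for the ionosphere data) is the positive class, whose scores appear in the second column of the score output.

[~,score] = predict(CompactSVMModel,XTest);              % column 2 holds f(x) for the second class
y = 2*strcmp(YTest,CompactSVMModel.ClassNames{2}) - 1;   % encode labels as -1 and +1
L = mean(log(1 + exp(-2*y.*score(:,2))));                % equally weighted binomial deviance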

Classification Error

The classification error is a binary classification error measure that has the form

L = \frac{\sum_{j=1}^{n} w_j e_j}{\sum_{j=1}^{n} w_j},

where:

  • wj is the weight for observation j. The software renormalizes the weights to sum to 1.

  • ej = 1 if the predicted class of observation j differs from its true class, and 0 otherwise.

In other words, it is the proportion of observations that the classifier misclassifies.
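
With equal weights this is just the misclassification rate, which you can verify directly (again using the variables from the Examples section):

pred = predict(CompactSVMModel,XTest);   % predicted class labels
L = mean(~strcmp(pred,YTest));           % proportion misclassified; matches the default 'classiferror'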

Exponential Loss

Exponential loss is a binary classification error measure that is similar to binomial deviance and has the form

L = \frac{\sum_{j=1}^{n} w_j \exp\left(-y_j f(X_j)\right)}{\sum_{j=1}^{n} w_j},

where:

  • wj is the weight for observation j. The software renormalizes the weights to sum to 1.

  • yj ∈ {-1,1}.

  • f(Xj) is the score for observation j.
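
Reusing the y and score variables from the binomial deviance sketch above, the equally weighted exponential loss is:

L = mean(exp(-y.*score(:,2)));   % equally weighted exponential loss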

Hinge Loss

Hinge loss is a binary classification error measure that has the form

L = \frac{\sum_{j=1}^{n} w_j \max\left\{0, 1 - y_j f(X_j)\right\}}{\sum_{j=1}^{n} w_j},

where:

  • wj is the weight for observation j. The software renormalizes the weights to sum to 1.

  • yj ∈ {-1,1}.

  • f(Xj) is the score for observation j.

Hinge loss linearly penalizes misclassified observations and is related to the SVM objective function used for optimization. For more details on hinge loss, see [1].
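
Again reusing y and score from the binomial deviance sketch, the equally weighted hinge loss is:

L = mean(max(0, 1 - y.*score(:,2)));   % equally weighted hinge loss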

Score

The SVM score for classifying observation x is the signed distance from x to the decision boundary, ranging from -∞ to +∞. A positive score for a class indicates that x is predicted to be in that class; a negative score indicates otherwise.

The score is also the numerical predicted response for x, f(x), computed by the trained SVM classification function

f(x) = \sum_{j=1}^{n} \alpha_j y_j G(x_j, x) + b,

where (α1,...,αn,b) are the estimated SVM parameters, G(xj,x) is the dot product in the predictor space between x and the support vectors, and the sum includes the training set observations.
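
For a linear kernel (the fitcsvm default), this sum collapses to f(x) = x'β + b, so you can reproduce the scores from the Beta and Bias properties. A minimal sketch, assuming SVMModel was trained with a linear kernel and 'Standardize',true:

Xs = bsxfun(@rdivide,bsxfun(@minus,XTest,SVMModel.Mu),SVMModel.Sigma);  % standardize as in training
f = Xs*SVMModel.Beta + SVMModel.Bias;   % signed distances to the decision boundary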

Examples

Determine the Test Sample Classification Error of SVM Classifiers

Load the ionosphere data set.

load ionosphere
rng(1); % For reproducibility

Train an SVM classifier. Specify a 15% holdout sample for testing. It is good practice to specify the class order and standardize the data.

CVSVMModel = fitcsvm(X,Y,'Holdout',0.15,'ClassNames',{'b','g'},...
    'Standardize',true);
CompactSVMModel = CVSVMModel.Trained{1}; % Extract the trained, compact classifier
testInds = test(CVSVMModel.Partition);   % Extract the test indices
XTest = X(testInds,:);
YTest = Y(testInds,:);

CVSVMModel is a ClassificationPartitionedModel classifier. It contains the property Trained, which is a 1-by-1 cell array holding a CompactClassificationSVM classifier that the software trained using the training set.

Determine how well the algorithm generalizes by estimating the test sample classification error.

L = loss(CompactSVMModel,XTest,YTest)
L =

    0.0787

The SVM classifier misclassifies approximately 8% of the test sample radar returns.

Determine the Test Sample Hinge Loss of SVM Classifiers

Load the ionosphere data set.

load ionosphere
rng(1); % For reproducibility

Train an SVM classifier. Specify a 15% holdout sample for testing. It is good practice to specify the class order and standardize the data.

CVSVMModel = fitcsvm(X,Y,'Holdout',0.15,'ClassNames',{'b','g'},...
    'Standardize',true);
CompactSVMModel = CVSVMModel.Trained{1}; % Extract the trained, compact classifier
testInds = test(CVSVMModel.Partition);   % Extract the test indices
XTest = X(testInds,:);
YTest = Y(testInds,:);

CVSVMModel is a ClassificationPartitionedModel classifier. It contains the property Trained, which is a 1-by-1 cell array holding a CompactClassificationSVM classifier that the software trained using the training set.

Determine how well the algorithm generalizes by estimating the test sample hinge loss.

L = loss(CompactSVMModel,XTest,YTest,'LossFun','hinge')
L =

    0.2998

The hinge loss is approximately 0.3. Classifiers with hinge losses close to 0 are desirable.

References

[1] Hastie, T., R. Tibshirani, and J. Friedman. The Elements of Statistical Learning, second edition. Springer, New York, 2008.
