
kfoldLoss

Classification loss for observations not used for training

Syntax

L = kfoldLoss(obj)
L = kfoldLoss(obj,Name,Value)

Description

L = kfoldLoss(obj) returns the loss obtained by the cross-validated classification model obj. For every fold, this method computes the classification loss for that fold's held-out (in-fold) observations, using a model trained on the remaining (out-of-fold) observations.
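The per-fold computation described above can be reproduced by hand for one fold. This is a minimal sketch, assuming the Statistics Toolbox and its ionosphere sample data are available, and assuming the Trained and Partition properties of the cross-validated model hold the per-fold models and the cvpartition object. The variable names k, testIdx, Yhat, and errFold are illustrative only.

```matlab
load ionosphere                        % loads predictors X and labels Y
rng(1)                                 % fix the random fold assignment
cvtree = crossval(fitctree(X,Y));      % 10-fold cross-validation by default

k = 1;                                 % fold to inspect (illustrative choice)
testIdx = test(cvtree.Partition,k);    % logical index of fold k's held-out rows
Yhat = predict(cvtree.Trained{k},X(testIdx,:));   % model k never saw these rows

% Fraction of held-out observations that model k misclassifies
errFold = mean(~strcmp(Yhat,Y(testIdx)));

% Per-fold losses reported by kfoldLoss, for comparison with errFold
Lfolds = kfoldLoss(cvtree,'mode','individual','lossfun','classiferror');
```

With uniform observation weights, errFold would be expected to agree closely with Lfolds(k), although that equality is not guaranteed by this page.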

L = kfoldLoss(obj,Name,Value) calculates loss with additional options specified by one or more Name,Value pair arguments. You can specify several name-value pair arguments in any order as Name1,Value1,…,NameN,ValueN.

Input Arguments

 obj Object of class ClassificationPartitionedModel.

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

 'folds' Indices of folds ranging from 1 to obj.KFold. Use only these folds for predictions.

Default: 1:obj.KFold

 'lossfun' Function handle or string representing a loss function. Built-in loss functions:

• 'binodeviance' — See Loss Functions.

• 'classiferror' — Fraction of misclassified observations. See Loss Functions.

• 'exponential' — See Loss Functions.

• 'hinge' — See Loss Functions.

• 'mincost' — Smallest misclassification cost as given by the obj.Cost matrix. See Loss Functions.

You can write your own loss function in the syntax described in Loss Functions.

Default: 'mincost'

 'mode' A string for determining the output of kfoldLoss:

• 'average' — L is a scalar, the loss averaged over all folds.

• 'individual' — L is a vector of length obj.KFold, where each entry is the loss for a fold.

Default: 'average'
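The name-value pairs can be combined in one call. The following is a short sketch, assuming the ionosphere sample data is available; the output variable names are illustrative only.

```matlab
load ionosphere                        % loads predictors X and labels Y
cvtree = crossval(fitctree(X,Y));      % cross-validated classification tree

% Default call: one scalar, the loss averaged over all folds
Lavg = kfoldLoss(cvtree);

% One loss value per fold instead of the average
Lfolds = kfoldLoss(cvtree,'mode','individual');

% Misclassification rate computed from folds 1 through 3 only
Lsub = kfoldLoss(cvtree,'folds',1:3,'lossfun','classiferror');
```

Inspecting Lfolds is a quick way to see how much the loss varies from fold to fold before relying on the averaged value.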

Output Arguments

 L Loss, by default the fraction of misclassified data. L is a scalar when 'mode' is 'average' and a vector of length obj.KFold when 'mode' is 'individual'. The meaning of each value depends on the 'lossfun' setting.

Definitions

Classification Error

The default classification error is the fraction of the data X that obj misclassifies, where Y are the true classifications.

Weighted classification error is the sum of weight i times the Boolean value that is 1 when obj misclassifies the ith row of X, divided by the sum of the weights.
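As a small worked illustration of both definitions, the following sketch uses made-up labels and weights (all values are hypothetical) and plain array arithmetic, with no toolbox functions.

```matlab
% Hypothetical true labels, predicted labels, and observation weights
Ytrue = [1 1 2 2 2]';
Ypred = [1 2 2 2 1]';
w     = [1 1 2 1 1]';

miss = (Ypred ~= Ytrue);               % 1 where the prediction is wrong: [0 1 0 0 1]'

errUnweighted = mean(miss);            % 2 of 5 wrong: 0.4
errWeighted   = sum(w .* miss)/sum(w); % (1 + 1)/6, about 0.3333
```

The weighted error is lower here because the observation with the largest weight is classified correctly.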

Loss Functions

The built-in loss functions are:

• 'binodeviance' — For binary classification, assume the classes yn are -1 and 1. With weight vector w normalized to have sum 1, and predictions of row n of data X as f(Xn), the binomial deviance is

$\sum {w}_{n}\mathrm{log}\left(1+\mathrm{exp}\left(-2{y}_{n}f\left({X}_{n}\right)\right)\right).$

• 'exponential' — With the same definitions as for 'binodeviance', the exponential loss is

$\sum {w}_{n}\mathrm{exp}\left(-{y}_{n}f\left({X}_{n}\right)\right).$

• 'classiferror' — Predict the label with the largest posterior probability. The loss is then the fraction of misclassified observations.

• 'hinge' — Classification error measure that has the form

$L=\frac{\sum _{j=1}^{n}{w}_{j}\mathrm{max}\left\{0,1-{y}_{j}\prime f\left({X}_{j}\right)\right\}}{\sum _{j=1}^{n}{w}_{j}},$

where:

• wj is weight j.

• For binary classification, yj = 1 for the positive class and -1 for the negative class. For problems where the number of classes K ≥ 3, yj is a vector of 0s, but with a 1 in the position corresponding to the true class, e.g., if the second observation is in the third class and K = 4, then y2 = [0 0 1 0]′.

• $f\left({X}_{j}\right)$ is, for binary classification, the posterior probability or, for K ≥ 3, a vector of posterior probabilities for each class given observation j.

• 'mincost' — Predict the label with the smallest expected misclassification cost, with expectation taken over the posterior probability, and cost as given by the Cost property of the classifier (a matrix). The loss is then the true misclassification cost averaged over the observations.
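The binary-case formulas above can be evaluated directly on small arrays. This is only a sketch: the labels y and scores f are hypothetical values, and f is treated here as a generic real-valued prediction, which is an assumption made for illustration.

```matlab
% Hypothetical binary labels in {-1, 1} and predictions f(X_n)
y = [ 1  -1    1   -1 ]';
f = [ 0.8 -0.3 -0.2 0.5]';
w = ones(4,1)/4;                       % weights normalized to sum to 1

% Binomial deviance: sum of w_n * log(1 + exp(-2*y_n*f(X_n)))
binodeviance = sum(w .* log(1 + exp(-2*y.*f)));

% Exponential loss: sum of w_n * exp(-y_n*f(X_n))
exponential = sum(w .* exp(-y.*f));

% Hinge loss, binary case: weighted mean of max(0, 1 - y_j*f(X_j))
hinge = sum(w .* max(0, 1 - y.*f))/sum(w);   % (0.2 + 0.7 + 1.2 + 1.5)/4 = 0.9
```

All three losses penalize the last two observations most heavily, because their predictions have the wrong sign relative to the label.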

To write your own loss function, create a function file in this form:

`function loss = lossfun(C,S,W,COST)`

• N is the number of rows of X.

• K is the number of classes in the classifier, represented in the ClassNames property.

• C is an N-by-K logical matrix, with one true per row for the true class. The index for each class is its position in the ClassNames property.

• S is an N-by-K numeric matrix. S is a matrix of posterior probabilities for classes with one row per observation, similar to the posterior output from predict.

• W is a numeric vector with N elements, the observation weights. If you pass W, the elements are normalized to sum to the prior probabilities in the respective classes.

• COST is a K-by-K numeric matrix of misclassification costs. For example, you can use COST = ones(K) - eye(K), which means a cost of 0 for correct classification, and 1 for misclassification.

• The output loss should be a scalar.

Pass the function handle @lossfun as the value of the LossFun name-value pair.
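As an illustration of this calling convention, here is one possible custom loss saved in a file named lossfun.m. It is a sketch rather than a built-in: it computes a weighted misclassification rate, counting an observation as wrong when its highest-scoring class is not the true class, and it deliberately ignores COST.

```matlab
function loss = lossfun(C,S,W,COST) %#ok<INUSD>
% Weighted misclassification rate (illustrative custom loss).
%   C    N-by-K logical matrix, one true per row marking the true class
%   S    N-by-K numeric matrix of class scores or posterior probabilities
%   W    numeric vector of N observation weights
%   COST K-by-K cost matrix (unused in this sketch)

[~,trueIdx] = max(double(C),[],2);     % column index of the true class
[~,predIdx] = max(S,[],2);             % column index of the highest score
W = W(:);                              % force a column vector

% Sum of the weights of misclassified rows, divided by the total weight
loss = sum(W .* (predIdx ~= trueIdx))/sum(W);
end
```

With a cross-validated model such as cvtree from the example on this page, the call would then be L = kfoldLoss(cvtree,'lossfun',@lossfun).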

Examples

Find the average cross-validated classification error for a model of the ionosphere data:

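```
load ionosphere            % loads predictors X and class labels Y
tree = fitctree(X,Y);      % train a classification tree on all the data
cvtree = crossval(tree);   % cross-validate the tree (10 folds by default)
L = kfoldLoss(cvtree)      % average loss over the folds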

L =
0.1197
```