Classification loss for Gaussian kernel classification model
L = loss(Mdl,X,Y) returns the classification loss for the binary Gaussian kernel classification model Mdl using the predictor data in X and the corresponding class labels in Y.
Load the ionosphere data set. This data set has 34 predictors and 351 binary responses for radar returns, either bad ('b') or good ('g').
load ionosphere
Partition the data set into training and test sets. Specify a 15% holdout sample for the test set.
rng('default') % For reproducibility
Partition = cvpartition(Y,'Holdout',0.15);
trainingInds = training(Partition); % Indices for the training set
testInds = test(Partition); % Indices for the test set
Train a binary kernel classification model using the training set.
Mdl = fitckernel(X(trainingInds,:),Y(trainingInds));
Estimate the training-set classification error and the test-set classification error.
ceTrain = loss(Mdl,X(trainingInds,:),Y(trainingInds))
ceTrain = 0.0067
ceTest = loss(Mdl,X(testInds,:),Y(testInds))
ceTest = 0.1140
Load the ionosphere data set. This data set has 34 predictors and 351 binary responses for radar returns, either bad ('b') or good ('g').
load ionosphere
Partition the data set into training and test sets. Specify a 15% holdout sample for the test set.
rng('default') % For reproducibility
Partition = cvpartition(Y,'Holdout',0.15);
trainingInds = training(Partition); % Indices for the training set
testInds = test(Partition); % Indices for the test set
Train a binary kernel classification model using the training set.
Mdl = fitckernel(X(trainingInds,:),Y(trainingInds));
Create an anonymous function that measures linear loss, that is,
$$L=\frac{\sum _{j}-{w}_{j}{y}_{j}{f}_{j}}{\sum _{j}{w}_{j}}.$$
$${w}_{j}$$ is the weight for observation j, $${y}_{j}$$ is response j (-1 for the negative class, and 1 otherwise), and $${f}_{j}$$ is the raw classification score of observation j.
linearloss = @(C,S,W,Cost)sum(-W.*sum(S.*C,2))/sum(W);
Custom loss functions must be written in a particular form. For rules on writing a custom loss function, see the 'LossFun' name-value pair argument.
Estimate the training-set classification loss and the test-set classification loss using the linear loss function.
ceTrain = loss(Mdl,X(trainingInds,:),Y(trainingInds),'LossFun',linearloss)
ceTrain = -1.0851
ceTest = loss(Mdl,X(testInds,:),Y(testInds),'LossFun',linearloss)
ceTest = -0.7821
Mdl — Binary kernel classification model
ClassificationKernel model object
Binary kernel classification model, specified as a ClassificationKernel model object. You can create a ClassificationKernel model object using fitckernel.
Y — Class labels
Class labels, specified as a categorical, character, or string array, logical or numeric vector, or cell array of character vectors.
The data type of Y must be the same as the data type of Mdl.ClassNames. (The software treats string arrays as cell arrays of character vectors.)
The distinct classes in Y must be a subset of Mdl.ClassNames.
If Y is a character array, then each element must correspond to one row of the array.
The length of Y and the number of observations in X must be equal.
Data Types: categorical | char | string | logical | single | double | cell
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Example: L = loss(Mdl,X,Y,'LossFun','quadratic','Weights',weights) returns the weighted classification loss using the quadratic loss function.
'LossFun' — Loss function
'classiferror' (default) | 'binodeviance' | 'exponential' | 'hinge' | 'logit' | 'mincost' | 'quadratic' | function handle
Loss function, specified as the comma-separated pair consisting of 'LossFun' and a built-in loss function name or a function handle.
This table lists the available loss functions. Specify one using its corresponding value.
Value | Description |
---|---|
'binodeviance' | Binomial deviance |
'classiferror' | Classification error |
'exponential' | Exponential |
'hinge' | Hinge |
'logit' | Logistic |
'mincost' | Minimal expected misclassification cost (for classification scores that are posterior probabilities) |
'quadratic' | Quadratic |
'mincost' is appropriate for classification scores that are posterior probabilities. For kernel classification models, logistic regression learners return posterior probabilities as classification scores by default, but SVM learners do not (see predict).
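As a hedged illustration of when 'mincost' applies, the sketch below trains a kernel model with a logistic regression learner, so that the scores are posterior probabilities, and then requests the minimal cost loss. It assumes the ionosphere data set and the 'Learner' option of fitckernel; the variable names MdlLogit and ceMin are illustrative, and the loss is computed on the training data only to keep the example short.

```matlab
load ionosphere                  % provides the predictor matrix X and the labels Y
rng('default')                   % for reproducibility of the random feature expansion

% Assumption: a logistic regression learner, so that classification scores
% are posterior probabilities and 'mincost' is an appropriate choice.
MdlLogit = fitckernel(X,Y,'Learner','logistic');

% Resubstitution (training-data) minimal expected misclassification cost.
ceMin = loss(MdlLogit,X,Y,'LossFun','mincost');
```

Because the feature expansion is random, the value of ceMin can differ between runs unless the random number generator is seeded as shown.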
Specify your own function by using function handle notation.
Let n be the number of observations in X and K be the number of distinct classes (numel(Mdl.ClassNames), where Mdl is the input model). Your function must have this signature:
lossvalue = lossfun(C,S,W,Cost)
The output argument lossvalue is a scalar.
You choose the function name (lossfun).
C is an n-by-K logical matrix with rows indicating the class to which the corresponding observation belongs. The column order corresponds to the class order in Mdl.ClassNames. Construct C by setting C(p,q) = 1, if observation p is in class q, for each row. Set all other elements of row p to 0.
S is an n-by-K numeric matrix of classification scores, similar to the output of predict. The column order corresponds to the class order in Mdl.ClassNames.
W is an n-by-1 numeric vector of observation weights. If you pass W, the software normalizes the weights to sum to 1.
Cost is a K-by-K numeric matrix of misclassification costs. For example, Cost = ones(K) - eye(K) specifies a cost of 0 for correct classification, and 1 for misclassification.
Example: 'LossFun',@lossfun
Data Types: char | string | function_handle
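To make the required signature concrete, the following sketch defines a hinge-style custom loss and evaluates it on small hand-made inputs rather than on a trained model. The handle name hingeLoss and all values of C, S, W, and Cost are hypothetical toy data; the sketch assumes that column 1 of S holds the negative-class score and column 2 the positive-class score.

```matlab
% Toy inputs in the form that loss passes to a custom loss function:
% n = 4 observations, K = 2 classes.
C = logical([1 0; 0 1; 0 1; 1 0]);   % true class membership, one true entry per row
S = [ 0.8 -0.8;                       % classification scores, one column per class
     -1.5  1.5;
      0.3 -0.3;                       % observation 3 belongs to class 2 but scores higher for class 1
      2.0 -2.0];
W = [0.25; 0.25; 0.25; 0.25];         % observation weights that already sum to 1
Cost = ones(2) - eye(2);              % 0 for correct classification, 1 otherwise

% Hinge-style loss: sum(S.*C,2) picks out the score of the true class.
% Cost is accepted to satisfy the signature but is not used in this sketch.
hingeLoss = @(C,S,W,Cost) sum(W .* max(0, 1 - sum(S.*C,2))) / sum(W);

lossvalue = hingeLoss(C,S,W,Cost);    % scalar, 0.375 for these toy inputs
```

With a trained model, the same handle would be passed as loss(Mdl,X,Y,'LossFun',hingeLoss).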
'Weights' — Observation weights
Observation weights, specified as the comma-separated pair consisting of 'Weights' and a positive numeric vector of length n, where n is the number of observations in X. If you supply weights, loss computes the weighted classification loss.
The default value is ones(n,1).
loss normalizes weights to sum up to the value of the prior probability in the respective class.
Data Types: double | single
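The normalization described above can be sketched in base MATLAB. This is an illustration of the stated rule, not the software's internal code; the labels, raw weights, and prior probabilities below are hypothetical.

```matlab
% Hypothetical labels, raw observation weights, and class priors.
y       = [-1; -1; 1; 1; 1];
w       = [ 1;  3; 2; 2; 4];
classes = [-1 1];
prior   = [0.4 0.6];               % assumed priors for classes -1 and 1 (sum to 1)

wNorm = zeros(size(w));
for k = 1:numel(classes)
    inClass = (y == classes(k));
    % Rescale the weights in class k so that they sum to prior(k).
    wNorm(inClass) = w(inClass) * prior(k) / sum(w(inClass));
end
```

After this step the weights within each class sum to that class's prior probability, so the weights over all observations sum to 1.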
L — Classification loss
Classification loss, returned as a numeric scalar. The interpretation of L depends on Weights and LossFun.
Classification loss functions measure the predictive inaccuracy of classification models. When you compare the same type of loss among many models, a lower loss indicates a better predictive model.
Suppose the following:
L is the weighted average classification loss.
n is the sample size.
y_{j} is the observed class label. The software codes it as –1 or 1, indicating the negative or positive class, respectively.
f(X_{j}) is the raw classification score for the transformed observation (row) j of the predictor data X using feature expansion.
m_{j} = y_{j}f(X_{j}) is the classification score for classifying observation j into the class corresponding to y_{j}. Positive values of m_{j} indicate correct classification and do not contribute much to the average loss. Negative values of m_{j} indicate incorrect classification and contribute to the average loss.
The weight for observation j is w_{j}. The software normalizes the observation weights so that they sum to the corresponding prior class probability. The software also normalizes the prior probabilities so that they sum to 1. Therefore,
$$\sum _{j=1}^{n}{w}_{j}=1.$$
This table describes the supported loss functions that you can specify by using the 'LossFun' name-value pair argument.
Loss Function | Value of LossFun | Equation |
---|---|---|
Binomial deviance | 'binodeviance' | $$L={\displaystyle \sum _{j=1}^{n}{w}_{j}\mathrm{log}\left\{1+\mathrm{exp}\left[-2{m}_{j}\right]\right\}}.$$ |
Exponential loss | 'exponential' | $$L={\displaystyle \sum _{j=1}^{n}{w}_{j}\mathrm{exp}\left(-{m}_{j}\right)}.$$ |
Classification error | 'classiferror' | $$L={\displaystyle \sum _{j=1}^{n}{w}_{j}}I\left\{{\widehat{y}}_{j}\ne {y}_{j}\right\}.$$ The classification error is the weighted fraction of misclassified observations where $${\widehat{y}}_{j}$$ is the class label corresponding to the class with the maximal posterior probability. I{x} is the indicator function. |
Hinge loss | 'hinge' | $$L={\displaystyle \sum _{j=1}^{n}{w}_{j}\mathrm{max}\left\{0,1-{m}_{j}\right\}}.$$ |
Logit loss | 'logit' | $$L={\displaystyle \sum _{j=1}^{n}{w}_{j}\mathrm{log}\left(1+\mathrm{exp}\left(-{m}_{j}\right)\right)}.$$ |
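Minimal cost | 'mincost' | The software computes the weighted minimal cost over observations j = 1,...,n: for each observation, it predicts the class label with the minimum expected misclassification cost, and c_{j} is the cost incurred for that prediction. The weighted, average, minimum cost loss is $$L={\displaystyle \sum _{j=1}^{n}{w}_{j}{c}_{j}}.$$ |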
Quadratic loss | 'quadratic' | $$L={\displaystyle \sum _{j=1}^{n}{w}_{j}{\left(1-{m}_{j}\right)}^{2}}.$$ |
This figure compares the loss functions (except minimal cost) for one observation over m. Some functions are normalized to pass through [0,1].
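The equations in the table can be evaluated directly from the margins m_{j} and the normalized weights w_{j}. The following sketch does so in base MATLAB for a few hypothetical margins and weights; the struct name lossValues is illustrative, and the classification error line assumes that an observation is misclassified exactly when its margin is negative.

```matlab
% Hypothetical margins m_j = y_j*f(X_j) and normalized weights w_j (sum to 1).
m = [2.0; 0.5; -0.3; 1.2];
w = [0.4; 0.1; 0.2; 0.3];

lossValues.binodeviance = sum(w .* log(1 + exp(-2*m)));
lossValues.exponential  = sum(w .* exp(-m));
lossValues.classiferror = sum(w .* (m < 0));        % assumes negative margin = misclassified
lossValues.hinge        = sum(w .* max(0, 1 - m));
lossValues.logit        = sum(w .* log(1 + exp(-m)));
lossValues.quadratic    = sum(w .* (1 - m).^2);
```

Only the third observation has a negative margin, so the classification error equals its weight, 0.2. The quadratic loss also penalizes the first observation, whose margin of 2 is correct but far from 1.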
This function fully supports tall arrays. For more information, see Tall Arrays (MATLAB).