L = loss(obj,X,Y)
L = loss(obj,X,Y,Name,Value)
When computing the loss, loss normalizes the class probabilities in Y to the class probabilities used for training, stored in the Prior property of obj.
X is a matrix where each row represents an observation, and each column represents a predictor. The number of columns in X must equal the number of predictors in obj.
Y contains the class labels, with the same data type as the class labels in obj.
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
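'LossFun' is a function handle or string representing a loss function. Built-in loss functions: 'binodeviance', 'classiferror', 'exponential', 'hinge', 'mincost'.
You can write your own loss function using the syntax described in Loss Functions.
'Weights' is a numeric vector of length N, where N is the number of rows of X.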
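As a minimal sketch of the name-value syntax, the call below combines a built-in loss function with explicit observation weights. It assumes the Statistics and Machine Learning Toolbox and its fisheriris sample data are available; the uniform weight vector w is a hypothetical choice made only for illustration.

```matlab
% Train a discriminant classifier on the Fisher iris sample data.
load fisheriris
obj = fitcdiscr(meas,species);

% Hypothetical weights: one nonnegative weight per row of X (uniform here).
w = ones(size(meas,1),1);

% Name-value pairs can be given in any order.
L = loss(obj,meas,species,'LossFun','classiferror','Weights',w);
```

With uniform weights the result should match the unweighted call with the same loss function; non-uniform weights change how much each observation contributes.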
L is the classification error, a scalar. The meaning of the error depends on the values of the weights and the loss function.
The default classification error is the fraction of the data X that obj misclassifies, where Y represents the true classifications.
Weighted classification error is the sum of weight i times the Boolean value that is 1 when obj misclassifies the ith row of X, divided by the sum of the weights.
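The weighted classification error can be sketched by hand as follows. The labels, predictions, and weights are hypothetical values chosen for illustration, and the sketch covers only the definition above, not any weight normalization that loss applies internally.

```matlab
% Hypothetical true labels, predicted labels, and observation weights.
yTrue = [1; 2; 2; 1];
yPred = [1; 2; 1; 1];
w     = [1; 2; 1; 4];

% Boolean indicator: 1 where the prediction differs from the true label.
isMisclassified = (yPred ~= yTrue);

% Weighted classification error: weighted misses divided by total weight.
weightedErr = sum(w .* isMisclassified) / sum(w);

% With all weights equal, this reduces to the fraction misclassified.
unweightedErr = mean(isMisclassified);
```

Here only the third observation is misclassified, so the weighted error is its weight (1) over the total weight (8), while the unweighted error is 1 out of 4.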
The built-in loss functions are:
'binodeviance' — For binary classification, assume the classes yn are -1 and 1. With weight vector w normalized to have sum 1, and predictions of row n of data X as f(Xn), the binomial deviance is
$$\sum_n w_n \log\left(1 + \exp\left(-2 y_n f(X_n)\right)\right).$$
'exponential' — With the same definitions as for 'binodeviance', the exponential loss is
$$\sum_n w_n \exp\left(-y_n f(X_n)\right).$$
'classiferror' — Predict
the label with the largest posterior probability. The loss is then
the fraction of misclassified observations.
'hinge' — Classification error measure that has the form
$$L = \frac{\sum_j w_j \max\{0,\; 1 - y_j' f(X_j)\}}{\sum_j w_j},$$
where:
wj is weight j.
For binary classification, yj = 1 for the positive class and -1 for the negative class. For problems where the number of classes K ≥ 3, yj is a vector of 0s, but with a 1 in the position corresponding to the true class. For example, if the second observation is in the third class and K = 4, then y2 = [0 0 1 0]′.
f(Xj) is, for binary classification, the posterior probability or, for K ≥ 3, a vector of posterior probabilities for each class given observation j.
'mincost' — Predict the label with the smallest expected misclassification cost, with expectation taken over the posterior probability, and cost as given by the Cost property of the classifier (a matrix). The loss is then the true misclassification cost averaged over the observations.
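The built-in losses can be reproduced by hand on small arrays, which may help when checking a custom loss against them. The sketch below is illustrative only: all data values are hypothetical, the binary losses assume the usual forms (weighted sums of log(1 + exp(-2*y*f)), exp(-y*f), and max(0, 1 - y*f)), and the cost matrix is assumed to follow the convention that COST(i,j) is the cost of predicting class j when the true class is i.

```matlab
% --- Binary losses: labels in {-1,1}, hypothetical scores f(X_n) ---
y = [1; -1; 1; -1];
f = [0.8; -0.3; -0.2; 0.5];
w = ones(4,1);
w = w / sum(w);                      % normalize weights to sum to 1

binodeviance = sum(w .* log(1 + exp(-2 * y .* f)));
exponential  = sum(w .* exp(-y .* f));
hinge        = sum(w .* max(0, 1 - y .* f)) / sum(w);

% --- mincost: hypothetical posteriors for 3 observations, K = 2 ---
S       = [0.9 0.1; 0.4 0.6; 0.3 0.7];   % one row per observation
COST    = ones(2) - eye(2);              % 0 on the diagonal, 1 elsewhere
trueIdx = [1; 2; 1];                     % true class index per observation

% Expected cost of predicting each class, taken over the posterior.
expectedCost = S * COST;

% Predict the class with the smallest expected cost.
[~, predIdx] = min(expectedCost, [], 2);

% True misclassification cost of each prediction, averaged.
trueCost = COST(sub2ind(size(COST), trueIdx, predIdx));
mincost  = mean(trueCost);
```

In the mincost part, only the third observation is assigned to the wrong class, so the average true cost is one third.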
To write your own loss function, create a function file in this form:
function loss = lossfun(C,S,W,COST)
N is the number of rows of X.
K is the number of classes in the classifier, represented in the ClassNames property.
C is an N-by-K logical matrix, with one true per row for the true class. The index for each class is its position in the ClassNames property.
S is an N-by-K numeric matrix of posterior probabilities for classes, with one row per observation, similar to the posterior output from predict.
W is a numeric vector with N elements, the observation weights. If you pass W, the elements are normalized to sum to the prior probabilities in the respective classes.
COST is a K-by-K numeric matrix of misclassification costs. For example, you can use COST = ones(K) - eye(K), which means a cost of 0 for correct classification and 1 for misclassification.
loss should be a scalar.
Pass the function handle @lossfun as the value of the LossFun name-value pair.
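A custom loss function following this signature might look like the sketch below, saved as lossfun.m. The choice of loss (a weighted misclassification rate based on the largest posterior) is a hypothetical example, not a function shipped with the toolbox, and COST is accepted but deliberately unused.

```matlab
function loss = lossfun(C,S,W,COST) %#ok<INUSD>
% LOSSFUN  Hypothetical custom loss: weighted misclassification rate.
%   C    N-by-K logical matrix, one true per row marking the true class
%   S    N-by-K matrix of posterior probabilities
%   W    N-element vector of observation weights
%   COST K-by-K misclassification cost matrix (unused in this sketch)
%
% Usage, assuming a trained classifier obj and data X, Y:
%   L = loss(obj,X,Y,'LossFun',@lossfun);

% Predicted class: column with the largest posterior in each row.
[~, predIdx] = max(S, [], 2);

% True class: column holding the single true value in each row.
[~, trueIdx] = max(C, [], 2);

% Weighted fraction of observations whose prediction is wrong.
W = W(:);
loss = sum(W .* (predIdx ~= trueIdx)) / sum(W);
end
```

Because the function returns a scalar and uses only its four inputs, it can be checked on small hand-built matrices before being passed to loss.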
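The posterior probability that a point x belongs to class k is proportional to the product of the prior probability and the multivariate normal density. The density function of the multivariate normal with 1-by-d mean μk and d-by-d covariance Σk at a 1-by-d point x is
$$P(x|k) = \frac{1}{\left((2\pi)^d |\Sigma_k|\right)^{1/2}} \exp\left(-\frac{1}{2}(x-\mu_k)\,\Sigma_k^{-1}\,(x-\mu_k)^T\right),$$
where $|\Sigma_k|$ is the determinant of Σk, and $\Sigma_k^{-1}$ is the inverse matrix.
Let P(k) represent the prior probability of class k. Then the posterior probability that an observation x is of class k is
$$\hat{P}(k|x) = \frac{P(x|k)\,P(k)}{P(x)},$$
where P(x) is a normalization constant, the sum over k of P(x|k)P(k).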
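The posterior computation can be sketched numerically as below. The class means, covariances, priors, and query point are hypothetical, and the density is written out explicitly from the standard multivariate normal formula (with row vectors) rather than relying on a toolbox function.

```matlab
% Hypothetical two-class problem in d = 2 dimensions.
mu    = {[0 0], [2 2]};      % class means, each 1-by-d
Sigma = {eye(2), eye(2)};    % class covariances, each d-by-d
prior = [0.5 0.5];           % prior probabilities P(k)
x     = [0 0];               % point to classify, 1-by-d

% Multivariate normal density P(x|k) for a row vector x.
mvn = @(x,m,S) exp(-0.5 * ((x - m) / S) * (x - m)') ...
               / sqrt((2*pi)^numel(x) * det(S));

% Prior times density for each class.
K = numel(prior);
joint = zeros(1,K);
for k = 1:K
    joint(k) = mvn(x, mu{k}, Sigma{k}) * prior(k);
end

% Normalization constant P(x) and the resulting posteriors.
Px = sum(joint);
posterior = joint / Px;
```

Since x sits exactly on the first class mean and the priors are equal, the first class receives the larger posterior, and the two posteriors sum to 1.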
The prior probability is one of three choices:
'uniform' — The prior probability of class k is one over the total number of classes.
'empirical' — The prior probability of class k is the number of training samples of class k divided by the total number of training samples.
Custom — The prior probability of class k is the kth element of the prior vector.
After creating a classifier obj, you can set the prior using dot notation:
obj.Prior = v;
where v is a vector of positive elements representing the frequency with which each element occurs. You do not need to retrain the classifier when you set a new prior.
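The three prior choices can be illustrated on a small set of labels. This is a sketch under stated assumptions: the labels and the custom frequency vector v are hypothetical, and the priors are computed by hand rather than read from a trained classifier.

```matlab
% Hypothetical training labels for three classes.
labels = {'a'; 'a'; 'b'; 'c'; 'a'; 'b'};

% Sorted class names and the class index of each label.
[classNames, ~, idx] = unique(labels);
K = numel(classNames);

% 'uniform': one over the total number of classes.
uniformPrior = ones(1,K) / K;

% 'empirical': class counts divided by the number of training samples.
counts = accumarray(idx, 1)';
empiricalPrior = counts / sum(counts);

% Custom: a vector of positive frequencies, normalized here so that the
% result can be compared with the other two priors.
v = [2 1 1];
customPrior = v / sum(v);

% With a trained classifier obj, such a vector could be assigned with
%   obj.Prior = v;
```

The empirical prior follows the class counts (3, 2, and 1 out of 6 samples), while the uniform prior gives each class one third.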
The matrix of expected costs per observation is defined in Cost.
Compute the resubstituted classification error for the Fisher iris data:
load fisheriris
obj = fitcdiscr(meas,species);
L = loss(obj,meas,species)

L =

    0.0200