L = loss(obj,X,Y)
L = loss(obj,X,Y,Name,Value)
L = loss(obj,X,Y) returns a scalar representing how well obj classifies the data in X, when Y contains the true classifications.
When computing the loss, loss normalizes the class probabilities in Y to the class probabilities used for training, stored in the Prior property of obj.
L = loss(obj,X,Y,Name,Value) returns the loss with additional options specified by one or more Name,Value pair arguments.

Discriminant analysis classifier of class 

Matrix where each row represents an observation, and each column
represents a predictor. The number of columns in 

Class labels, with the same data type as exists in 
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
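As a minimal sketch of the Name,Value syntax, the call below passes the LossFun argument with the built-in 'classiferror' value, both of which are described later on this page. It assumes the Fisher iris data set that ships with the toolbox; the variable names are illustrative only.

```matlab
% Train a discriminant classifier on the Fisher iris data.
load fisheriris
obj = fitcdiscr(meas,species);

% Request the misclassification rate explicitly through a name-value pair.
L = loss(obj,meas,species,'LossFun','classiferror');
```

Further name-value pairs can be appended to the same call in any order.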

Function handle or string representing a loss function. Built-in loss functions:
You can write your own loss function using the syntax described in Loss Functions. Default: 

Numeric vector of length Default: 

Classification error, a scalar. The meaning of the error depends
on the values in 
The default classification error is the fraction of data X that obj misclassifies, where Y represents the true classifications.
Weighted classification error is the sum of weight i times the Boolean value that is 1 when obj misclassifies the ith row of X, divided by the sum of the weights.
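The two definitions above can be sketched with plain arrays. The labels, predictions, and weights below are made-up illustrative values, not output from a real classifier.

```matlab
% Hypothetical true labels, predicted labels, and observation weights.
yTrue = [1 2 2 1 2];
yPred = [1 2 1 1 1];
w     = [1 1 2 1 3];

% Boolean indicator that is 1 where the prediction is wrong.
miss = (yPred ~= yTrue);

% Default error: fraction of misclassified observations.
errDefault = mean(miss);

% Weighted error: weights of the misclassified rows over the total weight.
errWeighted = sum(w .* miss) / sum(w);
```

Here two of five observations are wrong, so errDefault is 0.4, while the misclassified rows carry weights 2 and 3 out of a total of 8, so errWeighted is 0.625.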
The built-in loss functions are:
'binodeviance': For binary classification, assume the classes y_{n} are -1 and 1. With weight vector w normalized to have sum 1, and predictions of row n of data X as f(X_{n}), the binomial deviance is
$$\sum_{n} {w}_{n}\log\left(1+\exp\left(-2{y}_{n}f\left({X}_{n}\right)\right)\right).$$
'exponential': With the same definitions as for 'binodeviance', the exponential loss is
$$\sum_{n} {w}_{n}\exp\left(-{y}_{n}f\left({X}_{n}\right)\right).$$
'classiferror': Predict the label with the largest posterior probability. The loss is then the fraction of misclassified observations.
'hinge': Classification error measure that has the form
$$L=\frac{\sum_{j=1}^{n}{w}_{j}\max\left\{0,1-{y}_{j}^{\prime}f\left({X}_{j}\right)\right\}}{\sum_{j=1}^{n}{w}_{j}},$$
where:
w_{j} is weight j.
For binary classification, y_{j} = 1 for the positive class and -1 for the negative class. For problems where the number of classes K ≥ 3, y_{j} is a vector of 0s, but with a 1 in the position corresponding to the true class. For example, if the second observation is in the third class and K = 4, then y_{2} = [0 0 1 0]′.
f(X_{j}) is, for binary classification, the posterior probability or, for K ≥ 3, a vector of posterior probabilities for each class given observation j.
'mincost': Predict the label with the smallest expected misclassification cost, with expectation taken over the posterior probability, and cost as given by the Cost property of the classifier (a matrix). The loss is then the true misclassification cost averaged over the observations.
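The binomial deviance, exponential, and hinge formulas above can be evaluated directly on small arrays. This is only a sketch: the labels y and the predictions f are hypothetical values chosen for illustration, with y coded as 1 and -1, and it is not how the toolbox computes these losses internally.

```matlab
% Hypothetical binary labels (coded +1 / -1) and predictions f(X_n).
y = [1; -1; 1; -1];
f = [0.8; -0.3; -0.2; 0.5];

% Equal observation weights, normalized to sum to 1.
w = ones(4,1);
w = w / sum(w);

% Binomial deviance: sum of w_n * log(1 + exp(-2 * y_n * f(X_n))).
binodev = sum(w .* log(1 + exp(-2 * y .* f)));

% Exponential loss: sum of w_n * exp(-y_n * f(X_n)).
expoloss = sum(w .* exp(-y .* f));

% Hinge loss: weighted mean of max(0, 1 - y_j * f(X_j)).
hingeloss = sum(w .* max(0, 1 - y .* f)) / sum(w);
```

All three losses shrink as y and f agree in sign with larger magnitude, which is why the last two observations (where the signs disagree) contribute the most.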
To write your own loss function, create a function file in this form:
function loss = lossfun(C,S,W,COST)
N is the number of rows of X.
K is the number of classes in the classifier, represented in the ClassNames property.
C is an N-by-K logical matrix, with one true per row for the true class. The index for each class is its position in the ClassNames property.
S is an N-by-K numeric matrix. S is a matrix of posterior probabilities for classes with one row per observation, similar to the posterior output from predict.
W is a numeric vector with N elements, the observation weights. If you pass W, the elements are normalized to sum to the prior probabilities in the respective classes.
COST is a K-by-K numeric matrix of misclassification costs. For example, you can use COST = ones(K) - eye(K), which means a cost of 0 for correct classification, and 1 for misclassification.
The output loss should be a scalar.
Pass the function handle @lossfun as the value of the LossFun name-value pair.
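A minimal sketch of a custom loss follows, written as an anonymous function with the same four inputs instead of a function file. The name quadloss and the quadratic (Brier-style) form are illustrative assumptions, not a built-in option; the handle is first checked on hand-made inputs and then passed to loss.

```matlab
% Hypothetical custom loss: weighted mean squared difference between the
% true-class indicator C and the posterior matrix S. COST is accepted to
% match the required signature but is not used here.
quadloss = @(C,S,W,COST) sum(W(:) .* sum((C - S).^2, 2)) / sum(W);

% Check the handle on small hand-made inputs (N = 2, K = 2).
C = logical([1 0; 0 1]);
S = [0.8 0.2; 0.3 0.7];
W = [0.5; 0.5];
val = quadloss(C, S, W, ones(2) - eye(2));

% Use it with a trained classifier through the LossFun name-value pair.
load fisheriris
obj = fitcdiscr(meas,species);
L = loss(obj,meas,species,'LossFun',quadloss);
```

For the hand-made inputs, the per-row squared differences are 0.08 and 0.18, so val is 0.13.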
The posterior probability that a point x belongs to class k is proportional to the product of the prior probability and the multivariate normal density. The density function of the multivariate normal with mean μ_{k} and covariance Σ_{k} at a point x is
$$P\left(x|k\right)=\frac{1}{{\left({\left(2\pi\right)}^{d}\left|{\Sigma}_{k}\right|\right)}^{1/2}}\exp\left(-\frac{1}{2}{\left(x-{\mu}_{k}\right)}^{T}{\Sigma}_{k}^{-1}\left(x-{\mu}_{k}\right)\right),$$
where d is the number of predictors, $$\left|{\Sigma}_{k}\right|$$ is the determinant of Σ_{k}, and $${\Sigma}_{k}^{-1}$$ is the inverse matrix.
Let P(k) represent the prior probability of class k. Then the posterior probability that an observation x is of class k is
$$\widehat{P}\left(k|x\right)=\frac{P\left(x|k\right)P\left(k\right)}{P\left(x\right)},$$
where P(x) is a normalization constant, the sum over k of P(x|k)P(k).
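The posterior computation above can be sketched by hand for two classes. The means, covariance, priors, and query point below are hypothetical values chosen so that the result is easy to verify; the density is written out explicitly instead of relying on a toolbox function.

```matlab
% Hypothetical two-class problem in two dimensions.
mu    = [0 0; 2 2];     % row k holds the mean of class k
Sigma = eye(2);         % covariance, shared by both classes here
prior = [0.5 0.5];      % P(k)
x     = [1 1];          % point to classify
d     = numel(x);       % number of predictors

% Multivariate normal density P(x|k) for each class.
dens = zeros(1,2);
for k = 1:2
    r = x - mu(k,:);
    dens(k) = exp(-0.5 * (r / Sigma) * r') / sqrt((2*pi)^d * det(Sigma));
end

% Posterior: density times prior, divided by the normalization constant P(x).
post = dens .* prior / sum(dens .* prior);
```

Because x lies midway between the two means and the priors are equal, both posteriors come out to 0.5.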
The prior probability is one of three choices:
'uniform': The prior probability of class k is one over the total number of classes.
'empirical': The prior probability of class k is the number of training samples of class k divided by the total number of training samples.
Custom: The prior probability of class k is the kth element of the prior vector. See fitcdiscr.
After creating a classifier obj, you can set the prior using dot notation:
obj.Prior = v;
where v is a vector of positive elements representing the frequency with which each class occurs. You do not need to retrain the classifier when you set a new prior.
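As a rough sketch of this workflow, the lines below set a new prior on a trained classifier and recompute the loss without retraining. The vector [1 1 2] is an arbitrary illustrative choice, assuming the three iris classes in their ClassNames order.

```matlab
% Train once on the Fisher iris data.
load fisheriris
obj = fitcdiscr(meas,species);
Lbefore = loss(obj,meas,species);

% Hypothetical prior: treat the third class as twice as frequent.
obj.Prior = [1 1 2];

% Recompute the loss with the new prior; no retraining is needed.
Lafter = loss(obj,meas,species);
```

Comparing Lbefore and Lafter shows how sensitive the loss is to the assumed class frequencies.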
The matrix of expected costs per observation is defined in Cost.
Compute the resubstituted classification error for the Fisher iris data:
load fisheriris
obj = fitcdiscr(meas,species);   % train on the full data set
L = loss(obj,meas,species)       % loss on the training data

L =
    0.0200