`label = predict(Mdl,X)`

`[label,score,cost] = predict(Mdl,X)`

`label = predict(Mdl,X)` returns a vector of predicted class labels for the predictor data in the table or matrix `X`, based on the trained *k*-nearest neighbor classification model `Mdl`.

`[label,score,cost] = predict(Mdl,X)` also returns:

- A matrix of classification scores (`score`) indicating the likelihood that a label comes from a particular class. For *k*-nearest neighbor, scores are posterior probabilities.
- A matrix of expected classification costs (`cost`). For each observation in `X`, the predicted class label corresponds to the minimum expected classification cost among all classes.

`predict` classifies so as to minimize the expected classification cost:

$$\widehat{y}=\underset{y=1,\mathrm{...},K}{\mathrm{arg}\mathrm{min}}{\displaystyle \sum _{k=1}^{K}\widehat{P}\left(k|x\right)C\left(y|k\right)},$$

where

- $$\widehat{y}$$ is the predicted classification.
- *K* is the number of classes.
- $$\widehat{P}\left(k|x\right)$$ is the posterior probability of class *k* for observation *x*.
- $$C\left(y|k\right)$$ is the cost of classifying an observation as *y* when its true class is *k*.
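As a numerical illustration of this decision rule (a sketch with hypothetical posterior and cost values, not the toolbox implementation):

```python
import numpy as np

# Hypothetical posterior probabilities P(k|x) for one observation, K = 3 classes
posterior = np.array([0.5, 0.3, 0.2])

# C(y|k): cost of predicting class y when the true class is k.
# Here, the default 0/1 cost: 0 on the diagonal, 1 elsewhere.
C = np.ones((3, 3)) - np.eye(3)

# Expected cost of predicting each class y: sum over k of P(k|x) * C(y|k)
expected_cost = C @ posterior

# predict chooses the class with the minimum expected cost
yhat = np.argmin(expected_cost)
```

With the default 0/1 cost, the expected cost of predicting class *y* is `1 - posterior[y]`, so minimizing expected cost reduces to picking the most probable class.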

For a vector (single query point) `Xnew` and model `mdl`, let:

- `K` be the number of nearest neighbors used in prediction, `mdl.NumNeighbors`
- `nbd(mdl,Xnew)` be the `K` nearest neighbors to `Xnew` in `mdl.X`
- `Y(nbd)` be the classifications of the points in `nbd(mdl,Xnew)`, namely `mdl.Y(nbd)`
- `W(nbd)` be the weights of the points in `nbd(mdl,Xnew)`
- `prior` be the priors of the classes in `mdl.Y`

If there is a vector of prior probabilities, then the observation weights `W` are normalized by class to sum to the priors. This normalization might involve a calculation for the point `Xnew`, because the weights can depend on the distance from `Xnew` to the points in `mdl.X`.

The posterior probability *p*(*j*|`Xnew`) is

$$p\left(j|\text{Xnew}\right)=\frac{{\displaystyle \sum _{i\in \text{nbd}}W(i){1}_{Y(X(i)=j)}}}{{\displaystyle \sum _{i\in \text{nbd}}W(i)}}.$$

Here, $${1}_{Y(X(i)=j)}$$ means `1` when `mdl.Y(i) = j`, and `0` otherwise.
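A small numerical sketch of this weighted-vote formula, using hypothetical neighbor labels and weights (not the toolbox code):

```python
import numpy as np

# Hypothetical K = 5 nearest neighbors: their class labels Y(nbd)
# and weights W(nbd) (weights could be distance-based)
Y_nbd = np.array([0, 1, 0, 0, 1])
W_nbd = np.array([0.4, 0.1, 0.2, 0.2, 0.1])

# Posterior p(j|Xnew): weighted fraction of neighbors with label j
classes = np.unique(Y_nbd)
posterior = np.array([W_nbd[Y_nbd == j].sum() for j in classes]) / W_nbd.sum()
```

The denominator normalizes by the total neighbor weight, so the posteriors over classes sum to 1.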

There are two costs associated with KNN classification: the true misclassification cost per class, and the expected misclassification cost per observation.

You can set the true misclassification cost per class in the `Cost` name-value pair when you run `fitcknn`. `Cost(i,j)` is the cost of classifying an observation into class `j` if its true class is `i`. By default, `Cost(i,j) = 1` if `i ~= j`, and `Cost(i,j) = 0` if `i = j`. In other words, the cost is `0` for a correct classification and `1` for an incorrect classification.
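To see how a non-default cost matrix can change the predicted label, consider this hypothetical sketch (Python, illustrative values only, not the `fitcknn` API):

```python
import numpy as np

# Hypothetical posteriors P(k|x) for one observation, two classes
posterior = np.array([0.6, 0.4])

# Default 0/1 cost: C[y, k] is the cost of predicting y when the true class is k
C_default = np.array([[0.0, 1.0],
                      [1.0, 0.0]])
# Asymmetric cost: misclassifying a true class-2 observation is 3x as costly
C_asym = np.array([[0.0, 3.0],
                   [1.0, 0.0]])

label_default = np.argmin(C_default @ posterior)  # expected costs [0.4, 0.6]
label_asym = np.argmin(C_asym @ posterior)        # expected costs [1.2, 0.6]
```

Under the default cost the more probable class wins, while the asymmetric cost shifts the prediction to the rarer but more expensive-to-miss class.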

The third output of `predict` is the expected misclassification cost per observation.

Suppose you have `Nobs` observations that you want to classify with a trained classifier `mdl`, and you have `K` classes. You place the observations into a matrix `Xnew` with one observation per row. The command

`[label,score,cost] = predict(mdl,Xnew)`

returns, among other outputs, a `cost` matrix of size `Nobs`-by-`K`. Each row of the `cost` matrix contains the expected (average) cost of classifying the observation into each of the `K` classes. `cost(n,k)` is

$${\displaystyle \sum _{i=1}^{K}\widehat{P}\left(i|Xnew(n)\right)C\left(k|i\right)},$$

where

- *K* is the number of classes.
- $$\widehat{P}\left(i|Xnew(n)\right)$$ is the posterior probability of class *i* for observation *Xnew*(*n*).
- $$C\left(k|i\right)$$ is the true misclassification cost of classifying an observation as *k* when its true class is *i*.
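In matrix terms, the expected-cost matrix is the posterior (score) matrix times the cost matrix. A hypothetical sketch with made-up values (not the toolbox implementation):

```python
import numpy as np

# Hypothetical posteriors for Nobs = 2 observations, K = 3 classes
score = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])

# Cost(i, j): cost of classifying into class j when the true class is i.
# Here, the default 0/1 cost.
Cost = np.ones((3, 3)) - np.eye(3)

# cost(n, k) = sum over i of score(n, i) * Cost(i, k)
cost = score @ Cost

# The predicted label minimizes the expected cost in each row
label = cost.argmin(axis=1)
```

With the default 0/1 cost and posteriors that sum to 1 per row, each entry is `1 - score(n, k)`, so the minimum-cost class is again the maximum-posterior class.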
