MATLAB Examples

Examine Quality of KNN Classifier

This example shows how to examine the quality of a k-nearest neighbor classifier using resubstitution and cross validation.

Construct a KNN classifier for the Fisher iris data as in docid:stats_ug.btap7k2.

load fisheriris
X = meas;
Y = species;
rng(10); % For reproducibility
Mdl = fitcknn(X,Y,'NumNeighbors',4);

Examine the resubstitution loss, which, by default, is the fraction of misclassifications from the predictions of Mdl. (For nondefault cost, weights, or priors, see loss.).

rloss = resubLoss(Mdl)
rloss =

    0.0400

The classifier predicts incorrectly for 4% of the training data.

Construct a cross-validated classifier from the model.

CVMdl = crossval(Mdl);

Examine the cross-validation loss, which is the average loss of each cross-validation model when predicting on data that is not used for training.

kloss = kfoldLoss(CVMdl)
kloss =

    0.0333

The cross-validated classification accuracy resembles the resubstitution accuracy. Therefore, you can expect Mdl to misclassify approximately 4% of new data, assuming that the new data has about the same distribution as the training data.