Examine Quality of KNN Classifier
This example shows how to examine the quality of a k-nearest neighbor classifier using resubstitution and cross-validation.
Construct a KNN classifier for the Fisher iris data.
load fisheriris
X = meas;
Y = species;
rng(10); % For reproducibility
Mdl = fitcknn(X,Y,'NumNeighbors',4);
Examine the resubstitution loss, which, by default, is the fraction of misclassifications from the predictions of Mdl. (For nondefault cost, weights, or priors, see loss.)
rloss = resubLoss(Mdl)
rloss = 0.0400
The classifier predicts incorrectly for 4% of the training data.
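To make the meaning of the resubstitution loss concrete, the following minimal sketch recomputes it by hand. It assumes the default cost, weights, and priors, in which case the loss is the fraction of misclassified training observations. The variable names predicted and manualLoss are illustrative and not part of the original example.

```matlab
load fisheriris
X = meas;
Y = species;
rng(10); % For reproducibility
Mdl = fitcknn(X,Y,'NumNeighbors',4);

% Predict labels for the same observations that were used for training
predicted = resubPredict(Mdl);

% Fraction of training observations whose predicted label differs
% from the true label (assumes default cost, weights, and priors)
manualLoss = mean(~strcmp(predicted,Y))
```

Under these assumptions, manualLoss should agree with the value returned by resubLoss(Mdl).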
Construct a cross-validated classifier from the model.
CVMdl = crossval(Mdl);
Examine the cross-validation loss, which is the average loss of the cross-validation models, each evaluated on the data that was not used to train it.
kloss = kfoldLoss(CVMdl)
kloss = 0.0333
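The averaging behind the cross-validation loss can be inspected with a short sketch such as the following. It assumes the default 10-fold partition created by crossval and the default misclassification loss. The variable names foldLosses, oofLabels, and manualKloss are illustrative and not part of the original example.

```matlab
load fisheriris
rng(10); % For reproducibility
Mdl = fitcknn(meas,species,'NumNeighbors',4);
CVMdl = crossval(Mdl); % 10-fold cross-validation by default

% Loss of each fold's model on the observations held out from its training
foldLosses = kfoldLoss(CVMdl,'Mode','individual')

% Out-of-fold prediction for every observation, then the overall
% fraction of misclassified observations
oofLabels = kfoldPredict(CVMdl);
manualKloss = mean(~strcmp(oofLabels,species))
```

The individual fold losses show how much the estimate varies from fold to fold. With folds of similar size, manualKloss should be close to the value returned by kfoldLoss(CVMdl).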
The cross-validation loss (about 3.3%) is close to the resubstitution loss (4%). Therefore, you can expect Mdl to misclassify approximately 4% of new data, assuming that the new data has about the same distribution as the training data.