Modify KNN Classifier
This example shows how to modify a k-nearest neighbor classifier.
Construct a KNN classifier for the Fisher iris data as in docid:stats_ug.btap7k2.
load fisheriris
X = meas;
Y = species;
Mdl = fitcknn(X,Y,'NumNeighbors',4);
Modify the model to use the three nearest neighbors, rather than the four nearest neighbors specified at construction.
Mdl.NumNeighbors = 3;
Compare the resubstitution loss and the cross-validation loss with the new number of neighbors.
loss = resubLoss(Mdl)
rng(10); % For reproducibility
CVMdl = crossval(Mdl,'KFold',5);
kloss = kfoldLoss(CVMdl)
loss = 0.0400

kloss = 0.0333
In this case, the model with three neighbors has the same cross-validated loss as the model with four neighbors (see docid:stats_ug.btap7l_).
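To see how the choice of NumNeighbors affects both losses more generally, you can sweep over several values. This is a quick sketch, not part of the original example; the range of k is illustrative, and it reuses the Mdl object created above.

% Sweep over several neighbor counts and compare losses (range is illustrative)
rng(10); % For reproducibility
for k = 1:6
    Mdl.NumNeighbors = k;
    CVMdl = crossval(Mdl,'KFold',5);
    fprintf('k = %d: resub loss = %.4f, CV loss = %.4f\n', ...
        k, resubLoss(Mdl), kfoldLoss(CVMdl))
end

Because this loop modifies Mdl in place, reset Mdl.NumNeighbors to the value you want before continuing.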
Modify the model to use cosine distance instead of the default Euclidean distance, and examine the loss. To use cosine distance, you must recreate the model with the exhaustive search method.
CMdl = fitcknn(X,Y,'NSMethod','exhaustive','Distance','cosine');
CMdl.NumNeighbors = 3;
closs = resubLoss(CMdl)
closs = 0.0200
The classifier now has lower resubstitution error than before.
Check the quality of a cross-validated version of the new model.
CVCMdl = crossval(CMdl);
kcloss = kfoldLoss(CVCMdl)
kcloss = 0.0200
CVCMdl has a lower cross-validated loss than CVMdl. However, in general, improving the resubstitution error does not necessarily produce a model with better test-sample predictions.
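One way to check generalization directly is to hold out a test set before training. The following sketch is not part of the original example; the 30% holdout fraction is an arbitrary choice, and it avoids calling the loss method because the variable loss defined earlier shadows it.

% Hold out 30% of the data as a test set (holdout fraction is illustrative)
rng(10); % For reproducibility
cvp = cvpartition(Y,'Holdout',0.3);
TMdl = fitcknn(X(training(cvp),:),Y(training(cvp)), ...
    'NumNeighbors',3,'NSMethod','exhaustive','Distance','cosine');
pred = predict(TMdl,X(test(cvp),:));
testErr = mean(~strcmp(pred,Y(test(cvp))))

The misclassification rate on the held-out set estimates test-sample performance without relying on resubstitution error.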