MATLAB Examples

Modify KNN Classifier

This example shows how to modify a k-nearest neighbor classifier.

Construct a KNN classifier for the Fisher iris data as in docid:stats_ug.btap7k2.

load fisheriris
X = meas;
Y = species;
Mdl = fitcknn(X,Y,'NumNeighbors',4);

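Modify the model to use the three nearest neighbors, rather than the four nearest neighbors specified when the model was constructed. (If you do not set 'NumNeighbors', fitcknn uses one nearest neighbor by default.)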

Mdl.NumNeighbors = 3;

Compute the resubstitution loss and the 5-fold cross-validation loss with the new number of neighbors.

loss = resubLoss(Mdl)

rng(10); % For reproducibility
CVMdl = crossval(Mdl,'KFold',5);
kloss = kfoldLoss(CVMdl)

loss =

    0.0400


kloss =

    0.0333

In this case, the model with three neighbors has the same cross-validated loss as the model with four neighbors (see docid:stats_ug.btap7l_).
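The comparison between neighbor counts can be extended to a small sweep. The following is a minimal sketch, not part of the original example: the candidate values in kValues are arbitrary, and the variable names cvp, kValues, and kLosses are introduced here for illustration. It reuses one stratified partition so that every value of NumNeighbors is scored on the same folds.

```matlab
load fisheriris
X = meas;
Y = species;

rng(10); % For reproducibility
cvp = cvpartition(Y,'KFold',5);    % one stratified partition, shared by all runs

kValues = 1:2:9;                   % illustrative candidate neighbor counts (assumption)
kLosses = zeros(size(kValues));

Mdl = fitcknn(X,Y);
for i = 1:numel(kValues)
    Mdl.NumNeighbors = kValues(i);             % modify the existing model
    CVMdl = crossval(Mdl,'CVPartition',cvp);   % cross-validate on the shared folds
    kLosses(i) = kfoldLoss(CVMdl);             % misclassification rate across folds
end

% Summarize the results side by side
results = table(kValues(:),kLosses(:), ...
    'VariableNames',{'NumNeighbors','KFoldLoss'})
```

Because the folds are fixed, differences between rows reflect the change in NumNeighbors rather than a different random split. With only 150 observations, small differences in loss correspond to one or two observations and should be read cautiously.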

Modify the model to use cosine distance instead of the default Euclidean distance, and examine the loss. The Kd-tree search method does not support cosine distance, so you must recreate the model using the exhaustive search method.

CMdl = fitcknn(X,Y,'NSMethod','exhaustive','Distance','cosine');
CMdl.NumNeighbors = 3;
closs = resubLoss(CMdl)

closs =

    0.0200

The classifier now has a lower resubstitution error than before (0.0200 compared with 0.0400).
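The same recreate-and-evaluate pattern applies to other distance metrics. The sketch below is an illustration under assumptions, not part of the original example: the list of metrics is an arbitrary selection of names accepted by the exhaustive searcher, and the variable names distances, resubL, and kfoldL are introduced here.

```matlab
load fisheriris
X = meas;
Y = species;

rng(10); % For reproducibility
cvp = cvpartition(Y,'KFold',5);    % shared folds for a like-for-like comparison

% Illustrative selection of metrics (assumption); all work with exhaustive search
distances = {'euclidean','cityblock','cosine','correlation'};
resubL = zeros(numel(distances),1);
kfoldL = zeros(numel(distances),1);

for i = 1:numel(distances)
    % Recreate the model for each metric, keeping three neighbors
    M = fitcknn(X,Y,'NumNeighbors',3, ...
        'NSMethod','exhaustive','Distance',distances{i});
    resubL(i) = resubLoss(M);                             % training-set error
    kfoldL(i) = kfoldLoss(crossval(M,'CVPartition',cvp)); % 5-fold error
end

results = table(distances(:),resubL,kfoldL, ...
    'VariableNames',{'Distance','ResubLoss','KFoldLoss'})
```

Reporting both columns makes it easier to see whether a metric that lowers the resubstitution error also lowers the cross-validated error.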

Check the quality of a cross-validated version of the new model.

CVCMdl = crossval(CMdl);
kcloss = kfoldLoss(CVCMdl)

kcloss =

    0.0200

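CVCMdl has a lower cross-validated loss than CVMdl (0.0200 compared with 0.0333). Note that the two estimates are not strictly comparable: CVCMdl uses the crossval default of 10 folds, whereas CVMdl uses 5 folds. In general, improving the resubstitution error does not necessarily produce a model with better test-sample predictions.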
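The gap between resubstitution error and test-sample error can be seen directly with a held-out set. The following is a minimal sketch under assumptions that are not in the original example: the 30% holdout fraction, the choice of one neighbor, and the variable names hp, M1, resub1, and test1 are all illustrative.

```matlab
load fisheriris
X = meas;
Y = species;

rng(10); % For reproducibility
hp = cvpartition(Y,'HoldOut',0.3);   % hold out 30% of the data (assumed fraction)

Xtrain = X(training(hp),:);
Ytrain = Y(training(hp));
Xtest  = X(test(hp),:);
Ytest  = Y(test(hp));

% With one neighbor, each training point is usually its own nearest neighbor,
% so the resubstitution loss tends to be optimistic.
M1 = fitcknn(Xtrain,Ytrain,'NumNeighbors',1);

resub1 = resubLoss(M1)           % error on the data used for training
test1  = loss(M1,Xtest,Ytest)    % error on data the model has not seen
```

The resubstitution value measures how well the model reproduces labels it has already stored, while the held-out value estimates performance on new observations. Cross-validation serves the same purpose as the held-out set while using all of the data.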