Margin of k-nearest neighbor classifier by resubstitution
m is returned as a numeric vector of length size(mdl.X,1), where mdl.X is the training data for mdl. Each entry in m is the margin for the corresponding row of mdl.X and the corresponding true class label in mdl.Y.
Create a k-nearest neighbor classifier for the Fisher iris data, where k = 5.
Load the Fisher iris data set.
load fisheriris
X = meas;
Y = species;
Create a classifier for five nearest neighbors.
mdl = fitcknn(X,Y,'NumNeighbors',5);
Examine some statistics of the resubstitution margin of the classifier.
m = resubMargin(mdl);
[max(m) min(m) mean(m)]
ans = 1×3
    1.0000   -0.6000    0.9253
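The mean margin is over 0.9, indicating fairly high classification accuracy for resubstitution. Resubstitution evaluates the model on the same data used to train it, so this figure tends to be optimistic. For a more reliable assessment of model accuracy, consider cross-validation.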
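One possible way to obtain a cross-validated estimate is sketched here. It assumes the default 10-fold partition of crossval and an arbitrary random seed chosen only for repeatability; the variable names cvmdl, cvLoss, and cvMargin are illustrative, not part of the example above.

```matlab
load fisheriris
rng(1)                               % arbitrary seed so the random partition is repeatable
mdl = fitcknn(meas,species,'NumNeighbors',5);

cvmdl = crossval(mdl);               % cross-validated model, 10 folds by default
cvLoss = kfoldLoss(cvmdl);           % out-of-fold misclassification rate
cvMargin = kfoldMargin(cvmdl);       % out-of-fold margin for each observation

[cvLoss mean(cvMargin)]
```

Because each observation is scored by a model that did not see it during training, the mean of cvMargin is usually somewhat lower than the resubstitution mean.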
The classification margin for each observation is the difference between the classification score for the true class and the maximal classification score for the false classes.
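As a rough check of this definition, the margin can be recomputed by hand from the scores that resubPredict returns and compared with the output of resubMargin. This is a sketch under the assumption that the score columns follow the order of mdl.ClassNames; the names trueIdx, linIdx, and mManual are illustrative only.

```matlab
load fisheriris
X = meas;
Y = species;
mdl = fitcknn(X,Y,'NumNeighbors',5);

% Scores for the training data: one row per observation, one column per class.
[~,score] = resubPredict(mdl);

% Column index of the true class for each observation.
[~,trueIdx] = ismember(Y,mdl.ClassNames);
n = size(X,1);
linIdx = sub2ind(size(score),(1:n)',trueIdx);

% Score of the true class, and the largest score among the other classes.
trueScore = score(linIdx);
falseScore = score;
falseScore(linIdx) = -Inf;           % exclude the true class from the maximum
mManual = trueScore - max(falseScore,[],2);

% Largest discrepancy from resubMargin; expected to be zero or negligible.
maxDiff = max(abs(mManual - resubMargin(mdl)))
```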
The score of a classification is the posterior probability of the classification. The posterior probability is the number of neighbors with that classification divided by the number of neighbors. For a more detailed definition that includes weights and prior probabilities, see Posterior Probability.
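A small worked example may help connect these definitions to the minimum margin of about -0.6 reported above. The neighbor counts below are hypothetical, and the sketch assumes equal weights and priors, so the score is simply the fraction of the k = 5 neighbors in each class.

```matlab
k = 5;
counts = [1 4 0];                    % hypothetical neighbor counts per class
trueClass = 1;                       % assume class 1 is the true class

score = counts / k;                  % posterior probabilities: 0.2, 0.8, 0
falseScores = score;
falseScores(trueClass) = [];         % keep only the false classes
margin = score(trueClass) - max(falseScores)
```

Here only one of the five neighbors shares the true class, so the margin is 0.2 - 0.8, or about -0.6. An observation whose five neighbors all belong to the true class has a margin of 1, the maximum.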