Margin of k-nearest neighbor classifier by resubstitution
m = resubMargin(mdl)
mdl is a k-nearest neighbor classifier created by ClassificationKNN.fit.
m is a numeric column vector of length size(mdl.X,1), where mdl.X is the training data for mdl. Each entry in m is the margin for the corresponding row of mdl.X and its true class label in mdl.Y.
The classification margin is the difference between the classification score for the true class and the maximal classification score among the false classes.
The margin is a column vector with one entry for each row of the training data.
The score of a classification is the posterior probability of the classification. The posterior probability is the number of neighbors that have that classification, divided by the number of neighbors. For a more detailed definition that includes weights and prior probabilities, see Posterior Probability.
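The two definitions above can be checked against each other by recomputing the margin from the scores. The following is a minimal sketch, not the library's internal implementation. It assumes the default weights and priors, reuses this page's ClassificationKNN.fit call (newer releases document fitcknn as its replacement), and assumes that the columns of the score matrix returned by resubPredict follow the order of mdl.ClassNames. The names trueIdx, falseScore, mManual, and maxDiff are illustrative.

```matlab
% Sketch: recompute the resubstitution margin from the class scores.
load fisheriris
X = meas;
Y = species;
mdl = ClassificationKNN.fit(X,Y,'NumNeighbors',5);

% score(i,j) is the score of class mdl.ClassNames{j} for training row i.
[~,score] = resubPredict(mdl);

% Column index of the true class for each training row.
[~,trueIdx] = ismember(mdl.Y,mdl.ClassNames);

n = size(score,1);
lin = sub2ind(size(score),(1:n)',trueIdx);
trueScore = score(lin);

% Mask the true class, then take the best score among the false classes.
falseScore = score;
falseScore(lin) = -Inf;
mManual = trueScore - max(falseScore,[],2);

% Largest absolute difference from the built-in result.
maxDiff = max(abs(mManual - resubMargin(mdl)))
```

With k = 5 and unweighted neighbors, each score is a multiple of 0.2, so the margins are also multiples of 0.2. A margin of -0.6 arises, for example, when one neighbor has the true class and four neighbors share a single false class (0.2 - 0.8).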
Construct a k-nearest neighbor classifier for the Fisher iris data, where k = 5.
Load the data.
load fisheriris
X = meas;
Y = species;
Construct a classifier for 5-nearest neighbors.
mdl = ClassificationKNN.fit(X,Y,'NumNeighbors',5);
Examine some statistics of the resubstitution margin of the classifier.
m = resubMargin(mdl);
[max(m) min(m) mean(m)]

ans =

    1.0000   -0.6000    0.9253
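The mean margin is over 0.9, indicating fairly high classification accuracy for resubstitution. Because resubstitution evaluates the classifier on the same data used to train it, this estimate tends to be optimistic. For a more reliable assessment of model accuracy, consider cross-validation, for example with kfoldLoss.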
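A cross-validated check along these lines might look as follows. This is a sketch under stated assumptions: it relies on the default behavior of crossval (10 folds), and the seed value and the variable names cvmdl, cvLoss, and cvMargin are arbitrary choices for illustration. No output is shown because the result depends on the random fold assignment.

```matlab
% Sketch: cross-validated loss and margin for the 5-nearest neighbor classifier.
load fisheriris
mdl = ClassificationKNN.fit(meas,species,'NumNeighbors',5);

rng(1)                          % arbitrary seed so the fold assignment is repeatable
cvmdl = crossval(mdl);          % partitioned model, 10 folds by default
cvLoss = kfoldLoss(cvmdl)       % cross-validated misclassification rate
cvMargin = kfoldMargin(cvmdl);  % one out-of-fold margin per observation
mean(cvMargin)
```

Each cross-validated margin is computed by a model that did not see the corresponding observation during training, so the mean of cvMargin is a less optimistic counterpart to the resubstitution mean.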