margin

Class: ClassificationKNN

Margin of k-nearest neighbor classifier

Syntax

  • m = margin(mdl,tbl,ResponseVarName)
  • m = margin(mdl,tbl,Y)
  • m = margin(mdl,X,Y)

Description

m = margin(mdl,tbl,ResponseVarName) returns the classification margins for the predictor data in the table tbl and the class labels in the variable of tbl named ResponseVarName. For the definition, see Margin.

m = margin(mdl,tbl,Y) returns the classification margins for the predictor data in the table tbl and the class labels Y.

m = margin(mdl,X,Y) returns the classification margins for the matrix of predictors X and class labels Y.

Input Arguments

mdl

k-nearest neighbor classifier model, specified as a classifier model object.

Note that using the 'CrossVal', 'KFold', 'Holdout', 'Leaveout', or 'CVPartition' options results in a model of class ClassificationPartitionedModel. You cannot use a partitioned model for prediction, so this kind of model does not have a predict method.

Otherwise, mdl is of class ClassificationKNN, and you can use the predict method to make predictions.

tbl

Sample data used to train the model, specified as a table. Each row of tbl corresponds to one observation, and each column corresponds to one predictor variable. Optionally, tbl can contain one additional column for the response variable. Multi-column variables and cell arrays other than cell arrays of character vectors are not allowed.

If tbl contains the response variable used to train mdl, then you do not need to specify ResponseVarName or Y.

If you trained mdl using sample data contained in a table, then the input data for this method must also be in a table.

Data Types: table

ResponseVarName

Response variable name, specified as the name of a variable in tbl. If tbl contains the response variable used to train mdl, then you do not need to specify ResponseVarName.

If you specify ResponseVarName, then you must do so as a character vector. For example, if the response variable is stored as tbl.response, then specify it as 'response'. Otherwise, the software treats all columns of tbl, including tbl.response, as predictors.

The response variable must be a categorical or character array, logical or numeric vector, or cell array of character vectors. If the response variable is a character array, then each element must correspond to one row of the array.
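The table-based syntax can be sketched as follows. This is a minimal illustration, not a canonical recipe: the predictor names 'SL', 'SW', 'PL', and 'PW' and the response column name 'species' are arbitrary choices made here, and the model is trained on the same table so that the predictor names match.

```matlab
load fisheriris

% Build a table with four predictor columns and one response column.
% The variable names below are illustrative, not part of the data set.
tbl = array2table(meas,'VariableNames',{'SL','SW','PL','PW'});
tbl.species = species;

% Train on the table; 'species' names the response variable.
mdl = fitcknn(tbl,'species','NumNeighbors',5);

% Compute margins for the observations in tbl, naming the response column.
m = margin(mdl,tbl,'species');
```

Because mdl was trained on a table, the data passed to margin is also a table, and m has one entry per row of tbl.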

X

Predictor values, specified as a numeric matrix. Each column of X represents one variable, and each row represents one observation.

Y

Class labels, specified as a categorical array, cell array of character vectors, character array, logical vector, or numeric vector with the same number of rows as X. Each row of Y represents the classification of the corresponding row of X.

Output Arguments

m

Classification margins, returned as a numeric column vector of length size(X,1) (or the number of rows in tbl). Each entry in m represents the margin for the corresponding row of X and its true class in Y, computed using mdl.

Definitions

Margin

The classification margin is the difference between the classification score for the true class and the maximal classification score for the false classes.
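The definition can be checked by hand against the scores that predict returns. The following is a minimal sketch, assuming that the columns of the score matrix follow the order of mdl.ClassNames; the variable names (trueIdx, mManual, and so on) are illustrative.

```matlab
load fisheriris
mdl = fitcknn(meas,species,'NumNeighbors',5);

% Scores (posterior probabilities), one row per observation,
% one column per class.
[~,score] = predict(mdl,meas);

% Column index of the true class for each observation.
[~,trueIdx] = ismember(species,mdl.ClassNames);
n = size(score,1);
linIdx = sub2ind(size(score),(1:n)',trueIdx);

% Score for the true class.
trueScore = score(linIdx);

% Largest score among the false classes: mask out the true class first.
falseScore = score;
falseScore(linIdx) = -Inf;
mManual = trueScore - max(falseScore,[],2);

% Compare with the margins computed by the margin method.
m = margin(mdl,meas,species);
maxDiff = max(abs(m - mManual));
```

Under these assumptions, maxDiff should be zero or on the order of rounding error.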

Score

The score of a classification is the posterior probability of the classification. The posterior probability is the number of neighbors that have that classification, divided by the number of neighbors. For a more detailed definition that includes weights and prior probabilities, see Posterior Probability.
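The simple form of this definition can be sketched by counting neighbors directly. The code below is an illustration under stated assumptions: it uses knnsearch with its default Euclidean distance to find the neighbors, and it assumes the default fitcknn settings (equal distance weights, no standardization) on a data set with balanced classes, so that weights and priors do not change the result. The names xnew, nbrLabels, and frac are illustrative.

```matlab
load fisheriris
k = 5;
mdl = fitcknn(meas,species,'NumNeighbors',k);

% Query point: the mean observation.
xnew = mean(meas);

% Indices of the k nearest training observations (Euclidean distance).
idx = knnsearch(meas,xnew,'K',k);
nbrLabels = species(idx);

% Fraction of the k neighbors in each class, in the order of mdl.ClassNames.
numClasses = numel(mdl.ClassNames);
frac = zeros(1,numClasses);
for j = 1:numClasses
    frac(j) = sum(strcmp(nbrLabels,mdl.ClassNames{j}))/k;
end

% Scores reported by the model for the same point, for comparison.
[~,score] = predict(mdl,xnew);
disp([frac; score])
```

The two displayed rows should agree when the assumptions above hold. With non-default weights or priors, the more detailed definition applies instead.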

Examples

Construct a k-nearest neighbor classifier for the Fisher iris data, where k = 5.

Load the data.

load fisheriris

Construct a classifier for 5-nearest neighbors.

mdl = fitcknn(meas,species,'NumNeighbors',5);

Examine the margin of the classifier for a mean observation classified 'versicolor'.

X = mean(meas);
Y = {'versicolor'};
m = margin(mdl,X,Y)
m =

     1

The classifier has no doubt that 'versicolor' is the correct classification: all five nearest neighbors belong to the class 'versicolor', so the score for 'versicolor' is 1 and the scores for the other classes are 0.
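For contrast, the margin for the same observation can be computed with a different label supplied as the true class. This is a small sketch that reuses the example's data; the names mTrue and mWrong are illustrative.

```matlab
load fisheriris
mdl = fitcknn(meas,species,'NumNeighbors',5);
X = mean(meas);

% Margin when the supplied true class matches the neighbors.
mTrue = margin(mdl,X,{'versicolor'});

% Margin when a different class is supplied as the true class.
mWrong = margin(mdl,X,{'setosa'});
```

If all five neighbors are 'versicolor', the score for 'setosa' is 0 and the largest score among the other classes is 1, so mWrong works out to 0 - 1 = -1. A negative margin indicates that the classifier scores some other class higher than the class supplied in Y.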
