k-nearest neighbor classification
A k-nearest-neighbor classification object in which both the distance metric and the number of neighbors can be altered. The object classifies new observations using the predict method. Because the object contains the data used for training, it can also compute resubstitution predictions.
mdl = fitcknn(x,y) creates a k-nearest neighbor classification model. For details, see fitcknn.
mdl = fitcknn(x,y,Name,Value) creates a classifier with additional options specified by one or more Name,Value pair arguments. For details, see fitcknn.
BreakTies

String specifying the method predict uses to break ties when multiple classes have the same smallest expected cost. By default (with equal misclassification costs), ties occur when multiple classes have the same number of nearest points among the K nearest neighbors.

'BreakTies' applies when 'IncludeTies' is false. Change BreakTies using dot notation: mdl.BreakTies = newBreakTies.
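The tie-breaking rules fitcknn documents include one that favors the class of the single nearest neighbor among the tied classes. Since the logic is language-independent, here is an illustrative sketch in Python (not part of the MATLAB toolbox; the function name is made up for illustration):

```python
from collections import Counter

def break_ties_nearest(neighbor_labels):
    """Pick the majority class among neighbors ordered nearest-first;
    on a vote tie, choose the tied class that owns the single nearest
    neighbor (a 'nearest'-style tie-breaking rule)."""
    votes = Counter(neighbor_labels)
    top = max(votes.values())
    tied = {c for c, v in votes.items() if v == top}
    if len(tied) == 1:
        return tied.pop()
    # neighbor_labels is sorted by distance, so the first tied class wins
    for label in neighbor_labels:
        if label in tied:
            return label

# K = 4 neighbors, ordered nearest-first: classes 'a' and 'b' tie 2-2,
# and 'b' owns the nearest neighbor
print(break_ties_nearest(['b', 'a', 'a', 'b']))  # -> 'b'
```

With no vote tie (e.g. labels `['a', 'a', 'b']`), the function simply returns the majority class.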
CategoricalPredictors

Specification of which predictors are categorical.
ClassNames

List of the elements in the training data Y with duplicates removed. ClassNames can be a numeric vector, vector of categorical variables, logical vector, character array, or cell array of strings. ClassNames has the same data type as the data in the argument Y. Change ClassNames using dot notation: mdl.ClassNames = newClassNames.
Cost

Square matrix, where Cost(i,j) is the cost of classifying a point into class j if its true class is i. Cost is K-by-K, where K is the number of classes. Change a Cost matrix using dot notation: mdl.Cost = costMatrix.
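The role of Cost in prediction (see Expected Cost below) is to pick the class minimizing the posterior-weighted cost. A minimal language-independent sketch in Python, with a hypothetical helper name:

```python
def min_expected_cost_class(posterior, cost):
    """posterior[i]: estimated P(true class = i | x); cost[i][j]: cost of
    predicting class j when the truth is class i. Returns the index j that
    minimizes the expected cost sum_i posterior[i] * cost[i][j]."""
    k = len(cost[0])
    expected = [sum(posterior[i] * cost[i][j] for i in range(len(posterior)))
                for j in range(k)]
    return min(range(k), key=expected.__getitem__)

# With the default 0-1 cost, prediction is just the most probable class
zero_one = [[0, 1], [1, 0]]
print(min_expected_cost_class([0.3, 0.7], zero_one))  # -> 1

# An asymmetric cost matrix can flip the decision: misclassifying true
# class 1 as class 0 costs 5, so class 1 is predicted despite lower posterior
asym = [[0, 1], [5, 0]]
print(min_expected_cost_class([0.7, 0.3], asym))  # -> 1
```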
Distance

String or function handle specifying the distance metric. The allowable strings depend on the NSMethod parameter, which you set in fitcknn and which exists as a field in ModelParameters.

For definitions and the distance metrics available with ExhaustiveSearcher, see Distance Metrics.

Change Distance using dot notation: mdl.Distance = newDistance. If NSMethod is 'kdtree', you can use dot notation to change Distance only among the types 'cityblock', 'chebychev', 'euclidean', and 'minkowski'.
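The kd-tree metrics above are all members of one family: cityblock and euclidean are the Minkowski distance with exponents 1 and 2, and chebychev is its limit as the exponent grows. An illustrative sketch in Python:

```python
def minkowski(u, v, p):
    """Minkowski distance with exponent p. p=1 gives the cityblock
    distance, p=2 the euclidean distance, and the limit p -> inf
    the chebychev distance."""
    if p == float('inf'):
        return max(abs(a - b) for a, b in zip(u, v))
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1 / p)

u, v = (0.0, 0.0), (3.0, 4.0)
print(minkowski(u, v, 1))             # -> 7.0 (cityblock)
print(minkowski(u, v, 2))             # -> 5.0 (euclidean)
print(minkowski(u, v, float('inf')))  # -> 4.0 (chebychev)
```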
DistanceWeight

String or function handle specifying the distance weighting function.

Change DistanceWeight using dot notation: mdl.DistanceWeight = newDistanceWeight.
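fitcknn documents string values for DistanceWeight such as 'equal', 'inverse' (weight 1/d), and 'squaredinverse' (weight 1/d^2), in addition to a function handle. A hedged sketch of the equivalent weighting in Python (assuming nonzero distances; the toolbox handles exact matches separately):

```python
def weights(dists, scheme):
    """Distance weights: 'equal' gives each neighbor weight 1, 'inverse'
    gives 1/d, 'squaredinverse' gives 1/d^2, and anything else is treated
    as a callable (the analogue of a MATLAB function handle).
    Assumes all distances are strictly positive."""
    if scheme == 'equal':
        return [1.0 for _ in dists]
    if scheme == 'inverse':
        return [1.0 / d for d in dists]
    if scheme == 'squaredinverse':
        return [1.0 / d ** 2 for d in dists]
    return [scheme(d) for d in dists]

print(weights([1.0, 2.0], 'inverse'))         # -> [1.0, 0.5]
print(weights([1.0, 2.0], 'squaredinverse'))  # -> [1.0, 0.25]
```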
DistParameter

Additional parameter for the distance metric.

For distance metrics that do not take an additional parameter, DistParameter must be []. Change DistParameter using dot notation: mdl.DistParameter = newDistParameter.
IncludeTies

Logical value indicating whether predict includes all the neighbors whose distance values are equal to the Kth smallest distance. If IncludeTies is true, predict includes all these neighbors. Otherwise, predict uses exactly K neighbors (see 'BreakTies'). Change IncludeTies using dot notation: mdl.IncludeTies = newIncludeTies.
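The effect of IncludeTies can be sketched in a few lines of Python (illustrative only; the function name is invented for this example):

```python
def k_nearest_with_ties(dists, k, include_ties):
    """Return neighbor indices sorted nearest-first: exactly the k nearest
    when include_ties is False, otherwise every point whose distance equals
    the kth smallest distance as well."""
    order = sorted(range(len(dists)), key=dists.__getitem__)
    if not include_ties:
        return order[:k]
    kth = dists[order[k - 1]]           # the kth smallest distance
    return [i for i in order if dists[i] <= kth]

d = [0.5, 1.0, 1.0, 2.0]
print(k_nearest_with_ties(d, 2, False))  # -> [0, 1]
print(k_nearest_with_ties(d, 2, True))   # -> [0, 1, 2] (index 2 ties the 2nd-smallest distance)
```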
ModelParameters

Parameters used in training mdl.
NumObservations

Number of observations used in training mdl. This can be less than the number of rows in the training data, because data rows containing NaN values are not part of the fit.
NumNeighbors

Positive integer specifying the number of nearest neighbors in X to find for classifying each point when predicting. Change NumNeighbors using dot notation: mdl.NumNeighbors = newNumNeighbors.
PredictorNames

Cell array of names for the predictor variables, in the order in which they appear in the training data X. Change PredictorNames using dot notation: mdl.PredictorNames = newPredictorNames.
Prior

Prior probabilities for each class. Prior is a numeric vector whose entries correspond to the elements of the ClassNames property. Add or change a Prior vector using dot notation: mdl.Prior = priorVector.
ResponseName

String describing the response variable Y. Change ResponseName using dot notation: mdl.ResponseName = newResponseName.
W

Numeric vector of nonnegative weights with the same number of rows as Y. Each entry in W specifies the relative importance of the corresponding observation in Y. Change W using dot notation: mdl.W = newW.
X

Numeric matrix of predictor values. Each column of X represents one predictor (variable), and each row represents one observation.
Y

A numeric vector, vector of categorical variables, logical vector, character array, or cell array of strings, with the same number of rows as X. Y is of the same type as the passed-in Y data.
crossval | Cross-validated k-nearest neighbor classifier |
edge | Edge of k-nearest neighbor classifier |
loss | Loss of k-nearest neighbor classifier |
margin | Margin of k-nearest neighbor classifier |
predict | Predict k-nearest neighbor classification |
resubEdge | Edge of k-nearest neighbor classifier by resubstitution |
resubLoss | Loss of k-nearest neighbor classifier by resubstitution |
resubMargin | Margin of k-nearest neighbor classifier by resubstitution |
resubPredict | Predict resubstitution response of k-nearest neighbor classifier |
ClassificationKNN predicts the classification of a point Xnew using a procedure equivalent to this:

1. Find the NumNeighbors points in the training set X that are nearest to Xnew.
2. Find the NumNeighbors response values Y of those nearest points.
3. Assign Xnew the classification label Ynew that has the smallest expected misclassification cost among those response values.
For details, see Posterior Probability and Expected Cost in the predict documentation.
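The steps above can be sketched end to end in Python. This is an illustrative reimplementation under simplifying assumptions (euclidean distance, equal distance weights, uniform priors), not the toolbox code:

```python
from collections import Counter

def knn_predict(X, y, xnew, k, cost):
    """Sketch of the procedure above: find the k nearest training points,
    estimate class probabilities from their labels, and return the class
    with the smallest expected misclassification cost."""
    # Step 1: squared euclidean distance from xnew to every training row
    d = [sum((a - b) ** 2 for a, b in zip(row, xnew)) for row in X]
    nearest = sorted(range(len(X)), key=d.__getitem__)[:k]
    # Step 2: labels of the nearest points -> posterior estimates
    classes = sorted(set(y))
    counts = Counter(y[i] for i in nearest)
    post = [counts[c] / k for c in classes]
    # Step 3: class minimizing the expected cost sum_i post[i] * cost[i][j]
    expected = [sum(post[i] * cost[i][j] for i in range(len(classes)))
                for j in range(len(classes))]
    return classes[min(range(len(classes)), key=expected.__getitem__)]

X = [(0.0, 0.0), (0.1, 0.1), (1.0, 1.0), (1.1, 0.9)]
y = ['a', 'a', 'b', 'b']
zero_one = [[0, 1], [1, 0]]  # default-style 0-1 cost matrix
print(knn_predict(X, y, (0.2, 0.0), 3, zero_one))  # -> 'a'
```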
ClassificationKNN is a value class. To learn how value classes affect copy operations, see Copying Objects in the MATLAB® documentation.
knnsearch finds the k-nearest neighbors of points, and rangesearch finds all the points within a fixed distance. You can use these functions for classification, as shown in Classifying Query Data Using knnsearch. If you want to perform classification, ClassificationKNN can be more convenient: you construct the classifier in one step and classify new data in subsequent steps. ClassificationKNN also has cross-validation options.
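The difference between the two search primitives is easy to show with a brute-force Python sketch (illustrative; the real functions use optimized search structures such as kd-trees):

```python
def knnsearch(X, q, k):
    """Indices of the k points in X nearest to query q (euclidean),
    nearest first."""
    d = [sum((a - b) ** 2 for a, b in zip(row, q)) for row in X]
    return sorted(range(len(X)), key=d.__getitem__)[:k]

def rangesearch(X, q, r):
    """Indices of every point in X within distance r of query q."""
    r2 = r * r
    return [i for i, row in enumerate(X)
            if sum((a - b) ** 2 for a, b in zip(row, q)) <= r2]

X = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
print(knnsearch(X, (0.0, 0.0), 2))      # -> [0, 1]  (always k results)
print(rangesearch(X, (0.0, 0.0), 1.5))  # -> [0, 1]  (as many as fall in range)
```

knnsearch always returns exactly k indices, while rangesearch returns however many points fall inside the radius, which is why the two suit different problems.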