ExhaustiveSearcher class

Superclasses: NeighborSearcher

Nearest neighbors search using exhaustive search

Description

An ExhaustiveSearcher object performs kNN (k-nearest neighbor) search using exhaustive search. Search objects store information about the training data, the distance metric, and its parameters. Compared with a KDTreeSearcher object, search performance for this object tends to be better in higher dimensions (10 or more) and worse in lower dimensions. For details on search objects, see What Are Search Objects?.

Construction

NS = ExhaustiveSearcher(X,'Name',Value) constructs an ExhaustiveSearcher object based on X, where rows of X correspond to observations and columns correspond to variables, using one or more optional name-value pair arguments. You can then use this object to find neighbors in X nearest to the query points.

NS = createns(X,'NSMethod','exhaustive','Name',Value) creates an ExhaustiveSearcher object based on X using createns, where rows of X correspond to observations and columns correspond to variables, using one or more optional name-value pair arguments. You can use this object to find neighbors in X nearest to the query points.

Name-Value Pair Arguments

The ExhaustiveSearcher constructor and the createns function accept one or more of the following optional name-value pair arguments as input.

'Distance'

A string or function handle specifying the default distance metric used when you call the knnsearch method.

  • 'euclidean' — Euclidean distance (default).

  • 'seuclidean' — Standardized Euclidean distance. Each coordinate difference between rows in X and the query matrix is scaled by dividing by the corresponding element of the standard deviation computed from X, S=nanstd(X). To specify another value for S, use the Scale argument.

  • 'cityblock' — City block distance.

  • 'chebychev' — Chebychev distance (maximum coordinate difference).

  • 'minkowski' — Minkowski distance.

  • 'mahalanobis' — Mahalanobis distance, computed using a positive definite covariance matrix C. The default value of C is nancov(X). To change the value of C, use the Cov parameter.

  • 'cosine' — One minus the cosine of the included angle between observations (treated as vectors).

  • 'correlation' — One minus the sample linear correlation between observations (treated as sequences of values).

  • 'spearman' — One minus the sample Spearman's rank correlation between observations (treated as sequences of values).

  • 'hamming' — Hamming distance, percentage of coordinates that differ.

  • 'jaccard' — One minus the Jaccard coefficient, the percentage of nonzero coordinates that differ.

  • custom distance function — A distance function specified using @ (for example, @distfun). A custom distance function must

    • Have the form function D2 = distfun(ZI, ZJ)

    • Take as arguments:

      • A 1-by-n vector ZI containing a single row from X or from the query points Y

      • An m2-by-n matrix ZJ containing multiple rows of X or Y

    • Return an m2-by-1 vector of distances D2, whose jth element is the distance between the observations ZI and ZJ(j,:)

For details on these distance metrics, see Distance Metrics.
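As a concrete illustration of the custom distance function requirements above, the following is a minimal sketch of a weighted Euclidean distance. The function name distfun and the weight vector w are hypothetical choices for this example; the required signature and input/output shapes are as documented above.

```matlab
function D2 = distfun(ZI, ZJ)
% DISTFUN Weighted Euclidean distance (illustrative example).
% Save this in a file named distfun.m on the MATLAB path.
%   ZI is a 1-by-n vector (one observation or query point).
%   ZJ is an m2-by-n matrix of observations.
%   D2 is an m2-by-1 vector; D2(j) is the distance from ZI to ZJ(j,:).
w = [2 1];                       % hypothetical per-coordinate weights (n = 2)
d = bsxfun(@minus, ZJ, ZI);      % m2-by-n coordinate differences
D2 = sqrt(d.^2 * w(:));          % weighted sum of squares, then square root
end
```

You could then pass the handle at construction time, for example NS = ExhaustiveSearcher(X,'Distance',@distfun).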

'P'

A positive scalar indicating the exponent of the Minkowski distance. This parameter is only valid when Distance is 'minkowski'. Default is 2.

'Cov'

A positive definite matrix indicating the covariance matrix when computing the Mahalanobis distance. This parameter is only valid when Distance is 'mahalanobis'. Default is nancov(X).

'Scale'

A vector S with the length equal to the number of columns in X. Each coordinate of X and each query point is scaled by the corresponding element of S when computing the standardized Euclidean distance. This parameter is only valid when Distance is 'seuclidean'. Default is nanstd(X).
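For example, the following sketch constructs a searcher that uses the standardized Euclidean distance with an explicit scale vector in place of the nanstd(X) default. The scale values here are hypothetical; S must have one element per column of X.

```matlab
load fisheriris
X = meas(:,[3 4]);                % two predictor columns
S = [0.5 0.25];                   % hypothetical per-column scale values
NS = ExhaustiveSearcher(X,'Distance','seuclidean','Scale',S);
```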

Properties

X

A matrix used to create the object.

Distance

A string specifying a built-in distance metric or a function handle that you provide when you create the object. This property is the default distance metric used when you call the knnsearch method to find nearest neighbors for future query points.

DistParameter

Specifies the additional parameter for the chosen distance metric. The value is:

  • If 'Distance' is 'minkowski': A positive scalar indicating the exponent of the Minkowski distance.

  • If 'Distance' is 'mahalanobis': A positive definite matrix representing the covariance matrix used for computing the Mahalanobis distance.

  • If 'Distance' is 'seuclidean': A vector representing the scale value of the data when computing the 'seuclidean' distance.

  • Otherwise: Empty.
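A short sketch of how DistParameter varies with the chosen metric, using Fisher's iris data as above:

```matlab
load fisheriris
X = meas(:,[3 4]);
NSmink = ExhaustiveSearcher(X,'Distance','minkowski','P',3);
NSmink.DistParameter    % 3, the Minkowski exponent
NSeuc = ExhaustiveSearcher(X,'Distance','euclidean');
NSeuc.DistParameter     % empty, because euclidean takes no extra parameter
```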

Methods

knnsearch — Find k-nearest neighbors using ExhaustiveSearcher object
rangesearch — Find all neighbors within specified distance using ExhaustiveSearcher object

Examples


Train a k-Nearest Neighbors Searcher

You can create an ExhaustiveSearcher model in two ways.

Load Fisher's iris data. Focus on the petal dimensions.

load fisheriris
X = meas(:,[3 4]); % Predictors

Prepare an exhaustive, nearest neighbors searcher using ExhaustiveSearcher and the predictors. Use the Minkowski distance metric.

ESMdl1 = ExhaustiveSearcher(X,'Distance','minkowski')
ESMdl1 = 

  ExhaustiveSearcher with properties:

         Distance: 'minkowski'
    DistParameter: 2
                X: [150x2 double]

ESMdl1 is an ExhaustiveSearcher model. Access properties of ESMdl1 using dot notation. For example, use ESMdl1.DistParameter to access the Minkowski distance exponent.

Prepare an exhaustive, nearest neighbors searcher using createns and by specifying that the nearest neighbors search method is exhaustive.

ESMdl2 = createns(X,'NSMethod','exhaustive',...
   'Distance','minkowski')
ESMdl2 = 

  ExhaustiveSearcher with properties:

         Distance: 'minkowski'
    DistParameter: 2
                X: [150x2 double]

ESMdl2 is also an ExhaustiveSearcher model, and it is equivalent to ESMdl1.

You can pass new data and either of the models to knnsearch to find the points in X nearest to the query points, or to rangesearch to find all points in X within a specified distance of the query points.

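For example, a short sketch using ESMdl1 from above (the query points here are hypothetical petal measurements):

```matlab
Y = [5 1.45; 6 2];                    % hypothetical query points
[idx,d] = knnsearch(ESMdl1,Y,'K',2);  % row indices into X and Minkowski distances
```

Each row of idx lists the two nearest training observations to the corresponding row of Y.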
For more in-depth examples, see the knnsearch and rangesearch method reference pages.

References

[1] Friedman, J. H., J. L. Bentley, and R. A. Finkel. "An Algorithm for Finding Best Matches in Logarithmic Expected Time." ACM Transactions on Mathematical Software, Vol. 3, Issue 3, 1977, pp. 209–226.
