rangesearch

Find all neighbors within specified distance

Syntax

idx = rangesearch(X,Y,r)
[idx,D]= rangesearch(X,Y,r)
[idx,D]= rangesearch(X,Y,r,Name,Value)

Description

idx = rangesearch(X,Y,r) finds all the X points that are within distance r of the Y points. Rows of X and Y correspond to observations, and columns correspond to variables.

[idx,D]= rangesearch(X,Y,r) returns the distances between each row of Y and the rows of X that are r or less distant.

[idx,D]= rangesearch(X,Y,r,Name,Value) finds nearby points with additional options specified by one or more Name,Value pair arguments.

Input Arguments

X

mx-by-n numeric matrix, where each row represents one n-dimensional point. The number of columns n must equal as the number of columns in Y.

Y

my-by-n numeric matrix, where each row represents one n-dimensional point. The number of columns n must equal as the number of columns in X.

r

Search radius, a scalar. rangesearch finds all X points (rows) that are within distance r of each Y point. The meaning of distance depends on the Distance name-value pair.

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

'BucketSize'

Maximum number of data points in the leaf node of the kd-tree. This argument is only meaningful when kd-tree is used for finding nearest neighbors.

Default: 50

'Cov'

Positive definite matrix indicating the covariance matrix when computing the Mahalanobis distance. This argument is only valid when the Distance name-value pair argument is 'mahalanobis'.

Default: nancov(X)

'Distance'

String or function handle specifying the distance metric.

ValueDescription
'euclidean'Euclidean distance.
'seuclidean'Standardized Euclidean distance. Each coordinate difference between X and a query point is scaled, meaning divided by a scale value S. The default value of S is the standard deviation computed from X, S = nanstd(X). To specify another value for S, use the Scale name-value pair argument.
'mahalanobis'Mahalanobis distance, computed using a positive definite covariance matrix C. The default value of C is the sample covariance matrix of X, as computed by nancov(X). To specify a different value for C, use the 'Cov' name-value pair argument.
'cityblock'City block distance.
'minkowski'Minkowski distance. The default exponent is 2. To specify a different exponent, use the 'P' name-value pair argument.
'chebychev'Chebychev distance (maximum coordinate difference).
'cosine'One minus the cosine of the included angle between observations (treated as vectors).
'correlation'One minus the sample linear correlation between observations (treated as sequences of values).
'hamming'Hamming distance, the percentage of coordinates that differ.
'jaccard'One minus the Jaccard coefficient, the percentage of nonzero coordinates that differ.
'spearman'One minus the sample Spearman's rank correlation between observations (treated as sequences of values).
@distfunDistance function handle. distfun has the form
function D2 = DISTFUN(ZI,ZJ)
...
where
  • ZI is a 1-by-N vector containing one row of X or Y.

  • ZJ is an M2-by-N matrix containing multiple rows of X or Y.

  • D2 is an M2-by-1 vector of distances, and D2(k) is the distance between the observations ZI and ZJ(J,:).

For definitions, see Distance Metrics.

Default: NS.Distance

'NSMethod'

Nearest neighbors search method.

ValueMeaning
'kdtree'Creates and uses a kd-tree to find nearest neighbors. 'kdtree' is only valid when the distance metric is one of:
  • 'chebyshev'

  • 'cityblock'

  • 'euclidean'

  • 'minkowski'

'exhaustive'Uses the exhaustive search algorithm. The distances from all X points to each Y point are computed to find nearest neighbors.

Default: 'kdtree' when the number of columns of X is not greater than 10, X is not sparse, and the distance metric is one of the valid 'kdtree' metrics. Otherwise, the default is 'exhaustive'.

'P'

Positive scalar indicating the exponent of Minkowski distance. This argument is only valid when the Distance name-value pair argument is 'minkowski'.

Default: 2

'Scale'

Vector S containing nonnegative values, with length equal to the number of columns in X. Each coordinate difference between X and a query point is scaled by the corresponding element of S. This argument is only valid when the Distance name-value pair argument is 'seuclidean'.

Default: nanstd(X)

Output Arguments

idx

my-by-1 cell array, where my is the number of rows in Y. idx{I} contains the indices of points (rows) in X whose distances to Y(I,:) are not greater than r. The entries in idx{I} are in ascending order of distance.

D

my-by-1 cell array, where my is the number of rows in Y. D{I} contains the distance values between Y(I,:) and the corresponding points in idx{I}.

Examples

Find the X points that are within a Euclidean distance 1.5 of each Y point. Both X and Y are samples of 5-D normally distributed variables.

rng('default') % for reproducibility
X = randn(100,5);
Y = randn(10,5);
[idx, dist] = rangesearch(X,Y,1.5)

idx = 
    [1x7  double]
    [1x2  double]
    [1x11 double]
    [1x2  double]
    [1x12 double]
    [1x9  double]
    [         89]
    [1x0  double]
    [1x0  double]
    [1x0  double]

dist = 
    [1x7  double]
    [1x2  double]
    [1x11 double]
    [1x2  double]
    [1x12 double]
    [1x9  double]
    [     1.1739]
    [1x0  double]
    [1x0  double]
    [1x0  double]

In this case, the last three Y points are more than 1.5 distant from any X point. X(89,:) is 1.1739 distant from Y(7,:), and there is no other X point that is within distance 1.5 of Y(7,:). There are 12 points in X within distance 1.5 of Y(5,:).

Alternatives

rangesearch is the ExhaustiveSearcher method for distance search. It is equivalent to the rangesearch function with the NSMethod name-value pair set to 'exhaustive'.

rangesearch is the KDTreeSearcher method for distance search. It is equivalent to the rangesearch function with the NSMethod name-value pair set to 'kdtree'.

More About

expand all

Distance Metrics

For definitions, see Distance Metrics.

Tips

  • For a fixed positive integer K, knnsearch finds K points in X that are nearest each Y point. In contrast, for a fixed positive real value r, rangesearch finds all the X points that are within a distance r of each Y point.

Algorithms

For an overview of the kd-tree algorithm, see k-Nearest Neighbor Search Using a kd-Tree.

The exhaustive search algorithm finds the distance of each point in X to each point in Y.

Was this topic helpful?