Link two lists of points based on nearest neighbors.
NEARESTNEIGHBORLINKER links two lists of points based on nearest neighbors.
target_indices = NEARESTNEIGHBORLINKER(source, target) finds for each
point in 'source' the closest point in 'target'. These two inputs must be
arrays with one point per row and one Cartesian coordinate per column
(1D, 2D, 3D, ...). Nearest-neighbor matching is based on Euclidean
distance. The two arrays need not have the same number of points.
The indices of the 'target' points are returned in an array
'target_indices', so that each row in 'source' matches the corresponding
row in 'target(target_indices, :)'.
The linking is exclusive: each source point is linked to at most one
target point, and vice versa. The linking is only locally optimal: the
closest pair among the two sets is linked first, then the second-closest
pair excluding the first, and so on. This ensures that the resulting
linking does not depend on the order of the points in each set.
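The greedy pairing described above can be sketched as follows. This is a simplified illustration under assumed variable names ('source', 'target', 'D'), not the function's actual code:

```matlab
% A minimal sketch of greedy, order-independent nearest-neighbor linking.
n_source = size(source, 1);
n_target = size(target, 1);

% Squared pairwise distances; implicit expansion needs R2016b or later
% (use bsxfun on older releases).
D = zeros(n_source, n_target);
for k = 1 : size(source, 2)
    D = D + (source(:, k) - target(:, k)') .^ 2;
end

target_indices = -ones(n_source, 1);
for i = 1 : min(n_source, n_target)
    % Take the globally smallest remaining distance ...
    [min_val, idx] = min(D(:));
    if isinf(min_val)
        break  % every remaining pair has been excluded
    end
    [s, t] = ind2sub(size(D), idx);
    target_indices(s) = t;
    % ... then exclude this source point and this target point.
    D(s, :) = Inf;
    D(:, t) = Inf;
end
```

Because each iteration picks the global minimum over all remaining pairs, shuffling the rows of 'source' or 'target' cannot change which pairs get linked.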
target_indices = NEARESTNEIGHBORLINKER(source, target, max_distance) adds
a condition on distance. If the nearest neighbor is found at a distance
larger than the given 'max_distance', the two points are not linked, and
'target_indices' receives the value -1 for this source point. The same
happens when all target points are exhausted.
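For instance, the threshold might be used as follows (a hypothetical snippet, with 'source' and 'target' set up as in the EXAMPLE below and an arbitrary threshold value):

```matlab
max_distance = 0.5;
target_indices = nearestneighborlinker(source, target, max_distance);

% Source points with no target within max_distance receive index -1:
unlinked_sources = find(target_indices == -1);
```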
[ target_indices target_distances ] =
NEARESTNEIGHBORLINKER(source, target)
additionally returns the distance to the matched target point. Unmatched
source points have their distance set to NaN.
[ target_indices target_distances unmatched_targets ] =
NEARESTNEIGHBORLINKER(source, target)
additionally returns the indices of the points in 'target' that have not
been linked.
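Putting the three outputs together, a call might look like this (a hypothetical snippet; 'max_distance' is optional):

```matlab
[ target_indices, target_distances, unmatched_targets ] = ...
    nearestneighborlinker(source, target, max_distance);

% target_distances(i) is NaN wherever target_indices(i) is -1, and
% unmatched_targets lists the rows of 'target' no source point claimed.
```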
This is the cheapest (in terms of accuracy) linking algorithm that can
be made. In particular, it is not guaranteed (and it is generally not the
case) that the returned linking is optimal for the sum of distances.
Each source point is matched greedily; there is no global optimization
here (the Hungarian algorithm does that). Also, there exist refinements
to nearest-neighbor searches, such as the use of KD-trees; this
contribution does not include such refinements.
EXAMPLE:

n_points = 20;
source = 10 * rand(n_points, 2);
target = source + rand(n_points, 2);
target_indices = nearestneighborlinker(source, target);
colors = hsv(n_points);
figure
hold on
for i = 1 : n_points
    plot(source(i,1), source(i,2), 'o', 'Color', colors(i,:))
    plot(target(target_indices(i),1), target(target_indices(i),2), 's', ...
        'Color', colors(i,:))
    plot([ source(i,1) target(target_indices(i),1) ], ...
        [ source(i,2) target(target_indices(i),2) ], ...
        'Color', colors(i,:))
end
Jean-Yves Tinevez <jeanyves.tinevez@gmail.com>
1.1 | Made it a bit less stupid. Links are now created starting with the smallest distance over the whole set, which makes the results independent of the order of the points. It also makes the code simpler.
Inspired: Hungarian based particle linking, Simple Tracker
bw
Thank you for this useful function.
However, I think there is a small bug. In the while loop where target indices are assigned to source indices, I replaced
...
    % No, then store this assignment
    target_indices( source_index ) = target_index;
    target_distances( source_index ) = sqrt( min_D( sorted_index(i) ) );
...
by
...
    if ~isinf( D(source_index, target_index) )
        % No, then store this assignment
        target_indices( source_index ) = target_index;
        target_distances( source_index ) = sqrt( min_D( sorted_index(i) ) );
    end
...
Without the if expression, I sometimes run into trouble when I specify max_distance: when a source point has no neighbor within max_distance, all its distances are set to Inf, so the min function returns 1 as the closest target. This caused errors in certain situations. The replacement above solved the problem for me.