The discrete Frechet distance is a scalar measure of similarity between two curves.
The Frechet distance is a measure of similarity between two curves, P and Q. It is defined as the minimum cordlength sufficient to join a point traveling forward along P and one traveling forward along Q, although the rate of travel for either point may not necessarily be uniform.
This algorithm calculates a bounded approximation of the Frechet distance using sampled points along curves P and Q.
1.4  Fixed call to nargin that should have been nargout 

1.2  Added a new optional output which returns a valid coupling sequence at the suggestion of Lingji. 

1.1  Removed extraneous inner function inputs for efficiency. 
ZEJUN ZUO (view profile)
I need to use it calculate the similarity of lines.
Zachary Danziger (view profile)
@qinghui It is difficult to know what you are asking. I assume by “starting point” you mean the point listed first in the input vector. For all curves, closed or otherwise, the order of the points in the input determines its directionality from the algorithm’s perspective, and will constrain the algorithm to begin its search with the first points in the inputs for both curves. Even identical curves sampled at different points will yield a nonzero result, so, to obtain a zerodistance result, both curves must have the same ordering of points.
qinghui lu (view profile)
Zachary,
Thank you for your code.I have a question.For two closed curves,if they must have the same starting point??
Zachary Danziger (view profile)
@Xiao You are correct, the Frechet Distance will evaluate the two curves in your example as being equal. The erroneous result produced by this code stems from the fact that, for any two curves, it is in general not possible to compute the FD analytically. This algorithm computes the discrete FD, which is an approximation of the FD that operates on line segments rather than continuous functions. Your example is essentially trying to compute the FD of severely undersampled curves, which will result in high error. You should sample the curves as frequently as possible to reduce error, which will solve your problem. It is not necessary for both curves to have equal sampling rates, especially if they are both sampled at sufficiently high rates to begin with.
Xiao Wang (view profile)
Hello. I have a question. Let's say two lines.a=1:5; b=[1 2 4 5];
P=[a' a'] and Q =[b' b']. Actually these two are the same line right. However, if I calculate the Frechet distance, I will have sqrt(2) but not zero. Should this be right or not? And if this is right, when I calculate Frechet distance of two curves with different number of points, should I interpolate to make them have same number of points? Thanks
Ronan CIMADURE (view profile)
Zachary Danziger (view profile)
Leyon,
The Frechet Distance, which this code approximates, is only defined for 2 curves. However, I am sure that some one cleverer than I could think of a way...
Leyon (view profile)
Is it possible to use this script to sample more than 2 curves?
Zachary Danziger (view profile)
Arthur,
If you are just looking for the standard Euclidean distance you can leave off the dfcn argument and it will be used by default.
Otherwise it expects a function handle. The 'ordinary' distance function is passed as:
dfcn = @(p,q) sqrt(sum( (pq).^2 ));
But any function will work that operates on points p and q that respects their dimensionality. A Chebyshevlike measure could be used, for example:
dcheb = @(p,q) max(abs(pq))
Arthur Allen (view profile)
Zach and Lingi; I would like to compare two drifter tracks (lat, long pairs). I tried using 'distance' from the Mapping Toolbox, but this dosn't work. Can you provide an example of dfcn function that does work? Thanks,
Zachary Danziger (view profile)
Thanks for all your help Lingji, it's certainly a better tool now.
Lingji Chen (view profile)
Zachary,
I hope you don't mind my posting this modified version; if you like it, feel free to use it to update your package or do whatever you see fit.
function [dFrechet, coupling] = fcn_discreteFrechetDistance(P, Q, dist_fcn)
% P and Q: two curves represented by matrices of size dim by number
% dist_fcn: optional distance function handle; default to Eucleandian
[n1, p] = size(P);
[n2, q] = size(Q);
if (n1 ~= n2)
error('P and Q mismatch');
end
switch nargin
case 2
dist_fcn = @(x, y) sqrt(sum((x  y).^2));
case 3
otherwise
error('wrong number of arguments');
end
% This is worked on by the inner function
CA = ones(p, q);
% This is also worked on by the inner functions.
% Each cell stores four numbers [[u_last; v_last], [u_new; v_new]].
% The first column is an index into the cell array, and the second column
% is the newly added edge
coupling_cell = cell(p, q);
get_c(p, q);
dFrechet = CA(p, q);
coupling = get_coupling(p, q);
function coupling = get_coupling(i, j)
coupling = [];
uv_this = [i; j];
while(1)
tmp = coupling_cell{uv_this(1), uv_this(2)};
coupling = [tmp(:, 2) coupling];
uv_last = tmp(:, 1);
if (uv_last(1) == 1 )
break;
else
uv_this = uv_last;
end
end
end
function get_c(i, j)
if (CA(i, j) > 1) % value already available
return;
end
d = dist_fcn(P(:, i), Q(:, j));
if (i == 1 && j == 1)
CA(i, j) = d;
coupling_cell{i, j} = [1 i; 1 j];
elseif (i > 1 && j == 1)
get_c(i  1, j);
CA(i, j) = max([CA(i  1, j), d]);
coupling_cell{i, j} = [i  1, i; j, j];
elseif (i == 1 && j > 1)
get_c(i, j  1);
CA(i, j) = max([CA(i, j  1), d]);
coupling_cell{i, j} = [i, i; j  1, j];
elseif (i > 1 && j > 1)
get_c(i  1, j  1);
get_c(i  1, j);
get_c(i, j  1);
[val, ind] = min([CA(i  1, j  1), CA(i  1, j), CA(i, j  1)]);
switch ind
case 1
coupling_cell{i, j} = [i  1, i; j  1, j];
case 2
coupling_cell{i, j} = [i  1, i; j, j];
case 3
coupling_cell{i, j} = [i, i; j  1, j];
end
CA(i, j) = max(val, d);
end
end
end
Zachary Danziger (view profile)
Lingji,
I cannot see a way to calculate the coupling distance during the recursive calls to c(i,j) without additional overhead. I feel that adding computational cost is not justified because, in general, the coupling sequence is not unique and therefore is not especially informative. The coupling sequence is just any allowable (i.e., follows the forward movement rules) sequence between points on P and Q that never has distance greater than cm. In the typical case there are a great many allowable sequences because interchanging many points with small distances does not affect cm if there is a much larger distance later in the sequence.
Here is my compromise: if the user requests the coupling sequence then I calculate the cm as usual, and at the end of the code I loop through the CA variable choosing one workable coupling sequence. I hope this pleases the fans.
Lingji Chen (view profile)
Zachary,
I'm not quite sure about your question, but let me try to answer anyway:
The "ending point" recursion for the chain should parallel the value recursion which stores the results in CA (without repeating the work already done). In this case you would need a second variable to store where the "last time" ending point was. I am saying "this time" and "last time" because the recursion goes backwards. For example, in the (i > 1 && j > 1) case, the ending point is (i,j), but depending on which of the 3 cases has the min value, the "last time" could be (i1, j1), (i1, j) or (i, j1).
Hope this helps.
Zachary Danziger (view profile)
Lingji,
Thanks for the suggestions. What is the best way to tell when the algorithm has moved to the next point in the chain ("the next time") without storing the data for every recursion?
Lingji Chen (view profile)
Zachary,
Here are some more suggestions:
1. You can add a second return value that gives the coupling sequence. More on this below.
2. You can move the distance function handle into the call list, as an optional third argument. This makes it more versatile without incurring extra cost.
3. To get the coupling sequence, you can add a cell array at the level of CA (by the way, persistence is not needed), with each cell recording two things: an index into the cell array that gives where the coupling sequence ends "last time," and the new coupling pair that is added "this time."
4. You can change the c() function to do two things: modifying CA, and modifying the cell array, during the ifthenelse branches. Return value is not needed.
5. After a call to c(p, q), you can get the distance from CA(p, q), and trace backwards the coupling sequence starting from the cell array at (p, q).
Hope this is clear enough.
Zachary Danziger (view profile)
Lingji,
I misunderstood your comment. Yes, the extra inputs to the c function are not needed. I have resubmitted the file with these removed for greater speed. Thank you for your suggestion.
Zachary Danziger (view profile)
Lingji,
Unless one uses global variables for P, Q, and dfcn, the internal coupling search function, c, will not see them in its workspace.
I hope I understand your question.
Lingji Chen (view profile)
Very good, thanks!
Since you use an inner function to implement c(i, j), maybe the parameters P, Q and dfcn can be omitted?
Jiacai Pan (view profile)
thank you for my sharing your file
Jiacai Pan (view profile)
very good, thank you