Pairwise Haversine Distance Matrix...Speed Up?

Hello,
I have the following code which produces an N x N matrix of pairwise distances using the Haversine formula where the inputs are the latitude and longitude coordinates of a certain geography, e.g. counties. Here is the function:
function W = knn_weight_matrix(xc,yc,k)
%%PURPOSE: The function builds a k nearest neighbor N x N row-stochastic spatial weight matrix
% The function returns a sparse matrix W representing the spatial weight
% matrix
%%USAGE: W = knn_weight_matrix(xc,yc,k)
% where the following holds:
% xc = an (n x 1) vector of longitude coordinates (in degrees)
% yc = an (n x 1) vecotor of latitude coordinates (in degrees)
% k = the number of nearest neighbors
%%NOTES: The code uses the haversine distance formula to create an
% N x N matrix of pairwise diatances
%%Rearrange coordinates and pre-allocate matrices
coordinates = [yc xc];
n = length(xc);
tempw = zeros(n,n);
nnlist = zeros(n,k);
temp2 = zeros(n,n);
%%Create pairwise distance matrix using Haversine distance calculation
for i=1:n
for j=1:n
temp1 = haversine(coordinates(i,1),coordinates(i,2),coordinates(j,1),coordinates(j,2));
temp2(i,j) = temp1;
end
end
%%Find indices for the nearest neighbors
for i=1:n;
temp1 = temp2(i,:);
[~,xind] = sort(temp1);
nnlist(i,1:k) = xind(1,2:k+1);
end
%%Clear the distance matrix
clear temp2
%%Assign a value of 1 to nearest neighbors
for i=1:n
tempw(i,nnlist(i,:)) = 1;
end
%%Create row-stochastic and sparse spatial weight matrix
W = (tempw./k);
end
%%Haversine Function
% This function uses the Haversine formula to calculate the pairwise
% distances used in the knn_weight_matrix formula.
function [dist] = haversine(lat1,lon1,lat2,lon2)
dist = 2 * 6372.8 * asin(sqrt(sind((lat2-lat1)/2)^2 + cosd(lat1) * cosd(lat2) * sind((lon2 - lon1)/2)^2));
end
Is there a way to "bsxfun" or otherwise speed up the code? It runs pretty briskly with a small number of coordinates but it gets fairly slow when N>50,000 or greater.
Any help would be greatly appreciated and please use the code if you find it useful.

Answers (1)

Donald - before considering using something like bsxfun, why not try and optimize the code that you already have? For example, the following calculates the pairwise distances between two coordinates
%%Create pairwise distance matrix using Haversine distance calculation
for i=1:n
for j=1:n
temp1 = haversine(coordinates(i,1),coordinates(i,2),coordinates(j,1),coordinates(j,2));
temp2(i,j) = temp1;
end
end
It seems that you could reduce the above calculations in half because the haversine distance between coordinates i and j is the same as the distance between coordinates j and i. So you could do the following instead
temp2 = zeros(n,n);
for u=1:n
for v=r+1:n
temp1 = haversine(coordinates(u,1),coordinates(u,2),coordinates(v,1),coordinates(v,2));
temp2(u,v) = temp1;
temp2(v,u) = temp1;
end
end
Note that the above uses u and v instead of i and j as MATLAB uses the latter two as representations of the imaginary number.
You may also be able to simplify
%%Find indices for the nearest neighbors
for i=1:n;
temp1 = temp2(i,:);
[~,xind] = sort(temp1);
nnlist(i,1:k) = xind(1,2:k+1);
end
to something like
% sort all rows of temp2
[~,xind] = sort(temp2,2);
% extract first k columns of each (ignoring the first element which
% is the distance from a coordinate to itself)
nnlist = xind(:,2:k+1);
Start with the above and see if it helps improve performance.

1 Comment

Dear Geoff,
Thank you for the advice on my code. I tried this and the timing improves but I think that the issue is that there are a number of calls to the Haversine function which may be the key. I have code that calculates the pairwise distances using the standard Euclidean formula and it's much faster, something like this:
for i=1:n;
xi = xc(i,1);
yi = yc(i,1);
dist = (xc - xi*ones(n,1)).^2 + (yc - yi*ones(n,1)).^2;
[xds xind] = sort(dist);
nnlist(i,1:m) = xind(2:m+1,1)';
end;
The Haversine formula is preferred in many situations but this may be why the code is slower.
Thank you again for the help as it is much appreciated. I learned a lot from your example code!

Sign in to comment.

Asked:

on 30 Jan 2015

Commented:

on 3 Feb 2015

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!