To compute the Euclidean separation (L2 norm) between two sets of points in MATLAB can be slow and/or memory-hungry. In some cases (most particularly, if you are working with 2D, 3D or 4D data), this function will do it 2-4 times faster than the fastest m-code I've seen (due to Germano Gomes) and hundreds of times faster than a typical memory-efficient nested loop.
NB: for D much greater than 10-15, performance is better using GG's m-script. See the screenshot for a performance plot - green is GG, blue is mex_sepsq.
A = randn(4, 5000);
B = randn(4, 5000);
C1 = mex_sepsq(A, B);
Elapsed time is 0.201335 seconds.
C2 = sepsq_gg(A, B);
Elapsed time is 0.517755 seconds.
Relative time per implementation: 1.00 2.57
oh, i've not tried to compile on Win 64-bit, that will probably go awry. to fix (i think), search/replace _MSC_VER with NOTDEF in mex_sepsq.cpp, and try the compile again. you'll get the C version instead of the assembler version (but it's not that much slower). i'll fix this if i ever come across a Win64 install.
When I try and compile this on a Win 7, 64-bit machine using c++ Visual Studio 2008, get the a long list of the following type of errors,
c:\users\adam\documents\matlab\mex_sepsq\mex_sepsq_asm.cpp(57) : error C4235: nonstandard extension used : '__asm' keyword not supported on this architecture
c:\users\adam\documents\matlab\mex_sepsq\mex_sepsq_asm.cpp(61) : error C2065: 'push' : undeclared identifier
c:\users\adam\documents\matlab\mex_sepsq\mex_sepsq_asm.cpp(61) : error C2146: syntax error : missing ';' before identifier 'eax'
c:\users\adam\documents\matlab\mex_sepsq\mex_sepsq_asm.cpp(61) : error C2065: 'eax' : undeclared identifier
Forgot to add c++ compiler is Visual Studio 2008.
Wow, what a great function "bsxfun" - . I'd never seen that before (it wasn't available in 2003 when this was written). With that now available, computing this (and other similar tasks) in m-script becomes a lot more efficient, you're right.
I had a play around, and mex_sepsq (the latest version) is a fair bit faster than m-script for D up to around 10 or 15, at which point Matlab's optimised and parallelised matrix libraries get the better of it. In any case, against "bsxfun", the speed-up factor is in 2-4 rather than 15-20, so it's less important. It still has better memory usage, though, using about a third less memory than the m-script implementation.
I've improved the code quite a lot in the process, so I'll upload the latest version - the new code is 2-3 times as fast as the 2003 version for large N, and outperforms the m-script implementation by a factor of 2-3 as well (i.e. the 2003 code performed similarly to your m-script implementation for large N).
I'll also try and include the plots of performance in my upload. It shows the domains where mex_sepsq is worth using. This domain, where mex_sepsq gives a considerable benefit, centres on the obvious 2<D<4 range, so I won't give up on it yet.
Thanks for your interesting comment :)
Thanks for your comment. There are at least a couple of reasons why I generally store matrices the DxN way, rather than NxD, but neither affects mex_sepsq. I could potentially provide a flag to mex_sepsq to work on data stored the other way round, if you were interested.
Great function, but its kinda unnatural to use D x N and D x M column vectors. I would argue that the dimensionality should form the columns, so N x D and M x D looks more appropriate.
I believe that in 3D something like this is almost as fast for number of points >1000, ratio is approximately =2.
function [Dist] = Distance3DFast(M1,M2)
% M1 is a 3xN matrix
% M2 is a 3xM matrix
%Dist is a NxM matrix
% M1=rand(3, 10000); M2=rand(3,10000);
Dist = sqrt(bsxfun(@plus,dot(M1,M1)',dot(M2,M2))-2*M1'*M2);
Very fast indeed, thanks for the great function.
It is really fast! Thanks.
whoops, broke the linux implementation - fixed, now.
Improved performance, added benchmarking.
Provided a demo and an auto-compile script.