Code covered by the BSD License

5.0 | 9 ratings | 84 Downloads (last 30 days) | File Size: 4.99 KB | File ID: #24504

# FFT-based convolution

21 Jun 2009 (Updated 16 Sep 2009)

Discrete convolution using FFT method

File Information
Description

Unlike MATLAB's CONV, CONV2, and CONVN, which are implemented as straightforward sliding sums, CONVNFFT uses the Fourier transform (FT) convolution theorem: the FT of a convolution equals the product of the FTs of the input functions.

In 1-D, the complexity is O((na+nb)*log(na+nb)), where na and nb are the lengths of A and B, respectively.
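The convolution theorem above can be illustrated with a minimal 1-D sketch that uses only built-in functions. This is not the CONVNFFT source; the input lengths are arbitrary assumptions chosen for illustration.

```matlab
% Sketch: 1-D linear convolution via the convolution theorem.
a = rand(1, 1500);               % first input, length na (assumed size)
b = rand(1, 1200);               % second input, length nb (assumed size)
n = numel(a) + numel(b) - 1;     % length of the full linear convolution

% Zero-pad both inputs to length n so that the circular convolution
% computed through the FFT equals the linear convolution.
c_fft = ifft(fft(a, n) .* fft(b, n), 'symmetric');

% Compare against the built-in sliding-sum implementation.
c_ref  = conv(a, b);
relErr = max(abs(c_fft - c_ref)) / max(abs(c_ref));
fprintf('relative error vs conv: %.3g\n', relErr);
```

The `'symmetric'` flag tells `ifft` to treat its input as conjugate symmetric, so the output is real for real inputs.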

Optional arguments control the dimension(s) along which the convolution is carried out.

The result is slightly less accurate than that of sliding-sum convolution.

Usage recommendations:
In 1D, this function is faster than CONV for nA, nB > 1000.
In 2D, this function is faster than CONV2 for nA, nB > 20.
In 3D, this function is faster than CONVN for nA, nB > 5.
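Because the crossover sizes depend on the machine and the MATLAB release, it may be worth timing both approaches on your own inputs. The sketch below does this for the 2-D case with built-in functions only (`fft2` stands in for CONVNFFT here, which is an assumption made to keep the example self-contained); the 128-by-128 sizes are arbitrary.

```matlab
% Sketch: time direct 2-D convolution against an FFT-based equivalent.
A  = rand(128, 128);             % assumed test sizes
B  = rand(128, 128);
sz = size(A) + size(B) - 1;      % size of the full 2-D convolution

tic;
C_direct = conv2(A, B);
t_direct = toc;

tic;
C_fft = ifft2(fft2(A, sz(1), sz(2)) .* fft2(B, sz(1), sz(2)), 'symmetric');
t_fft = toc;

relErr = max(abs(C_direct(:) - C_fft(:))) / max(abs(C_direct(:)));
fprintf('conv2: %.4f s, FFT: %.4f s, relative error: %.3g\n', ...
        t_direct, t_fft, relErr);
```

A single `tic`/`toc` measurement is noisy; repeating each call a few times and taking the median gives a steadier comparison.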

Acknowledgements

This file inspired Matching Pursuit For 1 D Signals and Conv2fft Reuse.

MATLAB release MATLAB 7.8 (R2009a)
Comments and Ratings (18)
21 May 2013

Bruno, sorry, I forgot to mention that I used 2012b. Moreover, the speed difference might depend on the inputs, and perhaps I'm just lucky with the inputs I use?

16 May 2013

Petr, I just tested with 2012a; the recommendation stands.

15 May 2013

Hi Bruno, are the usage recommendations updated for new MATLAB releases? In my 2-D application, both nA and nB are greater than 20 (about 200*300 each). However, MATLAB conv2 takes almost the same time as convnfft (it is even slightly faster).

19 Feb 2013

Thanks Bruno.

19 Feb 2013

The GPU acceleration depends on the user's hardware. It is impossible to give a reliable number without knowing your computer setup. A test I ran long ago showed an acceleration of 3-5 times.

19 Feb 2013

Sorry, Luc -> Luke

19 Feb 2013

Excellent work Bruno. Many thanks.
Quick question: how much faster would it be with GPU Jacket enabled?
I tried GPUmat, using fft2 to program similar code, but it turned out to be slower.
I am thinking of getting the MATLAB Parallel Computing Toolbox to run on the GPU if it is a lot faster.
I hope it is.
Could you please give me some idea? Say
A = rand(1000,1000);
B = rand(1000,1000);
tic;C=convnfft(A, B, 'same', [1, 2],'false');toc
gives me 0.213153 seconds without GPU

tic;C=convnfft(A, B, 'same', [1, 2],'true');toc
time???
Thanks

05 Sep 2012

Hello again. Apparently the MATLAB filter2 function is responsible: "Given a matrix X and a two-dimensional FIR filter h, filter2 rotates your filter matrix 180 degrees to create a convolution kernel."
Why is the rotation needed? Heck knows...
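One way to read the quoted documentation: filter2 computes a correlation, and since convolution flips the kernel, rotating it by 180 degrees beforehand cancels that flip. The sketch below checks this relationship with built-in functions; the test matrix and kernel are arbitrary assumptions.

```matlab
% Sketch: filter2 (correlation) versus conv2 (convolution).
X = magic(6);                    % arbitrary test image
h = [1 2 3; 4 5 6; 7 8 9];       % arbitrary asymmetric kernel

Y_filt = filter2(h, X);                    % correlation, same size as X
Y_rot  = conv2(X, rot90(h, 2), 'same');    % convolution with rotated kernel
Y_conv = conv2(X, h, 'same');              % plain convolution

diffRot  = max(abs(Y_filt(:) - Y_rot(:)));   % expected to be (near) zero
diffConv = max(abs(Y_filt(:) - Y_conv(:)));  % nonzero for an asymmetric kernel
fprintf('filter2 vs rotated conv2: %.3g\n', diffRot);
fprintf('filter2 vs plain conv2:   %.3g\n', diffConv);
```

To reproduce a filter2 result with a convolution routine, the same `rot90(h, 2)` rotation would be applied to the kernel before the call.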

02 Sep 2012

Hi Bruno. Excellent contribution and elegant code.
A few small comments:
1) The speed-up is smaller than the one you state, as conv is optimized by both MATLAB and the CPU vendor (Intel in my case).
2) The convolution shift in your code (and, by the way, in mine) is different from the one resulting from the filter2 function. Try running the following code:

filt=zeros(50);
filt(1,1)=1;

convFFT=convnfft(img, filt, 'same', [1, 2]);
regConv=filter2(filt, img);

figure;
subplot(1, 2, 1);
imshow(uint8(convFFT));
title('FFT based convolution');
subplot(1, 2, 2);
imshow(uint8(regConv));
title('Matlab regular convolution');

16 Jul 2012
21 Apr 2012

Moreover, when I checked the result from this code, there were some differences between it and convn for two 3-D matrices. The two input matrices are both positive.
The result from this code has negative values, while the result from convn does not.
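A plausible explanation, offered here as an assumption rather than a diagnosis of this specific case, is floating-point round-off: the FFT result carries an error that scales with the largest output value, so entries whose exact value is tiny can come out slightly negative. The sketch below uses built-in `fftn`/`ifftn` in place of convnfft and arbitrary sizes.

```matlab
% Sketch: round-off in FFT-based convolution of nonnegative inputs.
A = rand(8, 9, 10);              % arbitrary positive inputs
B = rand(5, 4, 3);
A(1, 1, 1) = 1e-12;              % make one exact output value tiny
B(1, 1, 1) = 1e-12;

sz    = size(A) + size(B) - 1;   % size of the full 3-D convolution
C_fft = ifftn(fftn(A, sz) .* fftn(B, sz), 'symmetric');
C_ref = convn(A, B);

maxDiff = max(abs(C_fft(:) - C_ref(:)));
fprintf('max abs difference: %.3g\n', maxDiff);
fprintf('smallest FFT-based value: %.3g\n', min(C_fft(:)));

% One possible cleanup when the exact result is known to be nonnegative.
C_clean = max(C_fft, 0);
```

Whether the smallest value is actually negative depends on the inputs and the FFT implementation, so the script prints it instead of assuming a sign.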

18 Mar 2012

Hi Michael,
Maybe the in-place product is no longer necessary for recent MATLAB releases.

I remember implementing it following a user request.

Thanks for the useful feedback.

16 Mar 2012

Hi Bruno, are you sure your inplaceprod() is (still) useful?

I'm pretty sure MATLAB does A=A.*B in-place itself. I just compared my memory usage for very large A/B with both methods and there was no difference.
This post is from 2007: http://blogs.mathworks.com/loren/2007/03/22/in-place-operations-on-data/

In a heterogeneous environment, it is useful to avoid mex code for such small tasks. If you are a non-privileged user on a compute server it is really a mess when compiling fails due to compiler version, libraries or whatever.

A minor note: (i)fftn is faster than for-loops around 1-D (i)fft calls, at least as long as the input and output are of the same size. So I got at least a small speed gain by replacing
for dim=dims
A = ifft(A,[],dim);
end
with
A = ifftn( A );
for the MATLAB ifft case.
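The equivalence behind this replacement can be checked with a short sketch. It assumes the loop runs over every dimension of the array (the array size is arbitrary); when only a subset of dimensions is transformed, `ifftn` is not a drop-in substitute.

```matlab
% Sketch: a loop of 1-D inverse FFTs over all dimensions versus ifftn.
A = fftn(rand(16, 12, 10));      % arbitrary frequency-domain array

A_loop = A;
for dim = 1:ndims(A)             % assumes dims covers every dimension
    A_loop = ifft(A_loop, [], dim);
end

A_n = ifftn(A);

maxDiff = max(abs(A_loop(:) - A_n(:)));
fprintf('max abs difference: %.3g\n', maxDiff);
```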

Thank you for the code!

12 Feb 2012

I think this says it all...

>> tic;C = convn(Vs,Vs);toc;
Elapsed time is 473.103412 seconds.
>> tic;C2 = convnfft(Vs,Vs);toc;
Elapsed time is 1.351315 seconds.
>> max(max(max(abs(C-C2))))
ans =
5.2208e-15

Thanks so much for this!

09 Aug 2011
06 Mar 2011

Well written (IMHO).

02 Nov 2010

Awesome function! My code runs 60x faster now (thanks to your GPU support).

23 Jul 2010
23 Jun 2009

correct bug when ndims(A)<ndims(B)

02 Sep 2009

GPU/Jacket capable

03 Sep 2009

GPU disabled by default + changes in help section

16 Sep 2009

New option to disable padding to the next power of two. A MEX implementation of the in-place product saves about 1/3 of the memory. These two enhancements might be useful when performing convolution with very large arrays.
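The padding trade-off mentioned above can be sketched as follows, using built-in functions only. The input lengths are arbitrary assumptions, and this is not the CONVNFFT code path itself.

```matlab
% Sketch: padding the FFT length to the next power of two.
a = rand(1, 1000);               % assumed input lengths
b = rand(1, 700);

n    = numel(a) + numel(b) - 1;  % minimal FFT length: 1699
nPad = 2^nextpow2(n);            % next power of two: 2048

% Padded transform: larger arrays, but a power-of-two FFT length.
c_pad = ifft(fft(a, nPad) .* fft(b, nPad), 'symmetric');
c_pad = c_pad(1:n);              % discard the extra padding

% Unpadded transform: smaller arrays, arbitrary FFT length.
c_min = ifft(fft(a, n) .* fft(b, n), 'symmetric');

maxDiff = max(abs(c_pad - c_min));
fprintf('lengths: %d (minimal), %d (padded); max diff: %.3g\n', ...
        n, nPad, maxDiff);
```

For very large arrays the padded length can approach twice the minimal one per dimension, which is where skipping the padding saves memory.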