Excellent work, Bruno. Many thanks.
Quick question: how much faster would it be with the GPU (Jacket) option enabled?
I tried writing similar code with GPUmat using fft2, but it turned out to be slower.
I am thinking of getting the MATLAB Parallel Computing Toolbox to run this on the GPU if it is a lot faster.
I hope it is.
Could you please give me a rough idea? For example:
A = rand(1000,1000);
B = rand(1000,1000);
tic; C = convnfft(A, B, 'same', [1, 2], false); toc
gives me 0.213153 seconds without the GPU.