IFFT slowdown when using gpuArray
Two data sets, A (a 4096 x 1024 matrix) and B (a 32768 x 1024 matrix), have been transferred to the GPU using gpuArray. A is passed into the FFT function and shows a significant speed increase compared with the same operation on the CPU. B is passed into the IFFT function and shows approximately a 50% decrease in efficiency compared with the same operation on the CPU. Is there a reason why the IFFT function does not show a speed increase proportional to that of the FFT function? I understand that the sizes differ, but I do not understand why the GPU IFFT is slower than the CPU IFFT. The results were timed with both tic/toc and Run and Time. Thank you for your help.
More Answers (1)
James Lebak on 3 May 2013
Edited: James Lebak on 4 May 2013
The GeForce GT630M is a mobile graphics card. Frequently, these cards don't perform as well in double precision as they do in single precision. If your application can handle single precision, you can try the IFFT in single and see if that gives you better performance. If you need double-precision performance, you might want to try a different card.
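A rough sketch of that comparison is below. The random input and the names nRows, nCols, Bd and Bs are stand-ins chosen for illustration, not taken from the original code. GPU calls can return before the work has finished, so the sketch calls wait on the device before reading toc; in newer releases gputimeit can be used instead.

```matlab
% Sketch: time ifft on the GPU in double and then in single precision.
% The random data below is only a stand-in for matrix B from the question.
nRows = 32768;
nCols = 1024;   % reduce this if the card runs out of memory

gpu = gpuDevice;

% Double-precision complex input, created on the CPU and moved to the GPU.
Bd = gpuArray(complex(randn(nRows, nCols), randn(nRows, nCols)));

Yd = ifft(Bd); wait(gpu);          % warm-up run, not timed
tic;
Yd = ifft(Bd);                     % ifft operates along each column
wait(gpu);                         % block until the GPU has finished
tDouble = toc;

% Convert to single on the GPU, then free the double-precision arrays.
Bs = single(Bd);
clear Bd Yd

Ys = ifft(Bs); wait(gpu);          % warm-up run, not timed
tic;
Ys = ifft(Bs);
wait(gpu);
tSingle = toc;

fprintf('double: %.4f s, single: %.4f s\n', tDouble, tSingle);
```

If the single-precision time is much lower, the card's double-precision throughput is the likely bottleneck.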
This especially applies if the card in question has compute capability 3.0. You can find out the compute capability of the card in MATLAB from the structure returned by 'gpuDevice'.
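For example, a minimal sketch of that query, assuming Parallel Computing Toolbox and a supported GPU are available:

```matlab
% Query the currently selected GPU and print a few of its properties.
gpu = gpuDevice;

fprintf('Device:             %s\n', gpu.Name);
fprintf('Compute capability: %s\n', gpu.ComputeCapability);
fprintf('Total memory (MB):  %.0f\n', gpu.TotalMemory / 2^20);
```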
Edit: removed incorrect identification of the 630M.