Does invoking my kernel in MATLAB affect fft2 speed on the GPU?

Hi, I ran a test like the following in MATLAB. The code snippet is:
tic; fft2(img_d); toc % img_d is a gpuArray on GPU
.......
%Set the block and grid sizes of my kernel
........
% invoke my kernel
tic
img_d=feval(myKernelName, .......);
toc
% do fft2 on gpu again
tic; fft2(img_d); toc % img_d is a gpuArray on GPU
The result is strange:
The first fft2 on the GPU takes:
Elapsed time is 0.000557 seconds.
But after invoking my kernel, the second fft2 on the GPU takes:
Elapsed time is 0.074028 seconds.
If I add this line after the kernel invocation:
img = gather(img_d)
and then time the second fft2 on the GPU, the time looks right:
Elapsed time is 0.000392 seconds.
What is wrong with my kernel? Or is something wrong with the feval function?

Answers (1)

Jill Reese on 11 Apr 2012
In R2012a, all GPU calculations run asynchronously with MATLAB. Because of this, a new command was introduced to allow accurate code timing by synchronizing MATLAB and the GPU (see the R2012a release notes for more details). The form of the command is
wait(gpudev)
where gpudev is the object representing the GPU device to wait for. You will need to change your timings like so:
gpudev = gpuDevice();
tic;
f = fft2(img_d);
wait(gpudev);
toc
%Set the block and grid sizes of my kernel
% invoke my kernel
tic
img_d=feval(myKernelName, .......);
wait(gpudev);
toc
% do fft2 on gpu again
tic;
f = fft2(img_d);
wait(gpudev);
toc
The code above should result in more sensible timings.
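
As a self-contained illustration of the same idea, here is a minimal sketch that times fft2 with an explicit wait and, for comparison, with gputimeit (available in R2013b and later, so not in R2012a). It assumes Parallel Computing Toolbox and a supported GPU. The random 1024-by-1024 input and the names tSync and tAuto are placeholders for illustration, not taken from the original code, and the kernel call is left out because its arguments are not shown in the question.

```matlab
% Sketch only: img_d is hypothetical test data, not the poster's image.
gpudev = gpuDevice();
img_d = gpuArray(rand(1024, 'single'));

% Drain any work already queued on the GPU (for example, an earlier
% kernel launch) so it is not charged to the operation being timed.
wait(gpudev);

tic;
f = fft2(img_d);
wait(gpudev);        % block until the FFT has actually finished
tSync = toc;

% gputimeit synchronizes internally and runs the call several times.
tAuto = gputimeit(@() fft2(img_d));

fprintf('tic/toc with wait: %.6f s, gputimeit: %.6f s\n', tSync, tAuto);
```

The wait before tic matters for the same reason as the one before toc: without it, a previously launched kernel that is still running can make the next operation appear slow, which is likely what the 0.074028 second measurement in the question reflects.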
