Speeding up matrix exponentials by using a GPU

Hey all,
I am trying to speed up the calculation of high-dimensional matrix exponentials by using a GPU, but I find that the calculation is faster on the CPU than on the GPU, and I can't find where the problem is. The code is:
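dev = gpuDevice();
CPU_time = 0;
GPU_time = 0;
for i = 1:10
    CPU_matrix = rand(4096, 4096);               % double precision by default
    GPU_matrix = gpuArray(complex(CPU_matrix));  % same data, copied to the GPU
    % Time the CPU computation
    tic;
    Exp_CPU = expm(-1i * CPU_matrix);
    CPU_time = CPU_time + toc;
    % Time the GPU computation
    tic;
    Exp_GPU = expm(-1i * GPU_matrix);
    wait(dev);  % GPU calls can return before the work finishes, so synchronize before toc
    GPU_time = GPU_time + toc;
end
disp("CPU time:" + string(CPU_time));
disp("GPU time:" + string(GPU_time));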
I tested this code on my computer. The CPU is an Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz 2.59 GHz with 16 GB of RAM, and the GPU is an NVIDIA GeForce GTX 1650. The total times in seconds are:
CPU time:452.1338
GPU time:915.5892
Why is the GPU slower than the CPU?
Thanks

 Accepted Answer

Your GTX 1650 is designed for single precision computing. In single precision it has a peak performance of about 3 teraflops, whereas its double precision performance is just 90 gigaflops, about 1/32 of that. A back-of-the-envelope calculation would give your CPU a performance of around 130 gigaflops in double precision, so for this double precision workload the CPU is actually the faster device.
In single precision (rand(4096,4096,'single')) your code runs about 8x faster on the GPU than on the CPU. On a card designed for double precision computation such as the Titan V, it can also achieve this improvement in double precision.
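For anyone who wants to reproduce the single precision comparison, here is a minimal sketch. It assumes Parallel Computing Toolbox and a supported GPU, and the variable names are only illustrative. It uses timeit and gputimeit rather than tic/toc, because gputimeit waits for the GPU to finish before it stops the clock. Both functions call expm several times, so at n = 4096 expect this to run for a few minutes; reduce n for a quick check.

```matlab
n = 4096;                                   % matrix size used in the question
A_cpu = rand(n, n, 'single');               % single precision input on the CPU
A_gpu = gpuArray(A_cpu);                    % same data on the GPU (stays single)

% timeit / gputimeit run the function several times and return a typical time
t_cpu = timeit(@() expm(-1i * A_cpu));      % CPU time in seconds
t_gpu = gputimeit(@() expm(-1i * A_gpu));   % GPU time in seconds, synchronized

fprintf('CPU: %.2f s, GPU: %.2f s, speedup: %.1fx\n', ...
    t_cpu, t_gpu, t_cpu / t_gpu);
```

The speedup you see will depend on your hardware and on n, so treat the 8x figure above as specific to that machine.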

5 Comments

Does the fact that the input to expm is complex make any difference (on either the CPU or GPU)?
I tested the computation in single precision, and the result is:
CPU time:222.679
GPU time:37.9272
It is about 6x faster. Does the complex input make any difference?
Not really. Whatever extra work complex arithmetic requires on the GPU is also required on the CPU.
expm is a complicated function and needs to do quite a bit of serial computation. I think 6x is a good result.
This suggests the inexpensive solution may be to just convert the matrix to single precision, and it will go much faster. But remember the cost of doing so: this is a tradeoff between speed and precision, and your computations will lose precision. For some of what one does in MATLAB, single precision is affordable; it depends on how much you need that precision. Typical symptoms are that a picture becomes a little less sharp, edges a little less crisp, and sharp transitions in a curve may exhibit visible oscillations. Essentially, you can start to lose the fine detail in what you do. So you will need to watch the results and make sure the use of single precision does not push you over the edge of acceptability.
What are you doing with Exp_GPU (or Exp_CPU) downstream of the expm computation? For example, if you are multiplying it by a vector, or a couple of vectors, maybe expmv would be of use.
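One way to check whether single precision is acceptable for your problem is to compare it against the double precision result on a smaller matrix of the same kind. A rough sketch follows; the size n = 256, the fixed seed, and the variable names are illustrative assumptions, and it runs on the CPU only:

```matlab
rng(0);                               % fixed seed so the check is repeatable
n = 256;                              % small enough to run quickly
A = rand(n);                          % double precision test matrix

E_double = expm(-1i * A);             % reference result in double precision
E_single = expm(-1i * single(A));     % same computation in single precision

% Relative error of the single precision result, measured in double
rel_err = norm(double(E_single) - E_double, 'fro') / norm(E_double, 'fro');
fprintf('Relative error of single precision expm: %.3g\n', rel_err);
```

If rel_err is well below the tolerance your downstream calculation needs, single precision is probably fine. The error can grow with the size and norm of the matrix, so test with matrices that are representative of your real ones.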


More Answers (1)

Thanks a lot! Now I know where the problem is. I will try using single precision and maybe a better GPU. You all really helped me a lot.

Release

R2024a

Asked:

on 18 Aug 2024

Answered:

on 20 Aug 2024
