Assigning gpuArrays to different graphics cards
In the example below I use a parfor loop to assign a different 256x256x256 (random k-space) matrix to each of my 2 GPUs. In theory I can then process these matrices in parallel on the 2 GPUs (here I've done an ifftn). The problem is that the parfor loop is very slow, presumably because of overhead: the operation takes 6.5 seconds on my machine, whereas replacing the 'parfor' with a simple 'for' (running 1 GPU sequentially) takes 0.2 seconds.
Is there an easy (fast) way to assign a gpuArray to a specific graphics card, such that future operations on that gpuArray use the specified card? Rather than using parfor, I would prefer to simply launch asynchronous CUDA kernels (invoking one directly after the other in MATLAB code) to run both GPUs in parallel.
Dims = [256,256,256,2];
Kspace = complex(rand(Dims,'single'),rand(Dims,'single'));
Image = gpuArray(complex(zeros(Dims,'single')));
parfor n = 1:2
    gKspace = gpuArray(Kspace(:,:,:,n));
    Image(:,:,:,n) = ifftn(fftshift(gKspace));
end
Joss Knight on 22 Feb 2020
There is no way to do what you ask. Selecting a GPU is the only way to move data there, and selecting a GPU resets all GPU data.
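A minimal sketch of that reset behaviour (assuming a machine with at least 2 GPUs):

```matlab
g = gpuDevice(1);          % select GPU 1 (and reset it)
A = rand(4, 'gpuArray');   % A lives on GPU 1
gpuDevice(2);              % selecting GPU 2 invalidates existing gpuArrays
% existsOnGPU(A) should now be false; A can no longer be used
```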
The issue here is the way you're sending all the data to each worker and then indexing it; that (and, equivalently, moving all the results back) is your bottleneck. You need to amortise this communication cost, either by doing more work inside the loop or by loading the data each worker needs directly on that worker, without first loading it onto the client.
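One way to sketch "load the data directly onto each worker": have each worker read its own volume from disk, so the k-space never passes through the client. The filename pattern here is hypothetical:

```matlab
Images = cell(1, 2);
parfor n = 1:2
    S = load(sprintf('kspace_%d.mat', n));   % hypothetical per-worker file
    gK = gpuArray(S.Kspace);                 % straight onto this worker's GPU
    Images{n} = gather(ifftn(fftshift(gK))); % result back to the CPU
end
```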
Presumably you have more than two 256^3 arrays. Put another loop inside your parfor and process all of those arrays together. Move the results back to the CPU to save GPU memory. Eventually the communication overhead becomes irrelevant and you'll see the benefit of using both your GPUs.
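A sketch of that pattern, with a hypothetical batch of nVols volumes split across the 2 workers (random data stands in for the real k-space):

```matlab
nVols = 20;                              % assumed number of volumes
Results = cell(1, 2);
parfor w = 1:2
    local = {};
    for v = w:2:nVols                    % this worker's share of the batch
        K = complex(rand(256,256,256,'single'), ...
                    rand(256,256,256,'single'));   % stand-in for real data
        gI = ifftn(fftshift(gpuArray(K)));         % runs on this worker's GPU
        local{end+1} = gather(gI);       % back to the CPU to free GPU memory
    end
    Results{w} = local;
end
```

Each worker keeps its own GPU for the whole inner loop, so the per-volume transfer cost is paid many times but the fixed parfor setup cost only once.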