Parallel computing cluster with CPU and GPU

AlessioX on 13 Nov 2015
Edited: Jakub Sikorowski on 11 Mar 2019
Hi there, I have a computer with a 2-core CPU and a CUDA GPU, and I would like to create a parallel cluster for MapReduce and similar tasks.
I was wondering how I can create such a cluster. I can easily use the 'local' cluster configuration; would that be enough? I know that the CPU and GPU support different data structures (e.g. array vs gpuArray), but can they work together in a parallel cluster?
Let's say I want to create some MapReduce scripts: will configuring and using the 'local' cluster actually use both the CPU and the GPU? Or is there anything else I should take care of?
Thanks a lot!

Answers (1)

Jakub Sikorowski on 10 Mar 2019
Edited: Jakub Sikorowski on 11 Mar 2019
Hi AlessioX,
It sounds like a nice challenge: write code that uses both the GPU and the CPU and make it run faster than on the GPU alone. My laptop has an Intel i5-8250 and a GeForce MX150, which is quite slow, so that should make it easier.
I took on the problem of a random walk to price a stock, as defined in Joss's link. After trying to optimise the number of CPU workers and the ratio of work per CPU and GPU, I got a speedup of a (mind-boggling) 1-5% (running on both the CPUs and the GPU compared with only the GPU)... and on any reasonable GPU the speedup would most likely be drowned out by overhead, at least for this example of MapReduce.
The code, adapted from Joss's link to work on both CPUs and GPUs, is below.
function speedup = runSimulationOnCPUsandGPUs(nSamples, cpuRatio)
% Executes nSamples simulations per GPU, and nSamples/cpuRatio
% simulations per CPU worker. In the parallel pool we have one worker
% per GPU plus noCpuWorkers for CPUs. The output is the speedup
% over running the same total work on a single GPU.

% reset GPUs
gpuDevice([]);
% time the work done by CPUs and GPUs together
tic;
noCpuWorkers = 1; % experimentally, one CPU worker seems to be the best
nIter = gpuDeviceCount() + noCpuWorkers;
meanFinalPrice = zeros(1, nIter); % one mean price per parfor iteration
parfor ix = 1:nIter
    if ix <= noCpuWorkers
        % the first noCpuWorkers iterations run on the CPU
        meanFinalPrice(ix) = runSimulationOnOneCPU(ceil(nSamples/cpuRatio));
    else
        % the remaining iterations run on a GPU
        meanFinalPrice(ix) = runSimulationOnOneGPU(nSamples);
    end
end
mfp = mean(meanFinalPrice);
gpudev = gpuDevice();
% wait until GPUs are done
wait(gpudev);
CpusAndGpusTime = toc;

% reset GPUs
gpuDevice([]);
% time the same total work on one GPU
tic;
meanFinalPrice = runSimulationOnOneGPU(ceil(nSamples/cpuRatio)*noCpuWorkers + nSamples);
mfp = mean(meanFinalPrice);
gpudev = gpuDevice();
% wait until the GPU is done
wait(gpudev);
oneGpuTime = toc;
speedup = oneGpuTime/CpusAndGpusTime;
end
One also needs to define how to run the simulation on a CPU:
function mfp = runSimulationOnOneCPU(Nsamples)
% Run Nsamples stock simulations on the CPU and return the
% mean final price.
stockPrice = 100; % Stock price starts at $100.
dividend = 0.01; % 1% annual dividend yield.
riskFreeRate = 0.005; % 0.5 percent.
timeToExpiry = 2; % Lifetime of the option in years.
sampleRate = 1/250; % Assume 250 working days per year.
volatility = 0.20; % 20% volatility.
% Create the input data: one element per simulated path.
startPrices = stockPrice*ones(Nsamples, 1);
riskFreeRate = riskFreeRate*ones(Nsamples, 1);
dividend = dividend*ones(Nsamples, 1);
volatility = volatility*ones(Nsamples, 1);
timeToExpiry = timeToExpiry*ones(Nsamples, 1);
sampleRate = sampleRate*ones(Nsamples, 1);
% Run all Nsamples simulations on a CPU using arrayfun.
finalPrices = arrayfun( @simulateStockPrice, ...
    startPrices, riskFreeRate, dividend, volatility, ...
    timeToExpiry, sampleRate );
mfp = mean(finalPrices);
end
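runSimulationOnOneCPU relies on simulateStockPrice, which is not reproduced in this thread. In case it helps to run the CPU function on its own, here is a minimal stand-in. It is only a sketch: it assumes each path follows a geometric Brownian motion stepped in increments of the sample rate, with the argument order taken from the arrayfun call above. The body and the variable names drift and shock are my assumptions, and it is not necessarily identical to the linked function.

```matlab
function finalStockPrice = simulateStockPrice(S, r, d, v, T, dT)
% Hypothetical stand-in for the linked simulateStockPrice (an assumption,
% not copied from the link): geometric Brownian motion stepped by dT
% until the time to expiry T is reached.
%   S  - starting price         r  - risk-free rate
%   d  - dividend yield         v  - volatility
%   T  - time to expiry (yr)    dT - time step (yr)
t = 0;
while t < T
    t = t + dT;
    drift = (r - d - v*v/2) * dT;      % deterministic part of the log-return
    shock = v * sqrt(dT) * randn();    % random part of the log-return
    S = S * exp(drift + shock);
end
finalStockPrice = S;
end
```

With v = 0 the random term vanishes, so simulateStockPrice(100, 0.005, 0.01, 0, 2, 0.5) takes four steps and should equal 100*exp(-0.01) up to rounding, which is a quick way to check the stand-in.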
The code for simulateStockPrice() and runSimulationOnOneGPU() was not changed from the link. The results:
>> runSimulationOnCPUsandGPUs(10^6,20)
ans =
1.0102
>> runSimulationOnCPUsandGPUs(10^6,20)
ans =
1.0549
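As a rough sanity check on those numbers, the work split for this call can be worked out by hand. The sketch below assumes a single GPU (as on the laptop described), that GPU run time grows linearly with the number of samples, and that the CPU worker finishes no later than the GPU. None of these were measured here, and nGpus, cpuShare and idealSpeedup are names introduced only for illustration.

```matlab
% Work split for runSimulationOnCPUsandGPUs(10^6, 20).
% Assumptions (not measured in this thread): one GPU, GPU time linear in
% the number of samples, CPU worker done no later than the GPU.
nSamples     = 1e6;   % samples given to the GPU
cpuRatio     = 20;    % each CPU worker gets nSamples/cpuRatio samples
noCpuWorkers = 1;     % as in runSimulationOnCPUsandGPUs
nGpus        = 1;     % assumed: a single GPU

cpuSamples   = noCpuWorkers * ceil(nSamples / cpuRatio);  % 50000
gpuSamples   = nGpus * nSamples;                          % 1000000
totalSamples = cpuSamples + gpuSamples;                   % 1050000

cpuShare     = cpuSamples / totalSamples;   % fraction of work taken off the GPU
idealSpeedup = totalSamples / gpuSamples;   % upper bound under the assumptions

fprintf('CPU share of work: %.2f%%\n', 100 * cpuShare);
fprintf('Ideal speedup:     %.3f\n', idealSpeedup);
```

Under these assumptions the CPU takes about 4.76% of the samples, so the best possible speedup is about 1.05. That is in line with the measured 1.0102 and 1.0549; a value slightly above 1.05 would just be timing noise in this simple model.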
