Selecting specific GPUs for parpool
Jan Jaap van Assen
on 7 Aug 2017
Edited: Jan Jaap van Assen
on 9 Aug 2017
I have a question concerning the selection of GPUs using parpool and the Neural Network Toolbox. I'm using a system with 4 GPUs and share this system with other people. We would like to specify which GPUs are used in parpool.
For the use of a single GPU we use: gpuDevice(n);
The pool that trainNetwork() opens when given the ('ExecutionEnvironment','multi-gpu') option uses the maximum of 4 workers by default. You can open a smaller pool yourself, but it always starts from the first GPU and counts up: parpool('local', 2);
When two such pools are started in two separate instances of MATLAB, both still use only these first two GPUs, so two separate training sessions end up running on the same two GPUs.
I can't find anything on assigning specific GPUs to the parallel pool, so that user one can use GPUs [1 2] and user two can use GPUs [3 4].
Any help would be great.
System details:
- MATLAB Version: 9.2.0.556344 (R2017a)
- Parallel Computing Toolbox, Version 6.10
- Neural Network Toolbox, Version 10.0
- Linux 4.4.0-59-generic (14.04.1-Ubuntu)
- 4x GTX 1080
- 2 x Intel Xeon E5-2630v4
- 128 GB DDR4
Accepted Answer
Joss Knight
on 7 Aug 2017
Edited: Joss Knight
on 7 Aug 2017
Start your pool, select the devices, then run trainNetwork. It will use the pool you already have open and won't change the selected GPUs.
parpool('local',2);
spmd, gpuDevice(labindex+2); end
trainNetwork(..., ..., trainingOptions(..., 'ExecutionEnvironment', 'multi-gpu'));
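The same idea can be written for an arbitrary set of devices. This is only a minimal sketch: gpuIdx is a hypothetical variable holding the device indices agreed for one user (here GPUs 3 and 4), and it assumes the pool has exactly one worker per listed device. On recent releases labindex has been replaced by spmdIndex.

```matlab
% Assumption: this user has been allocated GPUs 3 and 4.
gpuIdx = [3 4];

% Reuse an open pool if there is one, otherwise start one worker per GPU.
pool = gcp('nocreate');
if isempty(pool)
    pool = parpool('local', numel(gpuIdx));
end
assert(pool.NumWorkers == numel(gpuIdx), ...
    'Pool size must match the number of GPUs in gpuIdx.');

% Bind worker k to device gpuIdx(k) and report what was selected.
spmd
    d = gpuDevice(gpuIdx(labindex));
    fprintf('Worker %d -> GPU %d (%s)\n', labindex, d.Index, d.Name);
end
```

A second user would run the same lines with gpuIdx = [1 2] in their own MATLAB session before calling trainNetwork.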
An alternative is to use MDCS (MATLAB Distributed Computing Server) to create a cluster on your multi-GPU machine, from which users can open pools. During cluster startup you can restrict each worker to a particular GPU using the CUDA_VISIBLE_DEVICES environment variable:
setenv CUDA_VISIBLE_DEVICES <n>
You can even do this in MATLAB in the worker startup script as long as it runs before the GPU driver is loaded:
setenv('CUDA_VISIBLE_DEVICES', sprintf('%d', workerIndex-1));
I'm not quite sure how to get the workerIndex during worker startup but I'm sure it's possible.
The advantage of this (since you may be wondering!) is that it doesn't require the users to know which GPUs are available for them to use. If you only have local workers available then you could issue system queries to the nvidia-smi command to find out which GPUs are unused.
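As a rough illustration of the nvidia-smi query, the sketch below lists the GPUs that currently hold almost no memory. The 100 MiB threshold and the names parseSmi, idleThresholdMiB and idleMatlabIdx are assumptions made for this example, and low memory use is only a heuristic for "unused".

```matlab
% Ask nvidia-smi for one CSV row per GPU: index, memory used in MiB.
cmd = 'nvidia-smi --query-gpu=index,memory.used --format=csv,noheader,nounits';
[status, out] = system(cmd);
if status ~= 0
    error('nvidia-smi failed: %s', out);
end

% Parse rows such as "0, 7431" into an N-by-2 matrix [index, MiB used].
parseSmi = @(txt) sscanf(txt, '%d, %d', [2 Inf]).';
usage = parseSmi(out);

% Assumption: a GPU holding less than 100 MiB is treated as unused.
idleThresholdMiB = 100;
idleSmiIdx = usage(usage(:,2) < idleThresholdMiB, 1);

% nvidia-smi counts from 0 while gpuDevice counts from 1. The two orderings
% can also differ on machines with mixed GPU models, so treat this mapping
% as an assumption and verify it on your own system.
idleMatlabIdx = idleSmiIdx.' + 1;
disp(idleMatlabIdx);
```

The resulting indices could then be passed to gpuDevice inside spmd, one per worker. Note that two users running the query at the same moment can still pick the same devices.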
A third way is to set the GPUs into EXCLUSIVE_PROCESS compute mode. The first process to open a CUDA context will get use of the GPU and it will be unavailable to the others. This probably involves the least work for the users and doesn't require MDCS - but it is easily exploited, allowing the first user onto the machine in the morning to hog all the devices.
nvidia-smi -c EXCLUSIVE_PROCESS
Another downside is that if you use a GPU in your client MATLAB session, that GPU is no longer available to the pool. As with the previous approach, this technique is doubly useful if you run an MJS (MATLAB Job Scheduler) cluster on the machine, from which multiple users can open pools.
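To check from inside MATLAB whether the compute mode has taken effect, something along these lines should work. It is a sketch that assumes the static method parallel.gpu.GPUDevice.getDevice and the ComputeMode and DeviceAvailable properties behave the same way in your release; getDevice is used so that the query does not select a device.

```matlab
% Report the compute mode and availability of every GPU MATLAB can see,
% without selecting any of them in this session.
for k = 1:gpuDeviceCount
    d = parallel.gpu.GPUDevice.getDevice(k);
    fprintf('GPU %d: %s, ComputeMode = %s, available = %d\n', ...
        k, d.Name, d.ComputeMode, d.DeviceAvailable);
end
```

A device that another process already holds in exclusive mode should show up here as not available.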