GPU memory tracking and out-of-memory estimation

Hello, I am using GPU computing to vastly speed up some of my computations, but I keep running into "out-of-memory" errors. I need to either predict when an operation will cause a GPU memory issue (so I can break the problem up further) or fix my code to avoid such issues. [Note: I am on a Titan X with 12 GB onboard; part of the goal is to make sure my code won't crash on less powerful machines.]
As such, I have a few questions:
(A) How do I estimate the GPU memory load of certain operations? For example, I have two large [M,N,O] gpuArrays that I need to multiply; the memory requirement seems to be more than just 3x the memory footprint of one of those arrays (one for each input and one for the output). The same question applies to fft, abs, conj, and any other matrix operation I want to perform on the GPU.
I know I can check available memory using gpuDevice(). If I know how much memory an operation will require, I can predict if I will over-run.
(B) When a function returns, does it automatically free the gpu memory allocated during its run? If it does, I don't understand what is eating away at my AvailableMemory; the variables in my workspace at the stage I crash come nowhere close to what is occupied on the card.
(C) Is there a way to find out which MATLAB variables are still on the card and their actual size? I have been using 'gather' to pull the variables down and then 'whos' to estimate their size; however, I assume there is not a one-to-one relationship between sizes on and off the card.
(D) Without looping through current variables in the workspace and doing a class check, is there a fast way to pull all current variables off of the card, perform reset(gpuDevice) and then load them back on? [Probably not, I should really just write a function that does just that]
Thanks, -Dan

Accepted Answer

Joss Knight on 5 Feb 2017
(A) There really isn't any way other than running the function and monitoring GPU memory in a separate process. For instance, you could run nvidia-smi in a terminal, or install a GPU monitoring app. Every function needs a different amount of working memory to run. For instance, plus only needs space to create the output, but fft can need 2x to 8x the input size depending on the signal length (pick a prime number and you're out of luck).
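As a rough illustration of the monitoring approach in (A), here is a minimal sketch, assuming Parallel Computing Toolbox, a supported NVIDIA GPU, and nvidia-smi on the system path. The array size and the choice of fft are only illustrative. Note that the before/after difference in AvailableMemory shows only the net change; temporary working memory that was released before the call returned will not appear, which is why polling from a separate process is suggested above.

```matlab
% Sketch: observe GPU memory around a single operation.
g = gpuDevice();                                 % currently selected device
X = rand(256, 256, 64, 'single', 'gpuArray');    % illustrative input (16 MiB)

memBefore = g.AvailableMemory;                   % bytes free before the call
Y = fft(X);                                      % operation under test
wait(g);                                         % block until the GPU has finished
memAfter = g.AvailableMemory;                    % bytes free after the call

% Net change only: working memory freed inside fft is not visible here.
fprintf('Net GPU memory consumed: %.1f MiB\n', (memBefore - memAfter) / 2^20);

% Device-wide view reported by the driver. To catch peak usage, this query
% would have to be polled from another process while the operation runs.
cmd = 'nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits';
[status, out] = system(cmd);
if status == 0
    fprintf('nvidia-smi used, total (MiB): %s', out);
end
```

If the net figure looks smaller than expected, that is consistent with the answer: the interesting number is the peak during the call, not the amount left allocated afterwards.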
(B) Yes, if the function needed working memory it will be released at the end. This is complicated by the GPU driver's use of a caching allocator for small chunks of memory (< 1KB). This can cause memory to be allocated in larger chunks than you need and then not released, or cause no new memory to be requested at all because the allocation was served from the cache. MATLAB has a similar cache but it's careful to negate the effects of this when reporting AvailableMemory.
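One way to check the behaviour described in (B) on your own machine is to watch AvailableMemory while allocating and then clearing a gpuArray. This is only a sketch, assuming Parallel Computing Toolbox and enough free memory for a 128 MiB array; the variable names are made up, and the exact numbers reported will depend on the driver and on caching.

```matlab
% Sketch: watch AvailableMemory as a gpuArray is created and then cleared.
g = gpuDevice();
memStart = g.AvailableMemory;          % bytes free before allocating

A = rand(4096, 4096, 'gpuArray');      % 4096*4096 doubles = 128 MiB
wait(g);
memHeld = g.AvailableMemory;           % bytes free while A is alive

clear A                                % drop the only reference to the data
wait(g);
memEnd = g.AvailableMemory;            % bytes free after clearing

fprintf('Held by A:         %.1f MiB\n', (memStart - memHeld) / 2^20);
fprintf('Recovered on clear: %.1f MiB\n', (memEnd - memHeld) / 2^20);
```

If the recovered amount is much smaller than the amount held, something else still references the data, for example another variable or a handle object holding a copy.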
(C) We've thought about having whos display the size of the gpuArray, but there's an equal pull not to confuse the user by showing memory usage that isn't main system memory. But it's pretty simple to work out because a gpuArray is always numeric and we only store the actual array data on the GPU. For array X you need numel(X) * datatypeWidth(X) bytes, where datatypeWidth is 8 for double, 4 for single etc, and is doubled for complex numbers.
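The size rule in (C) can be written as a small helper. This is a sketch: gpuArrayBytes is a hypothetical name rather than a built-in function, and the widths for integer and logical types follow the usual MATLAB element sizes rather than anything stated above. It also accepts ordinary host arrays, so the arithmetic can be checked without a GPU.

```matlab
function nbytes = gpuArrayBytes(X)
% gpuArrayBytes  Estimate the bytes of array data stored for X.
%   Hypothetical helper implementing numel(X) * datatypeWidth(X).
if isa(X, 'gpuArray')
    cls = classUnderlying(X);   % element type of the data on the device
else
    cls = class(X);             % lets the same rule be checked on host arrays
end

switch cls
    case {'double', 'int64', 'uint64'}
        width = 8;
    case {'single', 'int32', 'uint32'}
        width = 4;
    case {'int16', 'uint16'}
        width = 2;
    case {'int8', 'uint8', 'logical'}
        width = 1;
    otherwise
        error('gpuArrayBytes:unsupportedType', ...
              'Unsupported element type: %s', cls);
end

if ~isreal(X)
    width = 2 * width;          % complex data stores real and imaginary parts
end

nbytes = numel(X) * width;
end
```

For example, gpuArrayBytes(zeros(1000, 1000, 'single')) gives 4000000, and the same array made complex would need twice that. Comparing the result against the AvailableMemory property of gpuDevice gives a lower bound on whether an array will fit, before any working memory the operation itself needs.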
(D) See people's answers to your other question.

More Answers (0)