Unexpectedly slow performance of eig with parfor
I have a computation that requires the repeated evaluation of the eigenvalues and eigenvectors of many 800x800 matrices. Profiling shows that around 70% of the execution time is spent in eig(), so this is my first target for improving execution speed. The matrices are single-precision floating point, dense, and in general lack any convenient symmetry (so the eigenvalues are complex). I am trying to use parallel processing to speed up execution, and whilst I am seeing improvements, the gains are rather less than I was expecting (or hoping for): at best I'm getting a 5-6x speed-up, despite having an AMD Ryzen 2990WX CPU with 32 cores available.
I realise that the speed-up one can expect with parallel processing depends on many factors, including the overhead of setting up the task, communication between processes, competition for shared resources such as memory and bandwidth, and trade-offs against any implicit parallelism from Matlab's multithreaded calculations. However, I believe that for this task the overhead is small compared to the calculation itself, and from what I can tell eig() is not a function that benefits hugely from multithreading (see below: the optimum appears to be a few threads rather than many, but the differences are relatively small).
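One way to check the multithreading claim on its own, before any workers are involved, is to time eig() in a plain serial session while varying maxNumCompThreads. The following is only a minimal sketch: the list of thread counts and the repetition count nRep are assumptions chosen for illustration, not values taken from the original tests.

```matlab
% Sketch: time eig() on one fixed matrix for several thread counts (serial, no pool).
n = 800;                       % matrix size from the question
threadCounts = [1 2 4 8 16];   % thread counts to try (assumed values)
nRep = 3;                      % repetitions per setting (assumed value)
m = single(rand(n));           % same dense, nonsymmetric test matrix for every run

oldThreads = maxNumCompThreads;        % remember the current setting
tThread = zeros(size(threadCounts));
for k = 1:numel(threadCounts)
    maxNumCompThreads(threadCounts(k));
    [V, D] = eig(m);                   %#ok<ASGLU> warm-up call, not timed
    tic;
    for r = 1:nRep
        [V, D] = eig(m);               %#ok<ASGLU>
    end
    tThread(k) = toc / nRep;           % mean seconds per decomposition
end
maxNumCompThreads(oldThreads);         % restore the original setting

disp(table(threadCounts(:), tThread(:), ...
    'VariableNames', {'Threads', 'SecondsPerEig'}));
```

If the per-call time barely changes across thread counts, that would support running many single-threaded workers rather than a few multithreaded ones.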
My hope is that there is something wrong with what I'm doing. I've done some benchmarking with some very simple code which adequately demonstrates the disappointing scaling with cores (see below). The results are given below and show both the modest variation with the number of computational threads and the disappointing speed-up with respect to the number of workers. Given the simplicity of the code I'm struggling to see what else I could do or what I could be doing wrong, but I was hoping for much better. I would be very grateful for any suggestions on how this situation may be improved.
Test code:
m = single(rand(800)); % note the same m is used for all tests throughout this trial
%%
nT = 1; % number of threads: 1, 2, 4, 8 or 16 (32 and 64 were also tested with a plain for loop)
nP = 4; % number of workers: 1,2,4,8,16 or 32
p = parpool(nP)
tic;
parfor il = 1:32
    maxNumCompThreads(nT); % set the thread count on each worker
    [a,b] = eig(m);
end
t = toc, t/32 % total time and mean time per eig call
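To put a number on the scaling, the parfor timing can be compared against a serial loop that uses the same thread count, giving a speed-up and a parallel efficiency (speed-up divided by number of workers). By that measure, the 5-6x reported on 32 cores corresponds to roughly 16-19% efficiency. The sketch below is a variant of the test above, not the original benchmark: the untimed warm-up pass and the names nJobs, tSerial, tPar, speedup and efficiency are additions made for illustration.

```matlab
% Sketch: compare a serial loop against parfor and report speed-up and efficiency.
nJobs = 32;            % decompositions per trial, as in the question
nP = 4;                % number of workers, as in the question
nT = 1;                % threads per eig call, as in the question
m = single(rand(800)); % same matrix for every call

% Serial baseline using the same per-call thread count.
oldThreads = maxNumCompThreads(nT);    % returns the previous setting
tic;
for il = 1:nJobs
    [a, b] = eig(m);                   %#ok<ASGLU>
end
tSerial = toc;
maxNumCompThreads(oldThreads);         % restore the client setting

p = gcp('nocreate');                   % reuse an open pool if there is one
if isempty(p)
    p = parpool(nP);
end

% Untimed warm-up so that first-call costs on the workers are excluded.
parfor il = 1:p.NumWorkers
    maxNumCompThreads(nT);
    [a, b] = eig(m);                   %#ok<ASGLU>
end

tic;
parfor il = 1:nJobs
    maxNumCompThreads(nT);
    [a, b] = eig(m);                   %#ok<ASGLU>
end
tPar = toc;

speedup = tSerial / tPar;              % how many times faster than serial
efficiency = speedup / p.NumWorkers;   % fraction of ideal linear scaling
fprintf('workers=%d  speed-up=%.2f  efficiency=%.0f%%\n', ...
    p.NumWorkers, speedup, 100 * efficiency);
```

Repeating this for each worker count makes it easier to see where the efficiency starts to fall off, which can hint at whether memory bandwidth or some other shared resource is the limit.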
Test Results:

Accepted Answer
More Answers (1)
David Crosby
on 6 Jun 2019