Why is the MATLAB benchmark extremely slow on an AMD EPYC CPU?

Hello, I have a rack server with an AMD EPYC 75F2 (16-core, 3.5 GHz) CPU, 512 GB of 3200 MHz ECC RAM, a P620 GPU, and Windows Server 2019, built for CPU-based finite element simulation. After I received the machine, I ran t = bench (on MATLAB 2020b) to check its performance, but the result is really bad: it can't even compete with a $100 Ryzen 1700. A hardware check reports no hardware errors, and other software such as COMSOL also performs badly. Can someone help me with this? Thank you!

3 Comments

Check maxNumCompThreads: does it return the number of physical cores?
Also, try looking on simulation forums; you may find more information there.
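The check suggested above can be sketched as follows. This is a minimal sketch, not a definitive diagnostic: feature('numcores') is an undocumented MATLAB call that is assumed here to return the number of physical cores MATLAB detects, and the variable names are illustrative.

```matlab
% Compare the thread limit MATLAB uses with the core count it detects.
nThreads = maxNumCompThreads;     % current maximum number of computational threads
nCores   = feature('numcores');   % undocumented; assumed to return physical cores

fprintf('maxNumCompThreads = %d, detected physical cores = %d\n', nThreads, nCores);

if nThreads < nCores
    % Fewer threads than cores would point at a detection or configuration problem.
    warning('MATLAB is limited to fewer threads than the detected physical cores.');
end
```

If the two numbers match, the thread limit is probably not the cause of the slowdown.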
It might be worth experimenting with the work-around at
R2020a and later are supposed to have the fix in place, but it is hypothetically possible that the fix does not properly detect your particular hardware.
Yeah, I think that's possible. It may be fixed on AMD Ryzen but still have some issue on EPYC. I hope MathWorks or AMD can fix this problem...

Answers (3)

Hopefully MathWorks will actually fix this problem. It seems to be specific to Linux-based EPYC systems. If you run Windows on your EPYC system with R2021a Update 3 or R2021b, you will get the expected level of performance.
Right now, though, at least on CentOS 7.8 installs, EPYC CPUs appear to have improperly defined vectorization characteristics. A given vectorized operation, e.g.:
A = magic(20000);
[L,U,P] = lu(A);
may take 10x longer on the exact same AMD EPYC CPU on Linux than on Windows. Other functions with notable problems include isonormals, gradient, and bwareaopen, to name a few. MATLAB appears to run on only a single thread in those sections of code rather than properly multithreading the vectorized operations.
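One way to test the single-thread hypothesis is to time the same factorization with one computational thread and with the default thread count. This is a minimal sketch under stated assumptions: the matrix size n = 4000 is chosen only so the test finishes quickly (the post uses 20000), a random matrix stands in for magic, and the variable names are illustrative.

```matlab
% Time LU factorization with 1 thread and with the default thread count.
n = 4000;                        % assumed size, smaller than the 20000 in the post
A = rand(n);

nOrig = maxNumCompThreads;       % remember the current thread limit

maxNumCompThreads(1);            % force single-threaded execution
t1 = timeit(@() lu(A));

maxNumCompThreads(nOrig);        % restore the original limit
tN = timeit(@() lu(A));

fprintf('lu(%d): 1 thread %.3f s, %d threads %.3f s, speedup %.1fx\n', ...
        n, t1, nOrig, tN, t1/tN);
```

A speedup close to 1x on a many-core machine would be consistent with the operation not being multithreaded.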

10 Comments

I have tested with MATLAB 2021a Update 4 (9.10.0.1710957); there is no difference from the original MATLAB 2021a (9.10.0.1602886). The system is a dual EPYC 7452.
What version of Linux did you run on (please include the distribution and kernel)? MathWorks identified kernel-level optimizations as a potential source of the runtime differences, so I'm curious what you were using in order to check that. Do you mean you ran both MATLAB 2021a Update 4 and the original 2021a on just Windows or just Linux and noticed no performance difference? Or that you noticed no difference between the performance of MATLAB on Windows and MATLAB on Linux?
I ran both MATLAB 2021a Update 4 and the original 2021a on Windows Server 2019 DC only, and there is no performance difference. I also ran the original 2021a on Ubuntu 20.04, and it is slower than on Windows.
Excellent. That's the same thing I'm seeing on my EPYC and Ryzen systems. Basically, MATLAB on Windows runs substantially faster than MATLAB on Linux on Zen2 and Zen3 cores. There's some kind of issue there. MathWorks' only suggestion so far has been to try Ubuntu 21.04 or any distro with a more recent Linux kernel (5.11 or higher).
Should I assume from the comments that this issue is still unresolved? I have been thinking of buying a new computer to run Matlab applications. I've been leaning towards Epyc on Linux, but if there is still a performance problem I may need to choose differently.
I've seen it suggested that the Ryzen Threadripper parts don't suffer this problem? Can anyone confirm if that is true? Going with Threadripper Pro with the older cores (Zen 2 vs. Zen 3) would ordinarily seem sub-optimal, but it might be the best option if that can avoid the Matlab performance problems with Epyc.
As best I can tell it seems to be an issue with memory allocations in some Linux distros on some Linux kernels. If you're running a server that has a lot of memory utilization at default you will notice this poor performance, even on something as simple as calling a large array to be built with "ones".
At the time I was running test code on a system that had 256GB RAM, and usually sat around 150GB occupied between ZFS, mariadb, and all the other services running on a cluster on CentOS 7 with the 4.4 LT kernel from last year. I built a GPU compute node around an Epyc 7532 with 256GB RAM, which only sits around 8GB used most of the time, and it performs just as fast in Linux and Windows, and that's a default system running CentOS 7. My 5950X system at home also has no performance decrease moving from Windows to Rocky Linux. That's a preponderance of evidence suggesting to me that the Zen3 cores will be good in Linux if and only if you do not have high resting memory utilization, however I have not tested them.
Seconding Kevin's observation. To see whether it is a potential Linux kernel issue, try:
tic;
a = ones(10000) ;
toc;
I have seen a similar EPYC chip take 0.1 s on Windows and 8 s on some Linux distributions. ones is as simple as you can get: it makes one call to malloc and writes to every memory location once.
If malloc and writing to memory are slow, then most MATLAB functions that need temporary workspaces will not perform well.
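A slightly more robust variant of the test above uses timeit, which repeats the call and reports a typical time, and converts the result into an approximate write bandwidth. This is a minimal sketch; the variable names and the GB/s conversion are illustrative additions, not part of the original test.

```matlab
% Time allocation plus first write of an n-by-n double array.
n = 10000;                       % same size as the tic/toc test above
t = timeit(@() ones(n));         % typical time over repeated calls

bytes = n * n * 8;               % 8 bytes per double
fprintf('ones(%d): %.3f s, about %.2f GB/s written\n', n, t, bytes / t / 1e9);
```

Running this once on an idle system and once under the usual background memory load would show whether resting memory utilization changes the result.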
Thanks @Kevin, that is helpful.
From the comments, it seems the issue specific to Matlab is probably resolved, but there may be some CPU/OS/kernel issue associated with memory allocation and/or use under some circumstances (e.g. moderately high pre-existing memory usage). If I understand correctly, you think the issue affects both Zen2 and Zen3?
It is likely to be exasperating, but I suppose I have little recourse but to buy the system we want and then test our loads under different operating systems to see how it affects performance.
My sense is there's no issue with the CPU at all; it's purely down to the kernel/distro selection. I haven't even been able to rule out that this bug also exists on Intel processors under similar memory utilization conditions. I would say just think about what sort of memory load you put on the system, and test how your MATLAB workloads coexist with it under different distros. My guess is that if you use a bleeding-edge kernel you'll probably be better off.

gophi7 on 3 Jan 2021
Edited: gophi7 on 3 Jan 2021
I am seeing the same issue with a rack EPYC CPU, and modifying maxNumCompThreads did not seem to help. Did you happen to learn what the issue is, or whether there is a workaround?

9 Comments

Sorry, I still have no clue why that happens... I tried to reach AMD, but have had no reply so far...
Me too! AMD EPYC 7302 (Dell PowerEdge R6525)
MATLAB 2020b
Update 3 installed.
Now it's clear that there is something wrong with EPYC... Can MathWorks hear us?
Good news: Intel seems to be adding Zen kernels
Will 2021a correct this behaviour? Or let us choose the BLAS distribution?
@Bernhard Wistawel If you have access to the prerelease you could already know the answer. My guesses for the answers: maybe, and probably not.
Even with BLAS, GNU Octave is still substantially slower on Ryzen than MATLAB. With the increasing popularity of AMD this will be fixed in due time. I hope sooner rather than later, but for EPYC the market share might be too small to justify much engineering time.
Ok, will check soon. But MathWorks should let us choose which BLAS we want to use. The AMD BLIS could be much faster.

With the original poor performance on Ryzen, it was possible to set an environment variable that forced the Intel MKL to use AVX2, greatly improving performance. Have you tried whether this also works for EPYC? The procedure is described halfway down this post.
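To see whether such a variable is visible to a running MATLAB session, and which BLAS build is in use, something like the following could be run. This is a minimal sketch under a loud assumption: the thread does not name the variable, and MKL_DEBUG_CPU_TYPE (usually set to 5) is the name widely reported for this workaround. Whether a given MATLAB release honors it is not confirmed here, and the variable has to be set before MATLAB starts.

```matlab
% Inspect the (assumed) MKL workaround variable and the BLAS library in use.
val = getenv('MKL_DEBUG_CPU_TYPE');   % assumed variable name; empty char if unset

if isempty(val)
    disp('MKL_DEBUG_CPU_TYPE is not set for this MATLAB session.');
else
    fprintf('MKL_DEBUG_CPU_TYPE = %s\n', val);
end

blasInfo = version('-blas');          % description of the BLAS library MATLAB loaded
disp(blasInfo);
```

Rerunning bench with and without the variable set would show whether it has any effect on a particular system.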

Release: R2020b

Asked: 12 Nov 2020

Edited: 16 Mar 2022
