Error running LAPACK with Intel MKL automatic offload on Xeon Phi

Hi, I have set up a computer with a Xeon E5-2600 v2 and a Xeon Phi. I installed Intel's MKL as explained in a tutorial by Intel, so that MATLAB can automatically offload linear algebra routines (LU, Cholesky, ...) to the Xeon Phi.
I don't know whether this is a Xeon Phi issue (I hope not, and I don't think so) or an Intel MKL issue, but MATLAB crashes while running the "bench" test routine when I have the following in my .profile:
#MKL
export MKLROOT=/opt/intel/mkl
export BLAS_VERSION=/opt/intel/composer_xe_2015/mkl/lib/intel64/libmkl_rt.so
export LAPACK_VERSION=/opt/intel/composer_xe_2015/mkl/lib/intel64/libmkl_rt.so
/opt/intel/composer_xe_2015/mkl/bin/mklvars.sh intel64
export MKL_MIC_MAX_MEMORY=16G
export MKL_MIC_ENABLE=1
#export OMP_NUM_THREADS=16
#export MIC_OMP_NUM_THREADS=240
#export OFFLOAD_REPORT=2 # the value 1 is also available
#export MIC_ENV_PREFIX=MIC
#export MIC_KMP_AFFINITY=granularity=fine,explicit,proclist=[1-240:1]
#export MIC_USE_2MB_BUFFERS=32K
#export MIC_MKL_DYNAMIC=false
#export KMP_AFFINITY=granularity=fine,scatter
#export OFFLOAD_DEVICES=0
#export OFFLOAD_ENABLE_ORSL=1
export LD_LIBRARY_PATH="/opt/intel/mic/coi/host-linux-release/lib:${LD_LIBRARY_PATH}"
export MIC_LD_LIBRARY_PATH="/opt/intel/mic/coi/device-linux-release/lib:/opt/inte/composer_xe_2015.0.090/compiler/lib/mic/:${MIC_LD_LIBRARY_PATH}"
I have included the commented-out environment variables as well, in case some of them should have been uncommented.
I have also attached the MATLAB crash dump.
Anyone with an idea would be my savior. :)
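One detail in the profile above that may be worth checking: the mklvars.sh line is executed as a command rather than sourced, and a script run as a child process cannot change the environment of the shell that started it. The sketch below illustrates only that general shell behaviour; the stand-in script and the variable name DEMO_MKL_VAR are hypothetical and have nothing to do with the real mklvars.sh.

```shell
#!/bin/sh
# Hypothetical stand-in for a vendor setup script (NOT the real mklvars.sh).
unset DEMO_MKL_VAR
demo_script=$(mktemp)
echo 'export DEMO_MKL_VAR=set_by_script' > "$demo_script"

# 1) Executed as a command: it runs in a child shell, so the
#    variable it exports never reaches the current shell.
sh "$demo_script"
after_exec=${DEMO_MKL_VAR:-unset}

# 2) Sourced with '.': it runs in the current shell, so the
#    exported variable is still set afterwards.
. "$demo_script"
after_source=${DEMO_MKL_VAR:-unset}

echo "after executing: $after_exec"
echo "after sourcing:  $after_source"
rm -f "$demo_script"
```

If this applies here, the corresponding line in the profile would start with `source` (or `.`), assuming the login shell accepts arguments to a sourced script, as bash does.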
Thank you,
Jonathan
  3 Comments
Jonathan Berrebi on 29 Oct 2014
Edited: Jonathan Berrebi on 29 Oct 2014
Hi Duncan, excuse me for the late answer. I didn't get the email alert, so I only now checked manually whether someone had answered.
I have now sourced the following environment variables, most of which are commented out:
#MKL
export MKLROOT=/opt/intel/mkl
export BLAS_VERSION=/opt/intel/composer_xe_2015/mkl/lib/intel64/libmkl_rt.so
export LAPACK_VERSION=/opt/intel/composer_xe_2015/mkl/lib/intel64/libmkl_rt.so
/opt/intel/composer_xe_2015/mkl/bin/mklvars.sh intel64
export MKL_MIC_MAX_MEMORY=16G
export MKL_MIC_ENABLE=1
#export OMP_NUM_THREADS=16
#export MIC_OMP_NUM_THREADS=240
#export OFFLOAD_REPORT=2 # the value 1 is also available
#export MIC_ENV_PREFIX=MIC
#export MIC_KMP_AFFINITY=granularity=fine,explicit,proclist=[1-240:1]
#export MIC_USE_2MB_BUFFERS=32K
#export MIC_MKL_DYNAMIC=false
#export KMP_AFFINITY=granularity=fine,scatter
#export OFFLOAD_DEVICES=0
#export OFFLOAD_ENABLE_ORSL=1
export LD_LIBRARY_PATH="/opt/intel/mic/coi/host-linux-release/lib:${LD_LIBRARY_PATH}"
export MIC_LD_LIBRARY_PATH="/opt/intel/mic/coi/device-linux-releas/lib:/opt/intel/composer_xe_2015.0.090/compiler/lib/mic/:${MIC_LD_LIBRARY_PATH}"
I start MATLAB R2014a on CentOS 7 x86_64 and check that the environment variable has been passed to MATLAB and that the BLAS and LAPACK versions are as expected.
>> getenv('MKL_MIC_ENABLE')
ans =
1
>> version -blas
ans =
Intel® Math Kernel Library Version 11.2.0 Product Build 20140723 for Intel® 64 architecture applications
>> version -lapack
ans =
Intel® Math Kernel Library Version 11.2.0 Product Build 20140723 for Intel® 64 architecture applications Linear Algebra PACKage Version 3.4.1
Then I run a script that I routinely run on many computers (Linux, Windows, etc.) and get the following error:
Failed 'Realign & Unwarp'
Error using chol
Matrix must be positive definite.
In file "/usr/local/spm12m14a/spm_imatrix.m" (v4414), function "spm_imatrix" at line 19.
If I run MATLAB's "lu" function, then I get:
>> [L,U]=lu(A);
------------------------------------------------------------------------
Segmentation violation detected at Wed Oct 29 14:39:41 2014
------------------------------------------------------------------------
Configuration:
Crash Decoding : Disabled
Current Visual : 0x21 (class 4, depth 24)
Default Encoding : UTF-8
GNU C Library : 2.17 stable
MATLAB Architecture: glnxa64
MATLAB Root : /usr/local/MATLAB/R2014a
MATLAB Version : 8.3.0.532 (R2014a)
Operating System : Linux 3.10.0-123.el7.x86_64 #1 SMP Mon Jun 30 12:09:22 UTC 2014 x86_64
Processor ID : x86 Family 6 Model 62 Stepping 4, GenuineIntel
Virtual Machine : Java 1.7.0_11-b21 with Oracle Corporation Java HotSpot™ 64-Bit Server VM mixed mode
Window System : The X.Org Foundation (11204000), display localhost:10.0
Fault Count: 1
Abnormal termination: Segmentation violation
Register State (from fault):
RAX = 0000000000000000 RBX = 0000000000000064
RCX = 0000000000000020 RDX = 0000805fd34a24d8
RSP = 00007fafded223e0 RBP = 00007fafded224c0
RSI = 0000000000000000 RDI = 0000000000000064
R8 = 0000000000000048 R9 = 0000000000140000
R10 = 0000000000000000 R11 = 00007faff71da9c1
R12 = 0000000000000064 R13 = 00007fafd369adc0
R14 = 00007fafd34a2480 R15 = 00007fafded22850
RIP = 00007fafb12aecee EFL = 0000000000010202
CS = 0033 FS = 0000 GS = 0000
Stack Trace (from fault):
[ 0] 0x00007fafb12aecee /usr/local/MATLAB/R2014a/bin/glnxa64/libmwmathlinalg.so+00343278
[ 1] 0x00007fafb12adfde /usr/local/MATLAB/R2014a/bin/glnxa64/libmwmathlinalg.so+00339934
[ 2] 0x00007fafee46cde3 /usr/local/MATLAB/R2014a/bin/glnxa64/libmwm_dispatcher.so+00417251
[ 3] 0x00007fafee459874 /usr/local/MATLAB/R2014a/bin/glnxa64/libmwm_dispatcher.so+00338036 ZN13Mfh_MATLAB_fn11dispatch_fhEiPP11mxArray_tagiS2+00000244
[ 4] 0x00007fafed69aba3 /usr/local/MATLAB/R2014a/bin/glnxa64/libmwm_interpreter.so+03754915
[ 5] 0x00007fafed5c4101 /usr/local/MATLAB/R2014a/bin/glnxa64/libmwm_interpreter.so+02875649
[ 6] 0x00007fafed5c50ae /usr/local/MATLAB/R2014a/bin/glnxa64/libmwm_interpreter.so+02879662
[ 7] 0x00007fafed5d3019 /usr/local/MATLAB/R2014a/bin/glnxa64/libmwm_interpreter.so+02936857
[ 8] 0x00007fafed5d3183 /usr/local/MATLAB/R2014a/bin/glnxa64/libmwm_interpreter.so+02937219
[ 9] 0x00007fafed709172 /usr/local/MATLAB/R2014a/bin/glnxa64/libmwm_interpreter.so+04206962
[ 10] 0x00007fafed53fdf8 /usr/local/MATLAB/R2014a/bin/glnxa64/libmwm_interpreter.so+02334200
[ 11] 0x00007fafed59d30b /usr/local/MATLAB/R2014a/bin/glnxa64/libmwm_interpreter.so+02716427
[ 12] 0x00007fafee4aac5f /usr/local/MATLAB/R2014a/bin/glnxa64/libmwm_dispatcher.so+00670815 ZN8Mfh_file11dispatch_fhEiPP11mxArray_tagiS2+00001087
[ 13] 0x00007fafed570135 /usr/local/MATLAB/R2014a/bin/glnxa64/libmwm_interpreter.so+02531637
[ 14] 0x00007fafed5370d9 /usr/local/MATLAB/R2014a/bin/glnxa64/libmwm_interpreter.so+02298073
[ 15] 0x00007fafed533dc7 /usr/local/MATLAB/R2014a/bin/glnxa64/libmwm_interpreter.so+02284999
[ 16] 0x00007fafed534193 /usr/local/MATLAB/R2014a/bin/glnxa64/libmwm_interpreter.so+02285971
[ 17] 0x00007fafef2ddafc /usr/local/MATLAB/R2014a/bin/glnxa64/libmwbridge.so+00142076
[ 18] 0x00007fafef2de791 /usr/local/MATLAB/R2014a/bin/glnxa64/libmwbridge.so+00145297 _Z8mnParserv+00000721
[ 19] 0x00007faff858992f /usr/local/MATLAB/R2014a/bin/glnxa64/libmwmcr.so+00489775 _ZN11mcrInstance30mnParser_on_interpreter_threadEv+00000031
[ 20] 0x00007faff856ab6d /usr/local/MATLAB/R2014a/bin/glnxa64/libmwmcr.so+00363373
[ 21] 0x00007faff856abe9 /usr/local/MATLAB/R2014a/bin/glnxa64/libmwmcr.so+00363497
[ 22] 0x00007fafecc69d46 /usr/local/MATLAB/R2014a/bin/glnxa64/libmwuix.so+00343366
[ 23] 0x00007fafecc4c382 /usr/local/MATLAB/R2014a/bin/glnxa64/libmwuix.so+00222082
[ 24] 0x00007faff8cdf50f /usr/local/MATLAB/R2014a/bin/glnxa64/libmwservices.so+02323727
[ 25] 0x00007faff8cdf67c /usr/local/MATLAB/R2014a/bin/glnxa64/libmwservices.so+02324092
[ 26] 0x00007faff8cdb57f /usr/local/MATLAB/R2014a/bin/glnxa64/libmwservices.so+02307455
[ 27] 0x00007faff8ce09b5 /usr/local/MATLAB/R2014a/bin/glnxa64/libmwservices.so+02329013
[ 28] 0x00007faff8ce0de7 /usr/local/MATLAB/R2014a/bin/glnxa64/libmwservices.so+02330087
[ 29] 0x00007faff8ce14c0 /usr/local/MATLAB/R2014a/bin/glnxa64/libmwservices.so+02331840 _Z25svWS_ProcessPendingEventsiib+00000080
[ 30] 0x00007faff856b098 /usr/local/MATLAB/R2014a/bin/glnxa64/libmwmcr.so+00364696
[ 31] 0x00007faff856b3bf /usr/local/MATLAB/R2014a/bin/glnxa64/libmwmcr.so+00365503
[ 32] 0x00007faff856628f /usr/local/MATLAB/R2014a/bin/glnxa64/libmwmcr.so+00344719
[ 33] 0x00007faff7519df3 /lib64/libpthread.so.0+00032243
[ 34] 0x00007faff724701d /lib64/libc.so.6+01007645 clone+00000109
If this problem is reproducible, please submit a Service Request via: http://www.mathworks.com/support/contact_us/
A technical support engineer might contact you with further information.
Thank you for your help.
** This crash report has been saved to disk as /data/jonathan/matlab_crash_dump.39346-1
Caught MathWorks::System::FatalException
[Please exit and restart MATLAB]>>
I tried setting OFFLOAD_REPORT to 1 or 2, following examples I found on the web, but I don't know how to use it from MATLAB, since it seems that the corresponding mkl_ function has to be included in the C code before compilation.
So that is where I am right now, without a clue. However, I noticed that one path included in MIC_LD_LIBRARY_PATH does not exist: /opt/intel/mic/coi/device-linux-release/lib is missing simply because /opt/intel/mic/coi/device-linux-release does not exist. Could that be the reason for the failure?
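On the OFFLOAD_REPORT question, one thing that may be worth trying is exporting the variable in the shell that launches MATLAB, so that no C code is involved. This is only a sketch and rests on an assumption: that the MKL runtime loaded by MATLAB reads OFFLOAD_REPORT from the process environment, as the commented-out line in the profile suggests. The MATLAB launch itself is left as a comment.

```shell
#!/bin/sh
# Assumption: the MKL runtime reads these from the environment of the
# MATLAB process, so they must be exported before MATLAB is started.
export MKL_MIC_ENABLE=1
export OFFLOAD_REPORT=2   # the profile notes that 1 is also available

# Show what a child process started from this shell would inherit.
env | grep -E '^(OFFLOAD_REPORT|MKL_MIC_ENABLE)='

# Then start MATLAB from this same shell so that any report text can
# reach this terminal, for example:
#   matlab -nodesktop
```

Inside MATLAB, getenv('OFFLOAD_REPORT') can then confirm that the value arrived, in the same way as the MKL_MIC_ENABLE check.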
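The missing-directory observation can be checked systematically for every entry in both search paths. A minimal sketch, assuming a POSIX shell; the helper name missing_dirs is made up for this example:

```shell
#!/bin/sh
# Print every directory in a colon-separated search path that does not exist.
# Usage: missing_dirs "$MIC_LD_LIBRARY_PATH"
missing_dirs() {
    old_ifs=$IFS
    IFS=:                      # split the argument on ':' only
    for dir in $1; do
        # Skip empty entries, such as the one left by a trailing ':'.
        [ -z "$dir" ] && continue
        [ -d "$dir" ] || printf 'missing: %s\n' "$dir"
    done
    IFS=$old_ifs
}

# Check both paths set in the profile; unset variables count as empty.
missing_dirs "${MIC_LD_LIBRARY_PATH:-}"
missing_dirs "${LD_LIBRARY_PATH:-}"
```

Any line of output names an entry that does not exist on disk, which also catches a misspelled path component.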
Thank you, Jonathan
Jonathan Berrebi on 29 Oct 2014
I should add that I successfully ran the TestBlas routine from the following tutorial provided by Intel:
https://software.intel.com/sites/default/files/managed/11/1c/Xeon%20Phi%20Support%20in%20Matlab_v2.pdf
However, since I have a single Xeon Phi 5110P (compared with the two more powerful Xeon Phi cards in the PDF), computing on the 10-core Xeon E5-2660 v2 is 0.2 s faster (4.5 s in total) than on the Phi. The offload to the Phi is faster only when I compare it with ordinary MATLAB on the CPU run with the "-singleCompThread" option (12 s). We usually use that option because the server is shared by multiple users.


Answers (0)
