MATLAB Answers

CUDA missing library libcuda.so.1

Ivan on 19 Nov 2012
Commented: rahul srivastava on 14 May 2014
Hi, I'm trying to use the GPU, but when I run a command I get this error:
Error using gpuDevice (line 26)
There is a problem with the CUDA driver associated with this GPU device. See
www.mathworks.com/gpudriver to find and install the latest supported driver.
Caused by:
The CUDA driver could not be loaded. The library name used was: libcuda.so.1. The error was:
libcuda.so.1: cannot open shared object file: No such file or directory.
My MATLAB client is on a server called "mdcstest" and the GPU is on another server called "comp01". I'm using the MATLAB Job Scheduler. Normal tasks and matlabpool run without problems. "comp01" has CUDA 5.0 installed.
What do I need to do to use the GPU on "comp01"?
Thanks.

  8 Comments

Ivan on 22 Nov 2012
I got this:
mjs =
MJS Cluster Information
=======================
Profile: MJS
Modified: false
Host: comp01
NumWorkers: 4
JobStorageLocation: Database on comp01
ClusterMatlabRoot: /shared/software/MATLAB
OperatingSystem: unix
- Assigned Jobs
Number Pending: 0
Number Queued: 0
Number Running: 0
Number Finished: 0
- MJS Specific Properties
Name: test
AllHostAddresses: 147.232.116.7
fe80:0:0:0:5ef3:fcff:fea9:18d4%3
fe80:0:0:0:200:c9ff:fecd:a66c%5
NumBusyWorkers: 0
NumIdleWorkers: 4
Username: durkac
SecurityLevel: 0 (No security)
HasSecureCommunication: false
Starting matlabpool using the 'MJS' profile ... connected to 1 labs.
Lab 1:
ans =
parallel.gpu.CUDADevice handle
Package: parallel.gpu
Properties:
Name: 'Tesla M2070'
Index: 1
ComputeCapability: '2.0'
SupportsDouble: 1
DriverVersion: 5
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [65535 65535]
SIMDWidth: 32
TotalMemory: 5.6366e+09
FreeMemory: 5.3198e+09
MultiprocessorCount: 14
ClockRateKHz: 1147000
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 0
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1
Sending a stop signal to all the labs ... stopped.
I didn't use spmd ... end when I was trying to use the GPU. I think that was the problem, right?
Ben Tordoff on 24 Dec 2012
If you didn't put the gpuDevice call inside SPMD then it will run on your client machine (which presumably does not have a GPU or the CUDA drivers). Putting it inside SPMD causes it to run on the worker(s). Equally, trying to use the GPU inside a PARFOR or from within a task function would probably have worked fine as that also happens on the worker.
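A minimal sketch of the difference, assuming the default cluster profile is the MJS cluster whose workers run on "comp01" and using the matlabpool syntax from this thread (newer MATLAB releases use parpool instead). The variable names are illustrative only.

```matlab
% Sketch only: assumes the default profile is the MJS cluster whose
% workers run on comp01, and the matlabpool syntax used in this thread.
mjs = parcluster;          % cluster object for the default profile
mjs.matlabpool;            % open a pool of workers on the cluster

% Calling gpuDevice at this point would run on the client and fail
% there, because the client has no CUDA driver.

spmd
    % Everything inside spmd runs on the workers, which can see the GPU.
    d = gpuDevice;         % select the default GPU on this worker
    name = d.Name;         % plain char array, safe to send to the client
    fprintf('Lab %d sees %s\n', labindex, name);
end

% name is a Composite on the client; index it to fetch one worker's value
% before the pool is closed.
firstName = name{1};
disp(firstName);

matlabpool close           % release the workers
```

Only the device name is copied back to the client here; the device object itself stays on the worker.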
I believe Thomas's answer below is the correct one.
rahul srivastava on 14 May 2014
In my case the error below is shown, and gpuDeviceCount returns 0. Please help me solve this problem. I ran:
mjs = parcluster
mjs.matlabpool
spmd, gpuDevice, end
matlabpool close
mjs =
Local Cluster Information
=========================
Profile: local
Modified: false
Host: robo-PC
NumWorkers: 2
JobStorageLocation: C:\Users\robo\AppData\Roaming\MathWorks\MATLAB\local_cluster_jobs\R2012b
ClusterMatlabRoot: C:\Program Files\MATLAB\R2012b
OperatingSystem: windows
RequiresMathWorksHostedLicensing: false
- Assigned Jobs
Number Pending: 0
Number Queued: 1
Number Running: 1
Number Finished: 0
Error using parallel.Cluster/matlabpool (line 64) Failed to open matlabpool. (For information in addition to the causing error, validate the profile 'local' in the Cluster Profile Manager.)
Caused by: Error using distcomp.interactiveclient/start (line 11) Found an interactive session. You cannot have multiple interactive sessions open simultaneously. To terminate the existing session, use 'matlabpool close'.
>>


Answers (2)

Jason Ross on 19 Nov 2012
Edited: Jason Ross on 19 Nov 2012
It sounds like the GPU driver is not installed correctly on comp01. Perhaps you installed the SDK and toolkit and not the driver?
NVIDIA ships a utility called "nvidia-smi" (by default in /usr/bin) that lists all the installed GPUs in a system. I'm betting you'll get the same error at the command line as you do in MATLAB if you run nvidia-smi. If things are working properly, you should see something like the following:
% nvidia-smi
Mon Nov 19 15:33:41 2012
+------------------------------------------------------+
| NVIDIA-SMI 4.304.54 Driver Version: 304.54 |
|-------------------------------+----------------------+----------------------+
| GPU Name | Bus-Id Disp. | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Quadro FX 370 | 0000:01:00.0 N/A | N/A |
| 60% 60C N/A N/A / N/A | 4% 11MB / 255MB | N/A Default |
+-------------------------------+----------------------+----------------------+
| 1 Tesla C1060 | 0000:08:00.0 Off | N/A |
| 35% 56C P8 N/A / N/A | 0% 3MB / 4095MB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Compute processes: GPU Memory |
| GPU PID Process name Usage |
|=============================================================================|
| 0 Not Supported |
+-----------------------------------------------------------------------------+
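The same check can also be scripted from MATLAB. This is a rough sketch, assuming nvidia-smi is on the system path of the machine where the code runs; it only reports whether the utility ran successfully and does not parse the table.

```matlab
% Sketch: run nvidia-smi from MATLAB and report whether it succeeded.
% Assumes nvidia-smi is on the system path. Run this on the GPU node
% (for example inside spmd or a batch job), not on a client without a GPU.
[status, output] = system('nvidia-smi');
if status == 0
    disp(output);      % the utility ran: output lists the installed GPUs
else
    % Non-zero exit status: the utility is missing or the driver is not working.
    fprintf('nvidia-smi failed (exit status %d):\n%s\n', status, output);
end
```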

  3 Comments

Ivan on 20 Nov 2012
I tried what you suggested and I got this:
nvidia-smi
Tue Nov 20 07:22:17 2012
+------------------------------------------------------+
| NVIDIA-SMI 4.304.54 Driver Version: 304.54 |
|-------------------------------+----------------------+----------------------+
| GPU Name | Bus-Id Disp. | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla M2070 | 0000:15:00.0 Off | 0 |
| N/A N/A P0 N/A / N/A | 2% 84MB / 5375MB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Compute processes: GPU Memory |
| GPU PID Process name Usage |
|=============================================================================|
| 0 17291 /shared/software/MATLAB/bin/glnxa64/MATLAB 72MB |
+-----------------------------------------------------------------------------+
So the driver is installed properly, isn't it?
Edric Ellis on 22 Nov 2012
As you can see, that nvidia-smi output shows that MATLAB is already accessing the GPU. Perhaps the device is in exclusive mode? (Can I also just confirm that you ran this command on the cluster node "comp01"?) If you run
nvidia-smi -q
you should be able to see what compute mode the device is in.
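The compute mode can also be read from inside MATLAB through the ComputeMode property of the device object. A small sketch, assuming a matlabpool is already open on the cluster so that the spmd block runs on the workers:

```matlab
% Sketch: read the compute mode on each worker. Assumes a matlabpool is
% already open on the cluster, so this block runs on the GPU node.
spmd
    d = gpuDevice;           % select the default GPU on this worker
    mode = d.ComputeMode;    % 'Default' means several processes may share the GPU
    fprintf('Lab %d: %s, compute mode %s\n', labindex, d.Name, mode);
end
```

If the device is in an exclusive mode and another process already holds it, the gpuDevice call itself is likely to fail, which is a useful hint on its own.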
Ivan on 22 Nov 2012
I got this:
==============NVSMI LOG==============
Timestamp : Thu Nov 22 13:47:21 2012
Driver Version : 304.54
Attached GPUs : 1
GPU 0000:15:00.0
Product Name : Tesla M2070
Display Mode : Disabled
Persistence Mode : Disabled
Driver Model
Current : N/A
Pending : N/A
Serial Number : 0323111075641
GPU UUID : GPU-e4f9d738-18d5-866c-f4f0-268427c2d877
VBIOS Version : 70.00.3E.00.03
Inforom Version
Image Version : N/A
OEM Object : 1.0
ECC Object : 1.0
Power Management Object : 1.0
GPU Operation Mode
Current : N/A
Pending : N/A
PCI
Bus : 0x15
Device : 0x00
Domain : 0x0000
Device Id : 0x06D210DE
Bus Id : 0000:15:00.0
Sub System Id : 0x083010DE
GPU Link Info
PCIe Generation
Max : 2
Current : 2
Link Width
Max : 16x
Current : 16x
Fan Speed : N/A
Performance State : P0
Clocks Throttle Reasons : N/A
Memory Usage
Total : 5375 MB
Used : 3007 MB
Free : 2368 MB
Compute Mode : Default
Utilization
Gpu : 35 %
Memory : 10 %
Ecc Mode
Current : Enabled
Pending : Enabled
ECC Errors
Volatile
Single Bit
Device Memory : 0
Register File : 0
L1 Cache : 0
L2 Cache : 0
Texture Memory : N/A
Total : 0
Double Bit
Device Memory : 0
Register File : 0
L1 Cache : 0
L2 Cache : 0
Texture Memory : N/A
Total : 0
Aggregate
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : 0
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : 0
Temperature
Gpu : N/A
Power Readings
Power Management : N/A
Power Draw : N/A
Power Limit : N/A
Default Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : 573 MHz
SM : 1147 MHz
Memory : 1566 MHz
Applications Clocks
Graphics : N/A
Memory : N/A
Max Clocks
Graphics : 573 MHz
SM : 1147 MHz
Memory : 1566 MHz
Compute Processes
Process ID : 10911
Name : /shared/software/MATLAB/bin/glnxa64/MATLAB
Used GPU Memory : 453 MB
Process ID : 11137
Name : /shared/software/MATLAB/bin/glnxa64/MATLAB
Used GPU Memory : 1041 MB
Process ID : 10803
Name : /shared/software/MATLAB/bin/glnxa64/MATLAB
Used GPU Memory : 453 MB
Process ID : 11026
Name : /shared/software/MATLAB/bin/glnxa64/MATLAB
Used GPU Memory : 1041 MB

Thomas Ibbotson on 23 Nov 2012
When you open a matlabpool and use 'spmd', the code in that block is run on all the workers in the pool. As your workers are running on 'comp01', which has the GPU, the GPU code is able to run. Without the 'spmd', any code you run executes on your local machine, which in your case had neither the GPU driver nor a supported GPU, so it failed.
Note that spmd is not the only way to run code on the cluster; you can also use the 'batch' function. In this case you give 'batch' the name of a script you want to run, and that script will run on one of the workers on the cluster. For example:
mjs = parcluster;
job = mjs.batch('myScript');
wait(job);
load(job);
Where 'myScript' has the code you want to run on the GPU on 'comp01'.
For more information about 'batch' see the batch processing documentation, and for 'spmd' see the spmd documentation.
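If a separate script file is inconvenient, 'batch' also accepts a function handle together with the number of outputs and a cell array of inputs. The sketch below assumes the default profile is the MJS cluster with workers on 'comp01'; the anonymous function is purely illustrative and just sums the numbers 1 to 10 on the GPU.

```matlab
% Sketch: run a small GPU computation on one cluster worker via batch.
% Assumes the default profile is the MJS cluster with workers on comp01.
mjs = parcluster;

% Function form of batch: function handle, number of outputs, inputs.
% The anonymous function is illustrative; it sums 1..10 on the GPU and
% gathers the result back into ordinary worker memory.
job = mjs.batch(@() gather(sum(gpuArray(1:10))), 1, {});

wait(job);                      % block until the job has finished
results = fetchOutputs(job);    % cell array of the job's output arguments
total = results{1};             % 55 if the GPU computation succeeded
delete(job);                    % remove the job from the cluster's storage
```

Returning a plain number rather than a gpuArray or a device object keeps the result usable on a client that has no GPU.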

  0 Comments
