CUDA missing library libcuda.so.1

Asked by Ivan on 19 Nov 2012

Hi, I'm trying to use a GPU, but when I run a command I get this error:

Error using gpuDevice (line 26)
There is a problem with the CUDA driver associated with this GPU device. See
www.mathworks.com/gpudriver to find and install the latest supported driver.
Caused by:
    The CUDA driver could not be loaded. The library name used was: libcuda.so.1. The error was:
    libcuda.so.1: cannot open shared object file: No such file or directory.

My MATLAB client is on a server called "mdcstest" and the GPU is on another called "comp01"; I'm using the MATLAB Job Scheduler. Normal tasks and matlabpool run without problems. On "comp01" I have CUDA 5.0.

What do I need to do to use the GPU on "comp01"?

Thanks.

7 Comments

Thomas Ibbotson on 21 Nov 2012

Ok, just to double check that the code is running on the correct machine can you run:

mjs = parcluster
mjs.matlabpool
spmd, gpuDevice, end
matlabpool close

and paste the results here?

Ivan on 22 Nov 2012

I got this:

mjs = 
   MJS Cluster Information
   =======================
                           Profile: MJS
                          Modified: false
                              Host: comp01
                        NumWorkers: 4
                JobStorageLocation: Database on comp01
                 ClusterMatlabRoot: /shared/software/MATLAB
                   OperatingSystem: unix
   - Assigned Jobs
                    Number Pending: 0
                     Number Queued: 0
                    Number Running: 0
                   Number Finished: 0
   - MJS Specific Properties
                              Name: test
                  AllHostAddresses: 147.232.116.7
                                    fe80:0:0:0:5ef3:fcff:fea9:18d4%3
                                    fe80:0:0:0:200:c9ff:fecd:a66c%5
                    NumBusyWorkers: 0
                    NumIdleWorkers: 4
                          Username: durkac
                     SecurityLevel: 0 (No security)
            HasSecureCommunication: false
Starting matlabpool using the 'MJS' profile ... connected to 1 labs.
Lab 1: 
    ans = 
      parallel.gpu.CUDADevice handle
      Package: parallel.gpu
      Properties:
                          Name: 'Tesla M2070'
                         Index: 1
             ComputeCapability: '2.0'
                SupportsDouble: 1
                 DriverVersion: 5
            MaxThreadsPerBlock: 1024
              MaxShmemPerBlock: 49152
            MaxThreadBlockSize: [1024 1024 64]
                   MaxGridSize: [65535 65535]
                     SIMDWidth: 32
                   TotalMemory: 5.6366e+09
                    FreeMemory: 5.3198e+09
           MultiprocessorCount: 14
                  ClockRateKHz: 1147000
                   ComputeMode: 'Default'
          GPUOverlapsTransfers: 1
        KernelExecutionTimeout: 0
              CanMapHostMemory: 1
               DeviceSupported: 1
                DeviceSelected: 1
Sending a stop signal to all the labs ... stopped.

I didn't use spmd ... end when I was trying to use the GPU. I think that was the problem, right?

Ben Tordoff on 24 Dec 2012

If you didn't put the gpuDevice call inside SPMD then it will run on your client machine (which presumably does not have a GPU or the CUDA drivers). Putting it inside SPMD causes it to run on the worker(s). Equally, trying to use the GPU inside a PARFOR or from within a task function would probably have worked fine as that also happens on the worker.
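
A minimal sketch of the distinction described above, assuming the default cluster profile is the MJS whose workers run on "comp01" and that the pool is opened the same way as in Thomas's comment. The loop bound and variable names are illustrative only. It uses the matlabpool syntax from this thread; later MATLAB releases use parpool instead.

```matlab
% Assumes the default profile points at the MJS cluster on comp01.
mjs = parcluster;
mjs.matlabpool              % open a pool on the cluster workers

% The body of spmd runs on every worker in the pool,
% so gpuDevice sees the worker's GPU and CUDA driver.
spmd
    dev = gpuDevice;
    fprintf('Lab %d uses %s\n', labindex, dev.Name);
end

% The body of a parfor loop also executes on the workers.
names = cell(1, 4);
parfor k = 1:4
    d = gpuDevice;
    names{k} = d.Name;      % only the plain string is returned to the client
end
disp(names);

matlabpool close
```

Calling gpuDevice outside the spmd or parfor blocks in this sketch would run it on the client, which is where the original libcuda.so.1 error came from.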

I believe Thomas's answer below is the correct one.

2 Answers

Answer by Jason Ross on 19 Nov 2012
Edited by Jason Ross on 19 Nov 2012

It sounds like the GPU driver is not installed correctly on comp01. Perhaps you installed the SDK and toolkit and not the driver?

NVIDIA ships a utility called "nvidia-smi" (by default, in /usr/bin) that lists all the installed GPUs in a system. I'm betting that if you run nvidia-smi you'll get the same error at the command line as you do in MATLAB. If things are working properly, you should see something like the following:

% nvidia-smi
Mon Nov 19 15:33:41 2012       
+------------------------------------------------------+                       
| NVIDIA-SMI 4.304.54   Driver Version: 304.54         |                       
|-------------------------------+----------------------+----------------------+
| GPU  Name                     | Bus-Id        Disp.  | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap| Memory-Usage         | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro FX 370            | 0000:01:00.0     N/A |                  N/A |
| 60%   60C  N/A     N/A /  N/A |   4%   11MB /  255MB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla C1060              | 0000:08:00.0     Off |                  N/A |
| 35%   56C    P8    N/A /  N/A |   0%    3MB / 4095MB |      0%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0            Not Supported                                               |
+-----------------------------------------------------------------------------+

3 Comments

Ivan on 20 Nov 2012

I tried what you suggested and I got this:

nvidia-smi
Tue Nov 20 07:22:17 2012
+------------------------------------------------------+
| NVIDIA-SMI 4.304.54   Driver Version: 304.54         |
|-------------------------------+----------------------+----------------------+
| GPU  Name                     | Bus-Id        Disp.  | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap| Memory-Usage         | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla M2070              | 0000:15:00.0     Off |                    0 |
| N/A   N/A    P0    N/A /  N/A |   2%   84MB / 5375MB |      0%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0     17291  /shared/software/MATLAB/bin/glnxa64/MATLAB            72MB  |
+-----------------------------------------------------------------------------+

So the driver is installed properly, right?

Edric Ellis on 22 Nov 2012

As you can see, that nvidia-smi output shows that MATLAB is already accessing the GPU. Perhaps the device is in exclusive mode? (Can I also just confirm that you ran this command on the cluster node "comp01"?) If you run

nvidia-smi -q

you should be able to see what compute mode the device is in.
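
The same two checks can also be made from inside MATLAB. This is only a sketch, assuming a pool is already open on the cluster workers (for example via mjs.matlabpool) and that the workers run on a Unix machine where the hostname command is available. ComputeMode is the gpuDevice property that appears in the device listing elsewhere in this thread.

```matlab
% Assumes a matlabpool is already open on the cluster workers.
spmd
    [~, host] = system('hostname');   % which machine this worker runs on
    dev = gpuDevice;                  % the GPU this worker has selected
    fprintf('Lab %d on %s: %s, ComputeMode = %s\n', ...
        labindex, strtrim(host), dev.Name, dev.ComputeMode);
end
```

If the host printed is not "comp01", the worker is not running on the machine that has the GPU.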

Ivan on 22 Nov 2012

I got this:

==============NVSMI LOG==============
Timestamp                       : Thu Nov 22 13:47:21 2012
Driver Version                  : 304.54
Attached GPUs                   : 1
GPU 0000:15:00.0
    Product Name                : Tesla M2070
    Display Mode                : Disabled
    Persistence Mode            : Disabled
    Driver Model
        Current                 : N/A
        Pending                 : N/A
    Serial Number               : 0323111075641
    GPU UUID                    : GPU-e4f9d738-18d5-866c-f4f0-268427c2d877
    VBIOS Version               : 70.00.3E.00.03
    Inforom Version
        Image Version           : N/A
        OEM Object              : 1.0
        ECC Object              : 1.0
        Power Management Object : 1.0
    GPU Operation Mode
        Current                 : N/A
        Pending                 : N/A
    PCI
        Bus                     : 0x15
        Device                  : 0x00
        Domain                  : 0x0000
        Device Id               : 0x06D210DE
        Bus Id                  : 0000:15:00.0
        Sub System Id           : 0x083010DE
        GPU Link Info
            PCIe Generation
                Max             : 2
                Current         : 2
            Link Width
                Max             : 16x
                Current         : 16x
    Fan Speed                   : N/A
    Performance State           : P0
    Clocks Throttle Reasons     : N/A
    Memory Usage
        Total                   : 5375 MB
        Used                    : 3007 MB
        Free                    : 2368 MB
    Compute Mode                : Default
    Utilization
        Gpu                     : 35 %
        Memory                  : 10 %
    Ecc Mode
        Current                 : Enabled
        Pending                 : Enabled
    ECC Errors
        Volatile
            Single Bit
                Device Memory   : 0
                Register File   : 0
                L1 Cache        : 0
                L2 Cache        : 0
                Texture Memory  : N/A
                Total           : 0
            Double Bit
                Device Memory   : 0
                Register File   : 0
                L1 Cache        : 0
                L2 Cache        : 0
                Texture Memory  : N/A
                Total           : 0
        Aggregate
            Single Bit
                Device Memory   : N/A
                Register File   : N/A
                L1 Cache        : N/A
                L2 Cache        : N/A
                Texture Memory  : N/A
                Total           : 0
            Double Bit
                Device Memory   : N/A
                Register File   : N/A
                L1 Cache        : N/A
                L2 Cache        : N/A
                Texture Memory  : N/A
                Total           : 0
    Temperature
        Gpu                     : N/A
    Power Readings
        Power Management        : N/A
        Power Draw              : N/A
        Power Limit             : N/A
        Default Power Limit     : N/A
        Min Power Limit         : N/A
        Max Power Limit         : N/A
    Clocks
        Graphics                : 573 MHz
        SM                      : 1147 MHz
        Memory                  : 1566 MHz
    Applications Clocks
        Graphics                : N/A
        Memory                  : N/A
    Max Clocks
        Graphics                : 573 MHz
        SM                      : 1147 MHz
        Memory                  : 1566 MHz
    Compute Processes
        Process ID              : 10911
            Name                : /shared/software/MATLAB/bin/glnxa64/MATLAB
            Used GPU Memory     : 453 MB
        Process ID              : 11137
            Name                : /shared/software/MATLAB/bin/glnxa64/MATLAB
            Used GPU Memory     : 1041 MB
        Process ID              : 10803
            Name                : /shared/software/MATLAB/bin/glnxa64/MATLAB
            Used GPU Memory     : 453 MB
        Process ID              : 11026
            Name                : /shared/software/MATLAB/bin/glnxa64/MATLAB
            Used GPU Memory     : 1041 MB
Answer by Thomas Ibbotson on 23 Nov 2012

When you open a matlabpool and use 'spmd', the code in that block is run on all the workers in the pool. As your workers are running on 'comp01', which has the GPU, this means that the GPU code will be able to run. Without 'spmd', any code you run executes on your local client machine; in your case the client had neither the GPU driver nor a supported GPU, so the call failed.

Note that spmd is not the only way to run code on the cluster; you can also use the 'batch' function. In this case you give 'batch' the name of a script you want to run, and it runs on one of the workers on the cluster. For example:

mjs = parcluster;
job = mjs.batch('myScript');
wait(job);
load(job);

Where 'myScript' has the code you want to run on the GPU on 'comp01'.
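
As an illustration only, 'myScript' might look something like the sketch below. The file contents, matrix size, and variable names are hypothetical and not taken from the original question. Since load(job) copies the script's workspace variables back to the client, the sketch gathers the result and clears the GPU-side variables at the end, on the assumption that the client has no GPU and may not be able to load them.

```matlab
% myScript.m (hypothetical contents; variable names are illustrative)
dev = gpuDevice;                 % the worker's GPU on comp01
deviceName = dev.Name;           % plain string, safe to load on the client

A = gpuArray(rand(1000));        % copy data to the GPU
B = A * A;                       % matrix product computed on the GPU
result = gather(B);              % bring the result back to host memory

% Keep only plain host variables so that load(job) on a
% client without a GPU does not have to restore GPU objects.
clear dev A B
```

After wait(job) and load(job), the client workspace would then contain deviceName and result.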

For more information about 'batch' see the batch processing documentation, and for 'spmd' see the spmd documentation.

0 Comments
