Troubleshooting

This section offers advice on solving problems you might encounter with MATLAB® Distributed Computing Server™ software.

License Errors

When starting a MATLAB® worker, a licensing problem might result in the message

License checkout failed. No such FEATURE exists. 
License Manager Error -5

There are many reasons why you might receive this error:

Verifying Multicast Communications

Multicast, unlike unicast TCP or UDP traffic, is subscription-based: a number of machines on a network indicate to the network their interest in particular packets originating somewhere on that network. By contrast, unicast UDP and TCP packets are always bound for a single machine, usually indicated by its IP address.

The main tools for investigating this type of packet are tcpdump or the equivalent on Microsoft® Windows® operating systems (usually called winpcap and ethereal), and a Java™ class included with Version 3 of the parallel computing products.

The class is called com.mathworks.toolbox.distcomp.test.MulticastTester. Both its static main method and its constructor take two input arguments: the multicast group to join and the port number to use.

This Java class has a number of simple methods to attempt to join a specified multicast group. Once the class has successfully joined the group, it has methods to send messages to the group, listen for messages from the group, and display what it receives. The class can be used both inside MATLAB and from a call to Java software.

Inside MATLAB, the class would be used as follows.

m = com.mathworks.toolbox.distcomp.test.MulticastTester('239.1.1.1', 9999);
m.startSendingThread;
m.startListeningThread;
 0 : host1name : 0
 1 : host2name : 0

From a shell prompt, you would type (assuming that java is on your path)

java -cp distcomp.jar com.mathworks.toolbox.distcomp.test.MulticastTester 
0 : host1name : 0
1 : host2name : 0
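
If the Java class is not at hand, the same join, send, and listen steps can be approximated with a short Python sketch. This is not the MulticastTester implementation; the function names below are hypothetical, and the group and port in the usage note are simply the values from the MATLAB example above.

```python
import ipaddress
import socket
import struct


def is_multicast_group(group: str) -> bool:
    """Return True if `group` is an IPv4 multicast address (224.0.0.0/4)."""
    return ipaddress.IPv4Address(group).is_multicast


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the 8-byte ip_mreq structure used to subscribe to `group`."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def open_listener(group: str, port: int, timeout: float = 5.0) -> socket.socket:
    """Bind a UDP socket to `port` and join the multicast `group`."""
    if not is_multicast_group(group):
        raise ValueError(f"{group} is not a multicast address")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    sock.settimeout(timeout)
    return sock


def send_probe(group: str, port: int, message: bytes, ttl: int = 1) -> None:
    """Send one datagram to the group; a TTL of 1 keeps it on the local subnet."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP) as sock:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL,
                        struct.pack("b", ttl))
        sock.sendto(message, (group, port))


def probe_once(group: str, port: int):
    """Join the group, send this host's name, and return the first packet seen.

    Returns a (sender_ip, text) tuple, or None if nothing arrives in time.
    """
    listener = open_listener(group, port)
    try:
        send_probe(group, port, socket.gethostname().encode())
        data, sender = listener.recvfrom(1024)
        return sender[0], data.decode(errors="replace")
    except socket.timeout:
        return None
    finally:
        listener.close()
```

Calling probe_once('239.1.1.1', 9999) on two hosts at about the same time should return the other host's name on each if multicast traffic passes between them. A return value of None suggests that packets are being dropped somewhere on the network, though a short timeout can also cause it.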

Memory Errors on UNIX® Operating Systems

If the number of threads created by the server services on a UNIX®-based machine exceeds the limitation set by the maxproc value, the services will fail and generate an out-of-memory error. You can check your maxproc value on a UNIX-based system with the limit command. (Different versions of UNIX software might have different names for this property instead of maxproc, such as descriptors on Solaris™ operating systems.)
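
As a rough alternative to the limit command, the following Python sketch reads the same kind of per-process limits through the standard resource module. It assumes that maxproc corresponds to RLIMIT_NPROC and descriptors to RLIMIT_NOFILE, which holds on common UNIX variants but is not guaranteed everywhere; the helper names are hypothetical.

```python
import resource


def describe_limit(value: int) -> str:
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def process_limits() -> dict:
    """Return the soft and hard limits that are defined on this platform."""
    limits = {}
    # Assumed mapping between the names shown by `limit` and resource constants.
    for label, name in (("maxproc", "RLIMIT_NPROC"),
                        ("descriptors", "RLIMIT_NOFILE")):
        constant = getattr(resource, name, None)  # not every platform defines both
        if constant is not None:
            soft, hard = resource.getrlimit(constant)
            limits[label] = (describe_limit(soft), describe_limit(hard))
    return limits


for label, (soft, hard) in process_limits().items():
    print(f"{label}: soft={soft} hard={hard}")
```

Run this as the user that starts the server services, because limits are set per user and per shell session.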

Running Server Processes from a Windows®-Based Network Installation

Many networks are configured not to allow LocalSystem to have access to UNC or mapped network shares. In this case, run the mdce process under a different user with rights to log on as a service. See Setting the User.

Required Ports

Using a Job Manager

BASE_PORT.   The ports required by the job manager and all workers are specified and described in the mdce_def file. See the following file in the MATLAB installation used for each cluster process:

Parallel Jobs.   The range of ports on UNIX-based worker machines required by MPICH for the running of parallel jobs is from BASE_PORT + 1000 up to BASE_PORT + 2000.
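
When you open firewall ports, it can help to compute this range from the BASE_PORT value in your mdce_def file. The sketch below does only that arithmetic; the function name is hypothetical and 27350 is used purely as an example value, so substitute the BASE_PORT from your own installation.

```python
def parallel_job_port_range(base_port: int) -> tuple:
    """Return (lowest, highest) port used by MPICH for parallel jobs."""
    low, high = base_port + 1000, base_port + 2000
    if low < 1 or high > 65535:
        raise ValueError(f"BASE_PORT {base_port} gives a range outside 1-65535")
    return low, high


# Example value only; read the real BASE_PORT from your mdce_def file.
low, high = parallel_job_port_range(27350)
print(f"Open ports {low} through {high} on each worker machine")
```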

Using a Third-Party Scheduler

Before the worker processes start, you can control the range of ports used by the workers for parallel jobs by defining the environment variable MPICH_PORT_RANGE with the value minport:maxport.
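
One way to apply this is to build the environment for the worker processes before they are launched. The Python sketch below only constructs that environment; the function name and the port numbers are hypothetical, and how the environment reaches the workers depends on your scheduler.

```python
import os


def worker_environment(minport: int, maxport: int, base_env=None) -> dict:
    """Return a copy of the environment with MPICH_PORT_RANGE set."""
    if not 1 <= minport <= maxport <= 65535:
        raise ValueError("expected 1 <= minport <= maxport <= 65535")
    env = dict(os.environ if base_env is None else base_env)
    env["MPICH_PORT_RANGE"] = f"{minport}:{maxport}"  # format is minport:maxport
    return env


# Example values only; choose a range that your firewall allows.
env = worker_environment(30000, 31000)
print(env["MPICH_PORT_RANGE"])
```

The resulting dictionary can be passed as the env argument of subprocess.Popen for whatever command your scheduler uses to start workers.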

Client Ports

You can specify the ports used by the client with the pctconfig function. With this function you can set ports separately for communication with the job manager and for communication with pmode or a matlabpool, if the default ports cannot be used.

Ephemeral TCP Ports with Job Manager

If you use the job manager on a Windows-based cluster, you must make sure that a large number of ephemeral TCP ports are available on the job manager machine. By default, the maximum valid ephemeral TCP port number on a Windows operating system is 5000, but transfers of large data sets might fail if this setting is not increased. In particular, if your cluster has 32 or more workers, you should increase the maximum valid ephemeral TCP port number, with the following procedure:

  1. Start the Registry Editor.

  2. Locate the following subkey in the registry, and then click Parameters:

    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
  3. On the Edit menu, click New, and then add the following registry entry:

    Value Name: MaxUserPort
    Value Type: DWORD
    Value data: 65534
    Valid Range: 5000-65534 (decimal)
    Default: 0x1388 (5000 decimal)
    Description: This parameter controls the maximum port number that is 
    used when a program requests any available user port from the system. 
    Typically, ephemeral (short-lived) ports are allocated between the 
    values of 1024 and 5000 inclusive.
  4. Quit Registry Editor.

  5. Reboot your machine.
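
If you need to apply the same setting on several machines, the registry entry above can also be written as a .reg file and imported instead of being typed into the Registry Editor. The Python sketch below only generates the text of such a file, assuming the standard "Windows Registry Editor Version 5.00" format; the function name is hypothetical, and you should review the file before importing it.

```python
KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"


def max_user_port_reg(value: int = 65534) -> str:
    """Return .reg file text that sets MaxUserPort to `value`."""
    if not 5000 <= value <= 65534:
        raise ValueError("MaxUserPort must be in the range 5000-65534")
    return (
        "Windows Registry Editor Version 5.00\r\n"
        "\r\n"
        f"[{KEY}]\r\n"
        f'"MaxUserPort"=dword:{value:08x}\r\n'  # DWORD data is written in hex
    )


print(max_user_port_reg())
```

A reboot is still required after the entry is imported.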

  


 © 1984-2008 The MathWorks, Inc.