Configure for an MJS

Configure Cluster to Use a MATLAB Job Scheduler (MJS)

The mdce service must be running on all machines being used for MATLAB® job schedulers (MJS) or workers. This service manages the MJS and worker processes. One of the major tasks of the mdce service is to recover the MJS and worker sessions after a system crash, so that jobs and tasks are not lost as a result of the crash.

The following figure shows the processes that run on your cluster nodes.

    Note   The MATLAB job scheduler (MJS) was formerly known as the MathWorks job manager. The process is the same, is started in the same way, and performs the same functions.

In the following instructions, matlabroot refers to the location of your installed MATLAB Distributed Computing Server™ software. Wherever this term appears in the instructions that follow, substitute the path to your installation.

Step 1: Set Up Windows Cluster Hosts

If this is the first installation of MATLAB Distributed Computing Server on a cluster of Windows machines, you need to configure these hosts for job communications.

    Note   If you do not have a Windows cluster, or if you have already installed a previous version of MATLAB Distributed Computing Server on your Windows cluster, you can skip this step and proceed to Step 2.

Configure Windows Firewalls.  If you are using Windows® firewalls on your cluster nodes,

  1. Log in as a user with administrator privileges.

  2. Execute the following in a DOS command window.

    matlabroot\toolbox\distcomp\bin\addMatlabToWindowsFirewall.bat
    

    This command adds MATLAB as an allowed program. If you are using other firewalls, you must configure them to allow MATLAB in a similar way.

Configure User Access to Installation.  The user that mdce runs as requires access to the cluster MATLAB installation location. By default, mdce runs as the user LocalSystem. If your network allows LocalSystem to access the install location, you can proceed to the next step. (If you are not sure of your network configuration and the access provided for LocalSystem, contact the MathWorks install support team.)

    Note   If LocalSystem cannot access the install location, you must run mdce as a different user.

You can set a different user with these steps:

  1. With any standard text editor (such as WordPad) open the mdce_def file found at:

    matlabroot\toolbox\distcomp\bin\mdce_def.bat
  2. Find the line for setting the MDCEUSER parameter, and provide a value in the form domain\username:

    set MDCEUSER=mydomain\myusername
  3. Provide the user password by setting the MDCEPASS parameter:

    set MDCEPASS=password
  4. Save the file. Proceed to the next step.

Step 2: Stop mdce Services of Old Installation

If you have an older version of MATLAB Distributed Computing Server running on your cluster nodes, you should stop the mdce services before starting the services of the new installation.

Stop mdce on Windows.  If this is your first installation of the parallel computing products, proceed to Step 3: Start the mdce Service, MJS, and Workers.

  1. Open a DOS command window with the necessary privileges:

    1. If you are using Windows 7 or Windows Vista™, you must run the command window with administrator privileges. Click the Windows menu Start > (All) Programs > Accessories; then right-click Command Prompt, and select Run as Administrator. This option is available only if you are running User Account Control (UAC).

    2. If you are using Windows XP, open a DOS command window by selecting the Windows menu Start > Run, then in the Open field, type

      cmd
      
  2. In the command window, navigate to the folder of the old installation that contains the control scripts.

    cd oldmatlabroot\toolbox\distcomp\bin
    
  3. Stop and uninstall the old mdce service and remove its associated files by typing the command:

    mdce uninstall -clean
    

      Note   Using the -clean flag permanently removes all existing job data. Be sure this data is no longer needed before removing it.

  4. Repeat the instructions of this step on all worker nodes.

Stop mdce on UNIX.  

  1. Log in as root. (If you cannot log in as root, you must alter the following parameters in the matlabroot/toolbox/distcomp/bin/mdce_def.sh file to point to a folder for which you have write privileges: CHECKPOINTBASE, LOGBASE, PIDBASE, and LOCKBASE if applicable.)

  2. On each cluster node, stop the mdce service and remove its associated files by typing the commands:

    cd oldmatlabroot/toolbox/distcomp/bin
    ./mdce stop -clean

      Note   Using the -clean flag permanently removes all existing job data. Be sure this data is no longer needed before removing it.
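
If you cannot log in as root, it can help to confirm that the folders you plan to use for CHECKPOINTBASE, LOGBASE, PIDBASE, and LOCKBASE are writable before you edit mdce_def.sh. The following is a minimal sketch and not part of the product: check_writable is a hypothetical helper, and the folder names under the temporary base folder are placeholders chosen for illustration.

```shell
#!/bin/sh
# Report which of the given folders the current user can write to.
# Returns 0 only if every folder exists and is writable.
check_writable() {
    rc=0
    for dir in "$@"; do
        if [ -d "$dir" ] && [ -w "$dir" ]; then
            echo "ok: $dir"
        else
            echo "not writable: $dir"
            rc=1
        fi
    done
    return "$rc"
}

# Example: a hypothetical per-user base folder with one subfolder per parameter.
base="${TMPDIR:-/tmp}/mdce_demo_$$"
mkdir -p "$base/checkpoint" "$base/log" "$base/pid" "$base/lock"
check_writable "$base/checkpoint" "$base/log" "$base/pid" "$base/lock"
rm -rf "$base"
```

In practice you would pass the folders you intend to assign to the four parameters instead of the temporary ones created here.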

Step 3: Start the mdce Service, MJS, and Workers

You can start the MJS (job manager) by using a GUI or the command line. Choose one:

Using Admin Center GUI.  

  1. Identify Hosts and Start the mdce Service

    1. To open Admin Center, navigate to the folder:

      matlabroot\toolbox\distcomp\bin (on Windows)

      matlabroot/toolbox/distcomp/bin (on UNIX)

      Then execute the file:

      admincenter.bat (on Windows)

      admincenter (on UNIX)

        Note   Starting the mdce service on remote machines from Admin Center requires that you run Admin Center as a user who has administrator privileges on all the machines.

      If there are no past sessions of Admin Center saved for you, the GUI opens with a blank listing, superimposed by a welcome dialog box, which provides information on how to get started.

    2. Click Add or Find.

      The Add or Find Hosts dialog box opens.

    3. Select Enter Hostnames, then list your hosts in the text box. You can use short host names, fully qualified domain names, or individual IP addresses. The following figure shows an example using host names node1, node2, node3, and node4. In your case, use your own host names.

      Keep the check to start mdce service.

    4. Click OK to open the Start mdce service dialog box. Proceed through the steps clicking Next and checking the settings at each step. For most settings, the default is appropriate.

      It might take a moment for Admin Center to communicate with all the nodes, start the services, and acquire the status of all of them. When Admin Center completes the update, the listing should look something like the following figure.

    5. At this point, you should test the connectivity between the nodes. This ensures that your cluster can perform the necessary communications for running other MDCS processes.

      In the Hosts module, click Test Connectivity.

    6. When the Connectivity Testing dialog box opens, it shows the results of the last test, if there are any. Click Run to run the tests and generate new data.

      If any of the connectivity tests fail, double-click the icon that indicates a failure to get information about that specific test; or use the Log tab to get all test results. With this information, you can refer to Troubleshoot Common Problems. If you need further help, contact the MathWorks install support team.

    7. If your tests pass, click Close to return to the Admin Center GUI.

  2. Start the MJS

    1. To start an MJS (job manager), click Start in the MJS module. (This is one of several ways to open the New MJS dialog box.) In the New MJS dialog box, specify a name and host for your MJS. This example shows an MJS called MyMJS to run on host node1.

    2. Click OK to start the MJS and return to the Admin Center GUI.

  3. Start the Workers

    1. To start workers, click Start in the Workers module. (This is one of several ways to open the Start Workers dialog box.)

    2. In the Start Workers dialog box, specify the number of workers to start on each host. The number is up to you, but you cannot exceed the total number of licenses you have. A good starting value might be to start one worker per computational core on your hosts.

    3. Select the hosts to start the workers on. Click Select All if you want to start workers on all listed hosts.

    4. Select the MJS for your workers. If you have only one MJS running in this Admin Center session, that is the default.

      The following example shows a setup for starting eight workers on four hosts (two workers each). Your names and numbers will vary.

    5. Click OK to start the workers and return to the Admin Center dialog box. It might take a moment for Admin Center to initialize all the workers and acquire their status.
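
One way to pick a starting value for the number of workers on a UNIX host is to take the smaller of the core count and the number of worker licenses you have. The sketch below assumes that getconf reports the online processor count on your platform (it falls back to 1 otherwise); suggest_workers is a hypothetical helper, and the license count of 8 is a placeholder.

```shell
#!/bin/sh
# Print the smaller of the core count and the number of available worker licenses.
# Usage: suggest_workers <cores> <licenses>
suggest_workers() {
    n_cores=$1
    n_licenses=$2
    if [ "$n_cores" -le "$n_licenses" ]; then
        echo "$n_cores"
    else
        echo "$n_licenses"
    fi
}

# Core count on this host; falls back to 1 if getconf cannot report it.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
echo "cores on this host: $cores"
echo "suggested workers with 8 licenses: $(suggest_workers "$cores" 8)"
```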

When all the workers are started, Admin Center looks something like the following figure. If your workers are all idle and connected, your cluster is ready for use.

If you encounter any problems or failures, contact the MathWorks install support team.

For more information about Admin Center functionality, such as stopping processes or saving sessions, see Cluster Processes and Profiles.

Using the Command-Line Interface (Windows).  

  1. Start the mdce Service

    You must install the mdce service on all nodes (head node and worker nodes). Begin on the head node.

    1. Open a DOS command window with the necessary privileges:

      1. If you are using Windows 7 or Windows Vista, you must run the command window with administrator privileges. Click the Windows menu Start > (All) Programs > Accessories; then right-click Command Prompt, and select Run as Administrator. This option is available only if you are running User Account Control (UAC).

      2. If you are using Windows XP, open a DOS command window by selecting the Windows menu Start > Run, then in the Open field, type:

        cmd
        
    2. In the DOS command window, navigate to the folder with the control scripts:

      cd matlabroot\toolbox\distcomp\bin
      
    3. Install the mdce service by typing the command:

      mdce install
      
    4. Start the mdce service by typing the command:

      mdce start
      
    5. Repeat the instructions of this step on all worker nodes.

    As an alternative to items 3–5, you can install and start the mdce service on several nodes remotely from one machine by typing:

    cd matlabroot\toolbox\distcomp\bin
    remotemdce install -remotehost hostA,hostB,hostC . . .
    remotemdce start -remotehost hostA,hostB,hostC . . .
    

    where hostA,hostB,hostC refers to a list of your host names. Note that there are no spaces between host names, only a comma. If you need to indicate protocol, platform (such as in a mixed environment), or other information, see the help for remotemdce by typing:

    remotemdce -help
    

    Once installed, the mdce service starts running each time the machine reboots. The mdce service continues to run until explicitly stopped or uninstalled, regardless of whether an MJS or worker session is running.

  2. Start the MJS

    To start the MATLAB job scheduler (MJS), enter the following commands in a DOS command window. You do not have to be at the machine on which the MJS runs, as long as you have access to the MDCS installation.

    1. In your DOS command window, navigate to the folder with the startup scripts:

      cd matlabroot\toolbox\distcomp\bin
      
    2. Start the MJS, using any unique text you want for the name <MyMJS>:

      startjobmanager -name <MyMJS> -remotehost <MJS host name> -v
    3. Verify that the MJS is running on the intended host.

      nodestatus -remotehost <MJS host name>
      

        Note   If you are executing startjobmanager on the host where the MJS runs, you do not need to specify the -remotehost flag.

        If you have more than one MJS on your cluster, each must have a unique name.

  3. Start the Workers

      Note   Before you can start a worker on a machine, the mdce service must already be running on that machine, and the license manager for MATLAB Distributed Computing Server must be running on the network.

    For each node used as a worker, enter the following commands in a DOS command window. You do not have to be at the machines where the MATLAB workers will run, as long as you have access to the MDCS installation.

    1. Navigate to the folder with the startup scripts:

      cd matlabroot\toolbox\distcomp\bin
      
    2. Start the workers on each node, replacing <MyMJS> with the name of the MJS you want this worker registered with. Enter the command on a single line:

      startworker -jobmanagerhost <MJS host name>
          -jobmanager <MyMJS> -remotehost <worker host name> -v
      

      To run more than one worker session on the same node, give each worker a unique name by including the -name option on the startworker command, and run it for each worker on that node:

      startworker ... -name <worker1 name>
      startworker ... -name <worker2 name>
      
    3. Verify that the workers are running.

      nodestatus -remotehost <worker host name>
      
    4. Repeat items 2–3 for all worker nodes.

    For more information about mdce, MJS, and worker processes, such as how to shut them down or customize them, see MJS Cluster Customization.

Using the Command-Line Interface (UNIX).  

  1. Start the mdce Service

    On each cluster node, start the mdce service by typing the commands:

    cd matlabroot/toolbox/distcomp/bin
    ./mdce start

    Alternatively (on Linux, but not Macintosh), you can start the mdce service on several nodes remotely from one machine by typing

    cd matlabroot/toolbox/distcomp/bin
    ./remotemdce start -remotehost hostA,hostB,hostC . . .

    where hostA,hostB,hostC refers to a list of your host names. Note that there are no spaces between host names, only a comma. If you need to indicate protocol, platform (such as in a mixed environment), or other information, see the help for remotemdce by typing

    ./remotemdce -help
  2. Start the MJS

    To start the MATLAB job scheduler (MJS), enter the following commands. You do not have to be at the machine on which the MJS runs, as long as you have access to the MDCS installation.

    1. Navigate to the folder with the startup scripts:

      cd matlabroot/toolbox/distcomp/bin
      
    2. Start the MJS, using any unique text you want for the name <MyMJS>. Enter this text on a single line.

      ./startjobmanager -name <MyMJS> -remotehost <MJS host name> -v
      
    3. Verify that the MJS is running on the intended host:

      ./nodestatus -remotehost <MJS host name>
      

        Note   If you have more than one MJS on your cluster, each must have a unique name.

  3. Start the Workers

      Note   Before you can start a worker on a machine, the mdce service must already be running on that machine, and the license manager for MATLAB Distributed Computing Server must be running on the network.

    For each computer hosting a MATLAB worker, enter the following commands. You do not have to be at the machines where the MATLAB workers run, as long as you have access to the MDCS installation.

    1. Navigate to the folder with the startup scripts:

      cd matlabroot/toolbox/distcomp/bin
      
    2. Start the workers on each node, replacing <MyMJS> with the name of the MJS you want this worker registered with. Enter the command on a single line:

      ./startworker -jobmanagerhost <MJS host name>
         -jobmanager <MyMJS> -remotehost <worker host name> -v
      

      To run more than one worker session on the same machine, give each worker a unique name with the -name option:

      ./startworker ... -name <worker1>
      ./startworker ... -name <worker2>
      
    3. Verify that the workers are running. Repeat this command for each worker node:

      ./nodestatus -remotehost <worker host name>
      

    For more information about mdce, MJS, and worker processes, such as how to shut them down or customize them, see MJS Cluster Customization.
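
The UNIX command-line steps above can be collected into a small dry-run script that only prints the commands, so you can review them before running anything. This is a sketch under stated assumptions: join_hosts and print_worker_commands are hypothetical helpers, the host names node1 through node3 and the MJS name MyMJS are placeholders, and the worker naming pattern (host name plus a counter) is just one way to keep names unique.

```shell
#!/bin/sh
# Dry run: prints the commands from the steps above instead of executing them.

# Join arguments with commas and no spaces, the form used with -remotehost.
join_hosts() {
    list=""
    for host in "$@"; do
        if [ -z "$list" ]; then
            list="$host"
        else
            list="$list,$host"
        fi
    done
    printf '%s\n' "$list"
}

# Print one startworker command per worker, each with a unique -name.
# Usage: print_worker_commands <MJS host> <MJS name> <worker host> <count>
print_worker_commands() {
    mjs_host=$1
    mjs_name=$2
    worker_host=$3
    count=$4
    i=1
    while [ "$i" -le "$count" ]; do
        printf './startworker -jobmanagerhost %s -jobmanager %s -remotehost %s -name %s_worker%s -v\n' \
            "$mjs_host" "$mjs_name" "$worker_host" "$worker_host" "$i"
        i=$((i + 1))
    done
}

# Placeholder values (assumptions); substitute your own.
MJS_NAME=MyMJS
MJS_HOST=node1
WORKERS_PER_HOST=2

echo "./remotemdce start -remotehost $(join_hosts node1 node2 node3)"
echo "./startjobmanager -name $MJS_NAME -remotehost $MJS_HOST -v"
for h in node1 node2 node3; do
    print_worker_commands "$MJS_HOST" "$MJS_NAME" "$h" "$WORKERS_PER_HOST"
    echo "./nodestatus -remotehost $h"
done
```

Run the printed commands from matlabroot/toolbox/distcomp/bin once they match your cluster.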

Step 4: Install the mdce Service to Start Automatically at Boot Time (UNIX)

Although this step is not required, it is helpful in case of a system crash. Once configured for this, the mdce service starts running each time the machine reboots. The mdce service continues to run until explicitly stopped, regardless of whether an MJS or worker session is running.

You must have root privileges to do this step.

Choose your platform:

Debian, Fedora Platforms.  On each cluster node, register the mdce service as a known service and configure it to start automatically at system boot time by following these steps:

  1. Create the following link, if it does not already exist:

    ln -s matlabroot/toolbox/distcomp/bin/mdce /etc/mdce
    
  2. Create the following link to the boot script file:

    ln -s matlabroot/toolbox/distcomp/bin/mdce /etc/init.d/mdce
    
  3. Set the boot script file permissions:

    chmod 555 /etc/init.d/mdce
    
  4. Look in /etc/inittab for the default run level. Create a link in the rc folder associated with that run level. For example, if the run level is 5, execute these commands:

    cd /etc/rc5.d;
    ln -s ../init.d/mdce S99MDCE
    

SUSE Platform.  On each cluster node, register the mdce service as a known service and configure it to start automatically at system boot time by following these steps:

  1. Create the following link, if it does not already exist:

    ln -s matlabroot/toolbox/distcomp/bin/mdce /etc/mdce
    
  2. Create the following link to the boot script file:

    ln -s matlabroot/toolbox/distcomp/bin/mdce /etc/init.d/mdce
    
  3. Set the boot script file permissions:

    chmod 555 /etc/init.d/mdce
    
  4. Look in /etc/inittab for the default run level. Create a link in the rc folder associated with that run level. For example, if the run level is 5, execute these commands:

    cd /etc/init.d/rc5.d;
    ln -s ../mdce S99MDCE
    

Red Hat Platform (non-Fedora).  On each cluster node, register the mdce service as a known service and configure it to start automatically at system boot time by following these steps:

  1. Create the following link, if it does not already exist:

    ln -s matlabroot/toolbox/distcomp/bin/mdce /etc/mdce
    
  2. Create the following link to the boot script file:

    ln -s matlabroot/toolbox/distcomp/bin/mdce /etc/init.d/mdce
    
  3. Set boot script file permissions:

    chmod 555 /etc/init.d/mdce
    
  4. Look in /etc/inittab for the default run level. Create a link in the rc folder associated with that run level. For example, if the run level is 5, execute these commands:

    cd /etc/rc.d/rc5.d;
    ln -s ../../init.d/mdce S99MDCE
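
For the three Linux procedures above, the last step depends on the default run level and on the platform's rc folder layout. The following sketch shows one way to work both out on systems that still use /etc/inittab. It assumes the default run level appears in an entry of the form id:5:initdefault: and default_runlevel and rc_dir are hypothetical helpers. The example feeds in a sample line rather than reading the real file.

```shell
#!/bin/sh
# Print the default run level from inittab-style text on stdin.
# Skips comment lines and takes the first entry whose action field is "initdefault".
default_runlevel() {
    awk -F: '!/^[ \t]*#/ && $3 == "initdefault" { print $2; exit }'
}

# Print the rc folder for a platform family and run level,
# following the paths used in the steps above.
# Usage: rc_dir <debian|fedora|suse|redhat> <run level>
rc_dir() {
    case "$1" in
        debian|fedora) printf '/etc/rc%s.d\n' "$2" ;;
        suse)          printf '/etc/init.d/rc%s.d\n' "$2" ;;
        redhat)        printf '/etc/rc.d/rc%s.d\n' "$2" ;;
        *)             echo "unknown platform: $1" >&2; return 1 ;;
    esac
}

# Example with a sample entry; on a real node, use: default_runlevel < /etc/inittab
level=$(printf '# sample\nid:5:initdefault:\n' | default_runlevel)
echo "link would go in: $(rc_dir redhat "$level")/S99MDCE"
```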
    

Macintosh Platform.  On each cluster node, register the mdce service as a known service with launchd, and configure it to start automatically at system boot time by following these steps:

  1. Navigate to the toolbox folder and stop the running mdce service:

    cd matlabroot/toolbox/distcomp/bin
    sudo ./mdce stop
    
  2. Create the following link if it does not already exist:

    sudo ln -s matlabroot/toolbox/distcomp/bin/mdce /usr/sbin/mdce
    
  3. Copy the launchd .plist file for mdce to /Library/LaunchDaemons:

    sudo cp ./util/com.mathworks.mdce.plist /Library/LaunchDaemons
    
  4. Start mdce and observe that it starts inside launchd:

    sudo ./mdce start
    

    The command output should read:

    Starting the MATLAB Distributed Computing Server using launchctl.
    

Configure Windows Firewalls on Client

If you are using Windows firewalls on your client node,

  1. Log in as a user with administrative privileges.

  2. Execute the following in a DOS command window.

    matlabroot\toolbox\distcomp\bin\addMatlabToWindowsFirewall.bat
    

    This command adds MATLAB as an allowed program. If you are using other firewalls, you must configure them to allow MATLAB in a similar way.

Validate Installation with MJS

This procedure verifies that your parallel computing products are installed and configured correctly.

Step 1: Verify the Cluster Connection

To verify the network connection from the client computer to the MJS computer, follow these instructions.

    Note   In these instructions, matlabroot refers to the folder where MATLAB is installed on the client computer. Do not confuse it with the installation location on the MDCS cluster computers.

  1. On the client computer where Parallel Computing Toolbox™ is installed, open a DOS command window (for Windows software) or a shell (for UNIX® software) and go to the control script folder.

    cd matlabroot\toolbox\distcomp\bin (for Windows)
    cd matlabroot/toolbox/distcomp/bin (for UNIX)
    
  2. Run nodestatus to verify your cluster communications. Substitute <MJS Host> with the host name of your MJS computer.

    nodestatus -remotehost <MJS Host>
    

    If successful, you should see the status of your MJS (job manager) and its workers. Otherwise, refer to Troubleshoot Common Problems.

Step 2: Define a Cluster Profile

In this step you define a cluster profile to use in subsequent steps.

  1. On the MATLAB desktop Home tab, in the Environment area, select Parallel > Manage Cluster Profiles to start the Cluster Profile Manager.

  2. Create a new profile in the Cluster Profile Manager by selecting New > MATLAB Job Scheduler (MJS).

  3. With the new profile selected in the list, click Rename and edit the profile name to be MJStest. Press Enter.

  4. In the Properties tab, provide settings for the following fields:

    1. Set the Description field to For testing installation with MJS.

    2. Set the Host field to the name of the host on which your MJS is running. Depending on your network, this might be only a host name, or it might have to be a fully qualified domain name.

    3. Set the MJSName field to the name of your MJS, which you started earlier.

      So far, the dialog box should look like the following figure:

  5. Click Done to save your cluster profile.

Step 3: Validate the Cluster Profile

In this step you validate your cluster profile, and thereby your installation.

  1. If it is not already open, start the Cluster Profile Manager: on the MATLAB desktop Home tab, in the Environment area, select Parallel > Manage Cluster Profiles.

  2. Select your cluster profile in the listing.

  3. Click Validate.

The Validation Results tab shows the output. The following figure shows the results of a profile that passed all validation tests.

    Note   If your validation does not pass, contact the MathWorks install support team.

If your validation passed, you now have a valid profile that you can use in other parallel applications. You can make any modifications to your profile appropriate for your applications, such as NumWorkersRange, AttachedFiles, AdditionalPaths, etc. To save your profile for other users, select the profile and click Export, then save your profile to a file in a convenient location. Later, when running the Cluster Profile Manager, other users can import your profile by clicking Import.
