What do these scripts do?
These Bourne shell scripts allow you to control a MATLAB Distributed Computing
cluster (Version 2) easily. To use them, follow the steps below:
Step 1: Configure ssh
To use these scripts, first set up password-free ssh for the user
under which you have chosen to run the Distributed Computing Engine
daemons (mdce). There are numerous tutorials on how to do this on the
internet, but here are the steps in brief:
(1) Run ssh-keygen -t dsa to generate the key files. If you choose a
non-empty passphrase, you can use ssh-agent and ssh-add to make life easier.
(2) Copy the newly generated file id_dsa.pub to the .ssh directory of
the mdce user on the remote machine and rename it to "authorized_keys".
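Step (2) can be sketched as a small helper that appends the public key
to authorized_keys rather than replacing the file, so that any keys
already installed for the mdce user survive. This is only a sketch:
install_pubkey is a hypothetical name and is not part of this package,
and it assumes the public key file has already been copied to the
remote machine.

```shell
# install_pubkey PUBKEY_FILE SSH_DIR   (hypothetical helper, not in this package)
# Append the key in PUBKEY_FILE to SSH_DIR/authorized_keys unless an
# identical line is already present.
install_pubkey() {
    pubkey_file=$1
    ssh_dir=$2

    # sshd usually insists on restrictive permissions for these paths.
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1

    # Nothing to do if the exact key line is already installed.
    if [ -f "$ssh_dir/authorized_keys" ] &&
       grep -qxF -f "$pubkey_file" "$ssh_dir/authorized_keys"; then
        return 0
    fi

    cat "$pubkey_file" >> "$ssh_dir/authorized_keys" &&
        chmod 600 "$ssh_dir/authorized_keys"
}
```

On the remote machine this might be called as
install_pubkey id_dsa.pub "$HOME/.ssh" while logged in as the mdce user.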
Step 2: Edit the configuration files
Then, edit the following files:
More detail for each in turn:
This must contain the root directory of the MATLAB installation on
each cluster node. To use these scripts, it must be the same on all
nodes. For example:
$ cat matlabroot
This must contain the string "synchronous" or the string
"asynchronous". In synchronous mode, the scripts execute commands one
at a time, as in a queue, with each starting only when the previous
one has returned. In asynchronous mode, the scripts start all the
commands in the background so that they run simultaneously, and then
wait until they have all finished before returning.
Asynchronous mode is faster, but it is easier to see what the scripts
are doing when they run in synchronous mode.
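The difference between the two modes can be illustrated with a short
Bourne shell sketch. It is not the code the scripts actually use;
run_all is a hypothetical helper that runs one command per host, either
one after another or all in the background followed by a single wait.

```shell
# run_all MODE CMD HOST...   (hypothetical helper, for illustration only)
# MODE is "synchronous" or "asynchronous"; CMD is called once per HOST.
run_all() {
    mode=$1
    cmd=$2
    shift 2

    for host in "$@"; do
        if [ "$mode" = "asynchronous" ]; then
            "$cmd" "$host" &    # start in the background and move on
        else
            "$cmd" "$host"      # block until this host has finished
        fi
    done

    # In asynchronous mode, return only once every background job is done.
    # In synchronous mode there are no background jobs, so this is a no-op.
    wait
}
```

For example, run_all synchronous echo machine1 machine2 prints the two
names in order, whereas in asynchronous mode the order of the output is
not guaranteed.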
This is the main configuration file for the cluster. It is easiest to
see what is going on with an example:
$ cat hosts
MDCE hosts    Job Managers    Workers
machine1      jm1             jm1,jm1
machine2      -               -
machine3      jm2,jm3         -
machine4      -               jm2
machine5      -               jm2,jm3
The first column consists of the list of machines which form the
cluster.
The second column consists of job manager names. In the above
example, machine1 will run a job manager called jm1 and machine3 will
run two job managers, called jm2 and jm3. The other machines run no
job managers.
The third column defines which workers will run, where they run and
to which job managers they will attach. In the example above, two
worker processes will run on machine1 and attach to the job manager
called jm1. On machine4, one worker will run, attached to jm2, and
machine5 will run two workers, one attached to jm2 and the other
attached to jm3. Nothing except the mdce itself will run on machine2.
Any whitespace can separate the columns but an empty entry must be a
hyphen "-" as in the example.
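To make the format concrete, here is a minimal sketch of how the third
column could be read in Bourne shell. It is not taken from the scripts
themselves: list_workers is a hypothetical helper, and it assumes that
the first line of the hosts file is a header, as in the example above.

```shell
# list_workers HOSTS_FILE   (hypothetical helper, for illustration only)
# Print one "host jobmanager" line for every worker defined in HOSTS_FILE.
list_workers() {
    # Drop the header line (an assumption), then split on whitespace.
    sed 1d "$1" | while read -r host jobmanagers workers; do
        [ -z "$host" ] && continue          # skip blank lines
        [ "$workers" = "-" ] && continue    # a hyphen means no workers here

        # The workers column is a comma-separated list of job manager
        # names, one entry per worker process to start on this host.
        for jm in $(printf '%s' "$workers" | tr ',' ' '); do
            printf '%s %s\n' "$host" "$jm"
        done
    done
}
```

Run against the example hosts file, this prints five lines: two for
machine1 (both attached to jm1), one for machine4 (jm2) and two for
machine5 (jm2 and jm3).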
Step 3: Run the commands
In the directory <matlabroot>/toolbox/distcomp/bin you will find the
In this package, you will find a "distributed" version of each of
these, prefixed with a "d". Namely:
Each of these commands accepts the same command-line arguments as
its non-distributed counterpart, with the exception of those that are
defined by the hosts file. These are:
For dstartjobmanagers.sh and dstopjobmanagers.sh, -name and -remotehost.
For dstartworkers.sh and dstopworkers.sh, -name, -jobmanager,
-jobmanagerhost and -remotehost.
To bring up a cluster typical usage might be as follows (output suppressed):
and to take down a cluster:
although just the last line would do the trick. The only other command in
the directory is "dssh" which loops over the list of hosts and runs a
command using ssh on each. The dmdce command uses this to run mdce on
remote hosts, as did the other commands before the -remotehost option
became available. A nice way of checking that your ssh is configured
properly is to run something like:
Another use of ./dssh would be to blow away all the checkpoint
history, usually in /var/lib/mdce:
$ ./dssh rm -rf /var/lib/mdce
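The loop that dssh performs can be pictured roughly as follows. This is
a sketch under stated assumptions, not the actual dssh script:
dssh_sketch is a hypothetical name, the hosts file is assumed to start
with a header line, and the SSH variable is an invented hook that lets
you substitute another program (such as echo) for a dry run.

```shell
# dssh_sketch HOSTS_FILE COMMAND [ARG...]   (hypothetical, not the real dssh)
# Run COMMAND over ssh on every machine in the first column of HOSTS_FILE.
dssh_sketch() {
    hosts_file=$1
    shift

    # Skip the header line (an assumption) and any blank lines.
    for host in $(awk 'NR > 1 && NF { print $1 }' "$hosts_file"); do
        echo "== $host =="
        # SSH is a hypothetical override for dry runs; it defaults to ssh.
        # stdin is redirected so the remote command cannot consume input.
        ${SSH:-ssh} "$host" "$@" < /dev/null
    done
}
```

Setting SSH=echo before calling dssh_sketch hosts hostname prints the
command that would be run on each machine instead of running it, which
is a cheap check before trying anything destructive.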
Version 1.1, 2006-04-27
Please send bug reports to email@example.com
Jos Martin (2021). Cluster setup scripts (https://www.mathworks.com/matlabcentral/fileexchange/10895-cluster-setup-scripts), MATLAB Central File Exchange.