Parallel Computing Toolbox software provides a generic interface that lets you interact with third-party schedulers, or use your own scripts for distributing tasks to other nodes on the cluster for evaluation.
Because each job in your application consists of several tasks, the purpose of your scheduler is to allocate a cluster node for the evaluation of each task, or to distribute each task to a cluster node. The scheduler starts remote MATLAB worker sessions on the cluster nodes to evaluate individual tasks of the job. To evaluate its task, a MATLAB worker session needs access to certain information, such as where to find the job and task data. The generic scheduler interface provides a means of getting tasks from your Parallel Computing Toolbox client session to your scheduler and thereby to your cluster nodes.
To evaluate a task, a worker requires five parameters that you must pass from the client to the worker. The parameters can be passed any way you want to transfer them, but because a particular one must be an environment variable, the examples in this section pass all parameters as environment variables.

Note Whereas a MathWorks job manager keeps MATLAB workers running between tasks, a third-party scheduler runs MATLAB workers for only as long as it takes each worker to evaluate its one task.
When you submit a job to a scheduler, the function identified by the scheduler object's SubmitFcn property executes in the MATLAB client session. You set the scheduler's SubmitFcn property to identify the submit function and any arguments you might want to send to it. For example, to use a submit function called mysubmitfunc, you set the property with the command
set(sched, 'SubmitFcn', @mysubmitfunc)
where sched is the scheduler object in the client session, created with the findResource function. In this case, the submit function gets called with its three default arguments: scheduler, job, and properties object, in that order. The function declaration line of the function might look like this:
function mysubmitfunc(scheduler, job, props)
Inside the function of this example, the three argument objects are known as scheduler, job, and props.
You can write a submit function that accepts more than the three default arguments, and then pass those extra arguments by including them in the definition of the SubmitFcn property.
time_limit = 300
testlocation = 'Plant30'
set(sched, 'SubmitFcn', {@mysubmitfunc, time_limit, testlocation})
In this example, the submit function requires five arguments: the three defaults, along with the numeric value of time_limit and the string value of testlocation. The function's declaration line might look like this:
function mysubmitfunc(scheduler, job, props, localtimeout, plant)
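To see how the extra arguments line up with the declaration, the following sketch mimics the call in plain MATLAB. It assumes the toolbox appends the extra cell-array elements after the three default arguments, as described above. The placeholder strings and the handle reportFcn are hypothetical stand-ins for the real objects and submit function.

```matlab
% Extra arguments as they would be stored in the SubmitFcn property.
time_limit = 300;
testlocation = 'Plant30';

% Hypothetical stand-in for mysubmitfunc: it only reports what it received.
reportFcn = @(scheduler, job, props, localtimeout, plant) ...
    sprintf('%s: timeout %d s', plant, localtimeout);
submitSpec = {reportFcn, time_limit, testlocation};

% Stand-ins for the three default arguments (real code receives objects).
defaults = {'schedulerObj', 'jobObj', 'propsObj'};

% Defaults first, then the extras in the order they appear in the cell array.
summary = feval(submitSpec{1}, defaults{:}, submitSpec{2:end});
disp(summary)
```

The order of the extra values in the cell array determines which declared argument receives them, so time_limit arrives as localtimeout and testlocation as plant.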
The following discussion focuses primarily on the minimum requirements of the submit and decode functions.
This submit function has three main purposes:
To identify the decode function that MATLAB workers run when they start
To make information about job and task data locations available to the workers via their decode function
To instruct your scheduler how to start a MATLAB worker on the cluster for each task of your job

The client's submit function and the worker's decode function work together as a pair. Therefore, the submit function must identify its corresponding decode function. The submit function does this by setting the environment variable MDCE_DECODE_FUNCTION. The value of this variable is a string identifying the name of the decode function on the path of the MATLAB worker. Neither the decode function itself nor its name can be passed to the worker in a job or task property; the file must already exist before the worker starts. For more information on the decode function, see MATLAB Worker Decode Function.
The third input argument (after scheduler and job) to the submit function is the object with the properties listed in the following table.
You do not set the values of any of these properties. They are automatically set by the toolbox so that you can program your submit function to forward them to the worker nodes.
Property Name | Description |
|---|---|
StorageConstructor | String. Used internally to indicate that a file system is used to contain job and task data. |
StorageLocation | String. Derived from the scheduler DataLocation property. |
JobLocation | String. Indicates where this job's data is stored. |
TaskLocations | Cell array. Indicates where each task's data is stored. Each element of this array is passed to a separate worker. |
NumberOfTasks | Double. Indicates the number of tasks in the job. You do not need to pass this value to the worker, but you can use it within your submit function. |
With these values passed into your submit function, the function can pass them to the worker nodes by any of several means. However, because the name of the decode function must be passed as an environment variable, the examples that follow pass all the other necessary property values also as environment variables.
The submit function writes the values of these object properties out to environment variables with the setenv function.
The submit function must define the command necessary for your scheduler to start MATLAB workers. The actual command is specific to your scheduler and network configuration. The commands for some popular schedulers are listed in the following table. This table also indicates whether or not the scheduler automatically passes environment variables with its submission. If not, your command to the scheduler must accommodate these variables.
Scheduler | Scheduler Command | Passes Environment Variables |
|---|---|---|
Condor® | condor_submit | Not by default. Command can pass all or specific variables. |
LSF | bsub | Yes, by default. |
PBS | qsub | Command must specify which variables to pass. |
Sun™ Grid Engine | qsub | Command must specify which variables to pass. |
Your submit function might also use some of these properties and others when constructing and invoking your scheduler command. scheduler, job, and props (so named only for this example) refer to the first three arguments to the submit function.
Argument Object | Property |
|---|---|
scheduler | |
scheduler | |
job | |
job | |
props | NumberOfTasks |
The submit function in this example uses environment variables to pass the necessary information to the worker nodes. Each step below indicates the lines of code you add to your submit function.
Create the function declaration. There are three objects automatically passed into the submit function as its first three input arguments: the scheduler object, the job object, and the props object.
function mysubmitfunc(scheduler, job, props)
This example function uses only the three default arguments. You can have additional arguments passed into your submit function, as discussed in MATLAB Client Submit Function.
Identify the values you want to send to your environment variables. For convenience, you define local variables for use in this function.
decodeFcn = 'mydecodefunc';
jobLocation = get(props, 'JobLocation');
taskLocations = get(props, 'TaskLocations'); %This is a cell array
storageLocation = get(props, 'StorageLocation');
storageConstructor = get(props, 'StorageConstructor');
The name of the decode function that must be available on the MATLAB worker path is mydecodefunc.
Set the environment variables, other than the task locations. All the MATLAB workers use these values when evaluating tasks of the job.
setenv('MDCE_DECODE_FUNCTION', decodeFcn);
setenv('MDCE_JOB_LOCATION', jobLocation);
setenv('MDCE_STORAGE_LOCATION', storageLocation);
setenv('MDCE_STORAGE_CONSTRUCTOR', storageConstructor);
Your submit function can use any names you choose for the environment variables, with the exception of MDCE_DECODE_FUNCTION; the MATLAB worker looks for its decode function identified by this variable. If you use alternative names for the other environment variables, be sure that the corresponding decode function also uses your alternative variable names.
Set the task-specific variables and scheduler commands. This is where you instruct your scheduler to start MATLAB workers for each task.
for i = 1:props.NumberOfTasks
setenv('MDCE_TASK_LOCATION', taskLocations{i});
constructSchedulerCommand;
end
The line constructSchedulerCommand represents the code you write to construct and execute your scheduler's submit command. This command is typically a string that combines the scheduler command with necessary flags, arguments, and values derived from the values of your object properties. This command is inside the for-loop so that your scheduler gets a command to start a MATLAB worker on the cluster for each task.
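As an illustration of what constructSchedulerCommand might expand to, here is a minimal sketch for a PBS-style qsub, which must be told which variables to pass. The -v option of qsub takes a comma-separated variable list. The wrapper script name workerWrapper.sh and the task locations are hypothetical, and the sketch only builds the command strings rather than calling system.

```matlab
% Hypothetical task locations standing in for props.TaskLocations.
taskLocations = {'Job1/Task1', 'Job1/Task2', 'Job1/Task3'};

% Variables the worker needs; names match those used in this example.
envNames = {'MDCE_DECODE_FUNCTION', 'MDCE_STORAGE_CONSTRUCTOR', ...
            'MDCE_STORAGE_LOCATION', 'MDCE_JOB_LOCATION', 'MDCE_TASK_LOCATION'};
varList = sprintf('%s,', envNames{:});
varList = varList(1:end-1);            % drop the trailing comma

commands = cell(size(taskLocations));
for i = 1:numel(taskLocations)
    % The value is picked up from the client environment at submission time.
    setenv('MDCE_TASK_LOCATION', taskLocations{i});
    % workerWrapper.sh is an assumed script that starts the MATLAB worker.
    commands{i} = sprintf('qsub -v %s workerWrapper.sh', varList);
    % A real submit function would now run: [status, out] = system(commands{i});
end
disp(commands{1})
```

With a scheduler that forwards the environment by default, such as LSF, the variable list would not be needed.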
The sole purpose of the MATLAB worker's decode function is to read certain job and task information into the MATLAB worker session. This information could be stored in disk files on the network, or it could be available as environment variables on the worker node. Because the discussion of the submit function illustrated only the usage of environment variables, so does this discussion of the decode function.
When working with the decode function, you must be aware of the
Name and location of the decode function itself
Names of the environment variables this function must read

The client's submit function and the worker's decode function work together as a pair. For more information on the submit function, see MATLAB Client Submit Function. The decode function on the worker is identified by the submit function as the value of the environment variable MDCE_DECODE_FUNCTION. The environment variable must be copied from the client node to the worker node. Your scheduler might perform this task for you automatically; if it does not, you must arrange for this copying.
The value of the environment variable MDCE_DECODE_FUNCTION defines the filename of the decode function, but not its location. The file cannot be passed as part of the job PathDependencies or FileDependencies property, because the function runs in the MATLAB worker before that session has access to the job. Therefore, the file location must be available to the MATLAB worker as that worker starts.
You can get the decode function on the worker's path by either moving the file into a directory on the path (for example, matlabroot/toolbox/local), or by having the scheduler use cd in its command so that it starts the MATLAB worker from within the directory that contains the decode function.
In practice, the decode function might be identical for all workers on the cluster. In this case, all workers can use the same decode function file if it is accessible on a shared drive.
When a MATLAB worker starts, it automatically runs the file identified by the MDCE_DECODE_FUNCTION environment variable. This decode function runs before the worker does any processing of its task.
When the environment variables have been transferred from the client to the worker nodes (either by the scheduler or some other means), the decode function of the MATLAB worker can read them with the getenv function.
With those values from the environment variables, the decode function must set the appropriate property values of the object that is its argument. The property values that must be set are the same as those in the corresponding submit function, except that instead of the cell array TaskLocations, each worker has only the individual string TaskLocation, which is one element of the TaskLocations cell array. Therefore, the properties you must set within the decode function on its argument object are as follows:
StorageConstructor
StorageLocation
JobLocation
TaskLocation
The decode function must read four environment variables and use their values to set the properties of the object that is the function's output.
In this example, the decode function's argument is the object props.
function props = mydecodefunc(props)
% Read the environment variables:
storageConstructor = getenv('MDCE_STORAGE_CONSTRUCTOR');
storageLocation = getenv('MDCE_STORAGE_LOCATION');
jobLocation = getenv('MDCE_JOB_LOCATION');
taskLocation = getenv('MDCE_TASK_LOCATION');
%
% Set props object properties from the local variables:
set(props, 'StorageConstructor', storageConstructor);
set(props, 'StorageLocation', storageLocation);
set(props, 'JobLocation', jobLocation);
set(props, 'TaskLocation', taskLocation);
When the object is returned from the decode function to the MATLAB worker session, its values are used internally for managing job and task data.
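The pairing between the two functions can be checked locally without a cluster. The sketch below sets the variables as the submit function would, then reads them back as the decode function would, using a plain struct as a hypothetical stand-in for the props object. The location values are made up for illustration.

```matlab
% Client side: what the submit function would export (values are made up).
setenv('MDCE_STORAGE_CONSTRUCTOR', 'exampleConstructor');
setenv('MDCE_STORAGE_LOCATION', '/share/scratch/jobdata');
setenv('MDCE_JOB_LOCATION', 'Job1');
setenv('MDCE_TASK_LOCATION', 'Job1/Task3');

% Worker side: a struct stands in for the props object (assumption).
props = struct();
props.StorageConstructor = getenv('MDCE_STORAGE_CONSTRUCTOR');
props.StorageLocation    = getenv('MDCE_STORAGE_LOCATION');
props.JobLocation        = getenv('MDCE_JOB_LOCATION');
props.TaskLocation       = getenv('MDCE_TASK_LOCATION');

% getenv returns an empty string for a variable that was never transferred.
missing = fieldnames(props);
missing = missing(cellfun(@(f) isempty(props.(f)), missing));
if ~isempty(missing)
    error('Missing environment values for: %s', sprintf('%s ', missing{:}));
end
disp(props)
```

A check like the one at the end can help when a scheduler silently drops variables, because the worker would otherwise start with empty locations.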
You use the findResource function to create an object representing the scheduler in your local MATLAB client session.
You can specify 'generic' as the name for findResource to search for. (Any scheduler name starting with the string 'generic' creates a generic scheduler object.)
sched = findResource('scheduler', 'type', 'generic')
Generic schedulers must use a shared file system for workers to access job and task data. Set the DataLocation and HasSharedFilesystem properties to specify where the job data is stored and that the workers should access job data directly in a shared file system.
set(sched, 'DataLocation', '\\share\scratch\jobdata')
set(sched, 'HasSharedFilesystem', true)
Note All nodes require access to the directory specified in the scheduler object's DataLocation property. See the DataLocation reference page for information on setting this property for a mixed-platform environment.
If DataLocation is not set, the default location for job data is the current working directory of the MATLAB client the first time you use findResource to create an object for this type of scheduler, which might not be accessible to the worker nodes.
If MATLAB is not on the worker's system path, set the ClusterMatlabRoot property to specify where the workers are to find the MATLAB installation.
set(sched, 'ClusterMatlabRoot', '\\apps\matlab\')
You can look at all the property settings on the scheduler object. If no jobs are in the DataLocation directory, the Jobs property is a 0-by-1 array. All settable property values on a scheduler object are local to the MATLAB client, and are lost when you close the client session or when you remove the object from the client workspace with delete or clear all.
get(sched)
Configuration: ''
Type: 'generic'
DataLocation: '\\share\scratch\jobdata'
HasSharedFilesystem: 1
Jobs: [0x1 double]
ClusterMatlabRoot: '\\apps\matlab\'
ClusterOsType: 'pc'
UserData: []
ClusterSize: Inf
MatlabCommandToRun: 'worker'
SubmitFcn: []
ParallelSubmitFcn: []
You must set the SubmitFcn property to specify the submit function for this scheduler.
set(sched, 'SubmitFcn', @mysubmitfunc)
With the scheduler object and the user-defined submit and decode functions defined, programming and running a job is now similar to doing so with a job manager or any other type of scheduler.
You create a job with the createJob function, which creates a job object in the client session. The job data is stored in the directory specified by the scheduler object's DataLocation property.
j = createJob(sched)
This statement creates the job object j in the client session. Use get to see the properties of this job object.
get(j)
Configuration: ''
Name: 'Job1'
ID: 1
UserName: 'neo'
Tag: ''
State: 'pending'
CreateTime: 'Fri Jan 20 16:15:47 EDT 2006'
SubmitTime: ''
StartTime: ''
FinishTime: ''
Tasks: [0x1 double]
FileDependencies: {0x1 cell}
PathDependencies: {0x1 cell}
JobData: []
Parent: [1x1 distcomp.genericscheduler]
UserData: []
This generic scheduler job has somewhat different properties than a job that uses a job manager. For example, this job has no callback functions.
The job's State property is pending. This state means the job has not been queued for running yet. This new job has no tasks, so its Tasks property is a 0-by-1 array.
The scheduler's Jobs property is now a 1-by-1 array of distcomp.simplejob objects, indicating the existence of your job.
get(sched)
Configuration: ''
Type: 'generic'
DataLocation: '\\share\scratch\jobdata'
HasSharedFilesystem: 1
Jobs: [1x1 distcomp.simplejob]
ClusterMatlabRoot: '\\apps\matlab\'
ClusterOsType: 'pc'
UserData: []
ClusterSize: Inf
MatlabCommandToRun: 'worker'
SubmitFcn: @mysubmitfunc
ParallelSubmitFcn: []
After you have created your job, you can create tasks for the job. Tasks define the functions to be evaluated by the workers during the running of the job. Often, the tasks of a job are identical except for different arguments or data. In this example, each task generates a 3-by-3 matrix of random numbers.
createTask(j, @rand, 1, {3,3});
createTask(j, @rand, 1, {3,3});
createTask(j, @rand, 1, {3,3});
createTask(j, @rand, 1, {3,3});
createTask(j, @rand, 1, {3,3});
The Tasks property of j is now a 5-by-1 matrix of task objects.
get(j,'Tasks')
ans =
distcomp.simpletask: 5-by-1
Alternatively, you can create the five tasks with one call to createTask by providing a cell array of five cell arrays defining the input arguments to each task.
T = createTask(j, @rand, 1, {{3,3} {3,3} {3,3} {3,3} {3,3}});
In this case, T is a 5-by-1 matrix of task objects.
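When every task takes the same inputs, the cell array of cell arrays does not need to be typed out. A small sketch in base MATLAB, where numTasks and taskArgs are names introduced here for illustration:

```matlab
numTasks = 5;
% One inner cell array {3,3} per task, repeated numTasks times.
taskArgs = repmat({{3,3}}, 1, numTasks);
% taskArgs can then be passed as the last argument to createTask, for example
% T = createTask(j, @rand, 1, taskArgs);
disp(numel(taskArgs))
```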
To run your job and have its tasks evaluated, you submit the job to the scheduler's job queue.
submit(j)
The scheduler distributes the tasks of j to MATLAB workers for evaluation.
The job runs asynchronously. If you need to wait for it to complete before you continue in your MATLAB client session, you can use the waitForState function.
waitForState(j)
The default state to wait for is finished or failed. This function pauses MATLAB until the State property of j is 'finished' or 'failed'.
The results of each task's evaluation are stored in that task object's OutputArguments property as a cell array. Use getAllOutputArguments to retrieve the results from all the tasks in the job.
results = getAllOutputArguments(j);
Display the results from each task.
results{1:5}
0.9501 0.4860 0.4565
0.2311 0.8913 0.0185
0.6068 0.7621 0.8214
0.4447 0.9218 0.4057
0.6154 0.7382 0.9355
0.7919 0.1763 0.9169
0.4103 0.3529 0.1389
0.8936 0.8132 0.2028
0.0579 0.0099 0.1987
0.6038 0.0153 0.9318
0.2722 0.7468 0.4660
0.1988 0.4451 0.4186
0.8462 0.6721 0.6813
0.5252 0.8381 0.3795
0.2026 0.0196 0.8318
There are several submit and decode functions provided with the toolbox for your use with the generic scheduler interface. These files are in the directory
matlabroot/toolbox/distcomp/examples/integration
In this directory are subdirectories for each of several types of scheduler, containing wrappers, submit functions, and decode functions for distributed and parallel jobs. For example, the directory matlabroot/toolbox/distcomp/examples/integration/pbs contains the following files for use with a PBS scheduler:
| Filename | Description |
|---|---|
| pbsSubmitFcn.m | Submit function for a distributed job |
| pbsDecodeFunc.m | Decode function for a distributed job |
| pbsParallelSubmitFcn.m | Submit function for a parallel job |
| pbsParallelDecode.m | Decode function for a parallel job |
| pbsWrapper.sh | Script that is submitted to PBS to start workers that evaluate the tasks of a distributed job |
| pbsParallelWrapper.sh | Script that is submitted to PBS to start labs that evaluate the tasks of a parallel job |
Depending on your network and cluster configuration, you might need to modify these files before they will work in your situation. Ask your system administrator for help.
At the time of publication, there are directories for PBS schedulers (pbs), Platform LSF schedulers (lsf), generic UNIX-based scripts (ssh), Sun Grid Engine (sge), and mpiexec on Microsoft Windows operating systems (winmpiexec). In addition, the pbs and lsf directories have subdirectories called nonshared, which contain scripts for use when there is a nonshared file system between the client and cluster computers. Each of these subdirectories contains a file called README, which provides instruction on how to use its scripts.
As more files or solutions might become available at any time, visit the support page for this product on the MathWorks Web site at http://www.mathworks.com/support/product/product.html?product=DM. This page also provides contact information in case you have any questions.
While you can use the get, cancel, and destroy methods on jobs that use the generic scheduler interface, by default these methods access or affect only the job data where it is stored on disk. To cancel or destroy a job or task that is currently running or queued, you must provide instructions to the scheduler directing it what to do and when to do it. To accomplish this, the toolbox provides a means of saving data associated with each job or task from the scheduler, and a set of properties to define instructions for the scheduler upon each cancel or destroy request.
The first requirement for job management is to identify the job from the scheduler's perspective. When you submit a job to the scheduler, the command to do the submission in your submit function can return from the scheduler some data about the job. This data typically includes a job ID. By storing that job ID with the job, you can later refer to the job by this ID when you send management commands to the scheduler. Similarly, you can store information, such as an ID, for each task. The toolbox function that stores this scheduler data is setJobSchedulerData.
If your scheduler accommodates submission of entire jobs (collection of tasks) in a single command, you might get back data for the whole job and/or for each task. Part of your submit function might be structured like this:
for ii = 1:props.NumberOfTasks
define scheduler command per task
end
submit job to scheduler
data_array = parse data returned from scheduler %possibly NumberOfTasks-by-2 matrix
setJobSchedulerData(scheduler, job, data_array)
If your scheduler accepts only submissions of individual tasks, you might get return data pertaining only to each individual task. In this case, your submit function might have code structured like this:
for ii = 1:props.NumberOfTasks
submit task to scheduler
%Per-task settings:
data_array(1,ii) = ... parse string returned from scheduler
data_array(2,ii) = ... save ID returned from scheduler
etc
end
setJobSchedulerData(scheduler, job, data_array)
With the scheduler data (such as the scheduler's ID for the job or task) now stored on disk along with the rest of the job data, you can write code to control what the scheduler should do when that particular job or task is canceled or destroyed.
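How the returned data is parsed depends entirely on the scheduler. As one hedged illustration, PBS-style schedulers typically print an identifier of the form number.server when a submission succeeds, and the sketch below extracts the numeric part from such strings. The response strings are made up; a real submit function would capture them from its system call.

```matlab
% Made-up responses, one per task, as a scheduler might print them.
responses = {sprintf('4021.headnode\n'), sprintf('4022.headnode\n')};

data_array = zeros(1, numel(responses));
for ii = 1:numel(responses)
    % Capture the leading digits of the identifier.
    tok = regexp(responses{ii}, '^(\d+)', 'tokens', 'once');
    if isempty(tok)
        error('Could not find a job ID in: %s', responses{ii});
    end
    data_array(ii) = str2double(tok{1});
end
% data_array is what would be handed to
% setJobSchedulerData(scheduler, job, data_array).
disp(data_array)
```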
For example, you might create these four functions:
myCancelJob.m
myDestroyJob.m
myCancelTask.m
myDestroyTask.m
Your myCancelJob.m function defines what you want to communicate to your scheduler in the event that you use the cancel function on your job from the MATLAB client. The toolbox takes care of the job state and any data management with the job data on disk, so your myCancelJob.m function needs to deal only with the part of the job currently running or queued with the scheduler. The toolbox function that retrieves scheduler data from the job is getJobSchedulerData. Your cancel function might be structured something like this:
function myCancelJob(sched, job)
array_data = getJobSchedulerData(sched, job)
job_id = array_data(...) % Extract the ID from the data, depending on how
% it was stored in the submit function above.
command to scheduler canceling job job_id
In a similar way, you can define what to do for destroying a job, and what to do for canceling and destroying tasks.
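For a scheduler with a command-line delete tool, the body of such a function largely reduces to building one command per stored ID. The sketch assumes the IDs were stored as a numeric row vector and uses qdel, the job-deletion command of PBS and Sun Grid Engine. The ID values are hypothetical, and the commands are only built, not executed.

```matlab
% Hypothetical IDs, as they might come back from getJobSchedulerData.
array_data = [4021 4022 4023];

cancelCommands = cell(1, numel(array_data));
for ii = 1:numel(array_data)
    cancelCommands{ii} = sprintf('qdel %d', array_data(ii));
    % A real cancel function would run:
    % [status, out] = system(cancelCommands{ii});
end
disp(cancelCommands{1})
```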
After your functions are written, you set the appropriate properties of the scheduler object with handles to your functions. The corresponding scheduler properties are:
CancelJobFcn
DestroyJobFcn
CancelTaskFcn
DestroyTaskFcn
You can set the properties in the Configurations Manager for your scheduler, or on the command line:
schdlr = findResource('scheduler', 'type', 'generic');
% set required properties
set(schdlr, 'CancelJobFcn', @myCancelJob)
set(schdlr, 'DestroyJobFcn', @myDestroyJob)
set(schdlr, 'CancelTaskFcn', @myCancelTask)
set(schdlr, 'DestroyTaskFcn', @myDestroyTask)
Continue with job creation and submission as usual.
j1 = createJob(schdlr);
for ii = 1:n
t(ii) = createTask(j1,...)
end
submit(j1)
While it is running or queued, you can cancel or destroy the job or a task.
This command cancels the task and moves it to the finished state, and triggers execution of myCancelTask, which sends the appropriate commands to the scheduler:
cancel(t(4))
This command deletes job data for j1, and triggers execution of myDestroyJob, which sends the appropriate commands to the scheduler:
destroy(j1)
When using a third-party scheduler, it is possible that the scheduler itself can have more up-to-date information about your jobs than what is available to the toolbox from the job storage location. To retrieve that information from the scheduler, you can write a function to do that, and set the value of the GetJobStateFcn property as a handle to your function.
Whenever you use a toolbox function such as get, waitForState, etc., that accesses the state of a job on the generic scheduler, after retrieving the state from storage, the toolbox runs the function specified by the GetJobStateFcn property, and returns its result in place of the stored state. The function you write for this purpose must return a valid string value for the State of a job object.
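A function written for GetJobStateFcn mostly has to translate the scheduler's own status into one of the toolbox's State strings. The sketch below shows only that translation step. The single-letter codes follow the convention of PBS qstat output (Q, R, C) and are an assumption about your scheduler, and falling back to the stored state for an unknown code is a design choice made here, not documented toolbox behavior.

```matlab
% Assumed mapping from qstat-style status codes to job State strings.
stateMap = containers.Map({'Q', 'R', 'C'}, {'queued', 'running', 'finished'});

storedState = 'queued';      % state read from the job storage location
schedulerCode = 'R';         % hypothetical code reported by the scheduler

if isKey(stateMap, schedulerCode)
    state = stateMap(schedulerCode);
else
    state = storedState;     % unknown code: keep what storage reports
end
disp(state)
```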
The following list summarizes the sequence of events that occur when running a job that uses the generic scheduler interface:
Provide a submit function and a decode function. Be sure the decode function is on all the MATLAB workers' paths.
The following steps occur in the MATLAB client session:
Define the SubmitFcn property of your scheduler object to point to the submit function.
Send your job to the scheduler.
submit(job)
The submit function sets environment variables with values derived from its arguments.
The submit function makes calls to the scheduler — generally, a call for each task (with environment variables identified explicitly, if necessary).
The following step occurs in your network:
For each task, the scheduler starts a MATLAB worker session on a cluster node.
The following steps occur in each MATLAB worker session:
The MATLAB worker automatically runs the decode function, finding it on the path.
The decode function reads the pertinent environment variables.
The decode function sets the properties of its argument object with values from the environment variables.
The MATLAB worker uses these object property values in processing its task without your further intervention.