Parallel Computing Toolbox™
If your network already uses Platform LSF® (Load Sharing Facility), Microsoft® Windows® Compute Cluster Server (CCS), PBS Pro®, or a TORQUE scheduler, you can use Parallel Computing Toolbox™ software to create jobs to be distributed by your existing scheduler. This section provides instructions for using your scheduler.
This section details the steps of a typical programming session with Parallel Computing Toolbox software for jobs distributed to workers by a fully supported third-party scheduler.
This section assumes you have an LSF®, PBS Pro, TORQUE, or CCS scheduler installed and running on your network. For more information about LSF, see http://www.platform.com/Products/. For more information about CCS, see http://www.microsoft.com/windowsserver2003/ccs/default.mspx.
The following sections illustrate how to program Parallel Computing Toolbox software to use these schedulers:
You use the findResource function to identify the type of scheduler and to create an object representing the scheduler in your local MATLAB® client session.
You specify the scheduler type for findResource to search for with one of the following:
sched = findResource('scheduler','type','lsf')
sched = findResource('scheduler','type','pbspro')
sched = findResource('scheduler','type','torque')
You set properties on the scheduler object to specify
Where the job data is stored
That the workers should access job data directly in a shared file system
The MATLAB root for the workers to use
set(sched, 'DataLocation', '\\apps\data\project_55')
set(sched, 'HasSharedFilesystem', true)
set(sched, 'ClusterMatlabRoot', '\\apps\matlab\')
Alternatively, you can use a parallel configuration to find the scheduler and set the object properties with a single findResource statement.
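As a minimal sketch of the configuration approach, assuming a parallel configuration has already been defined in the MATLAB client: the name 'myLSFConfig' below is hypothetical and stands in for whatever configuration you have created. The configuration is assumed to carry the scheduler type and the property values that would otherwise be set one at a time.

```matlab
% 'myLSFConfig' is a hypothetical configuration name (an assumption);
% substitute a configuration defined in your own MATLAB client.
sched = findResource('scheduler', 'configuration', 'myLSFConfig');

% Check that the configuration applied the expected property values.
get(sched, 'DataLocation')
get(sched, 'ClusterMatlabRoot')
```

Keeping these settings in a configuration avoids repeating the same set calls in every client session.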
If DataLocation is not set, the default location for job data is the current working directory of the MATLAB client the first time you use findResource to create an object for this type of scheduler. All settable property values on a scheduler object are local to the MATLAB client, and are lost when you close the client session or when you remove the object from the client workspace with delete or clear all.
Note: In a shared file system, all nodes require access to the directory specified in the scheduler object's DataLocation property. See the DataLocation reference page for information on setting this property for a mixed-platform environment.
You can look at all the property settings on the scheduler object. If no jobs are in the DataLocation directory, the Jobs property is a 0-by-1 array.
get(sched)
Type: 'lsf'
DataLocation: '\\apps\data\project_55'
HasSharedFilesystem: 1
Jobs: [0x1 double]
ClusterMatlabRoot: '\\apps\matlab\'
ClusterOsType: 'unix'
UserData: []
ClusterSize: Inf
ClusterName: 'CENTER_MATRIX_CLUSTER'
MasterName: 'masterhost.clusternet.ourdomain.com'
SubmitArguments: ''
ParallelSubmissionWrapperScript: [1x92 char]
Configuration: ''
You use the findResource function to identify the CCS scheduler and to create an object representing the scheduler in your local MATLAB client session.
You specify 'ccs' as the scheduler type for findResource to search for.
sched = findResource('scheduler','type','ccs')
You set properties on the scheduler object to specify
Where the job data is stored
That the workers should access job data directly in a shared file system
The MATLAB root for the workers to use
The operating system of the cluster
The name of the scheduler host
set(sched, 'DataLocation', '\\apps\data\project_106')
set(sched, 'HasSharedFilesystem', true)
set(sched, 'ClusterMatlabRoot', '\\apps\matlab\')
set(sched, 'ClusterOsType', 'pc')
set(sched, 'SchedulerHostname', 'server04')
Alternatively, you can use a parallel configuration to find the scheduler and set the object properties with a single findResource statement.
If DataLocation is not set, the default location for job data is the current working directory of the MATLAB client the first time you use findResource to create an object for this type of scheduler. All settable property values on a scheduler object are local to the MATLAB client, and are lost when you close the client session or when you remove the object from the client workspace with delete or clear all.
Note: In a shared file system, all nodes require access to the directory specified in the scheduler object's DataLocation property.
You can look at all the property settings on the scheduler object. If no jobs are in the DataLocation directory, the Jobs property is a 0-by-1 array.
get(sched)
Type: 'ccs'
DataLocation: '\\apps\data\project_106'
HasSharedFilesystem: 1
Jobs: [0x1 double]
ClusterMatlabRoot: '\\apps\matlab\'
ClusterOsType: 'pc'
UserData: []
ClusterSize: Inf
SchedulerHostname: 'server04'
Configuration: ''
You create a job with the createJob function, which creates a job object in the client session. The job data is stored in the directory specified by the scheduler object's DataLocation property.
j = createJob(sched)
This statement creates the job object j in the client session. Use get to see the properties of this job object.
get(j)
Name: 'Job1'
ID: 1
UserName: 'eng1'
Tag: ''
State: 'pending'
CreateTime: 'Fri Jul 29 16:15:47 EDT 2005'
SubmitTime: ''
StartTime: ''
FinishTime: ''
Tasks: [0x1 double]
FileDependencies: {0x1 cell}
PathDependencies: {0x1 cell}
JobData: []
Parent: [1x1 distcomp.lsfscheduler]
UserData: []
Configuration: ''
This output varies only slightly between jobs that use LSF and CCS schedulers, but is quite different from a job that uses a job manager. For example, jobs on LSF or CCS schedulers have no callback functions.
The job's State property is pending. This state means the job has not been queued for running yet. This new job has no tasks, so its Tasks property is a 0-by-1 array.
The scheduler's Jobs property is now a 1-by-1 array of distcomp.simplejob objects, indicating the existence of your job.
get(sched, 'Jobs')
Jobs: [1x1 distcomp.simplejob]
You can transfer files to the worker by using the FileDependencies property of the job object. Workers can access shared files by using the PathDependencies property of the job object. For details, see the FileDependencies and PathDependencies reference pages and Sharing Code.
Note: In a shared file system, MATLAB clients on many computers can access the same job data on the network. Properties of a particular job or task should be set from only one computer at a time.
After you have created your job, you can create tasks for the job. Tasks define the functions to be evaluated by the workers during the running of the job. Often, the tasks of a job are all identical except for different arguments or data. In this example, each task will generate a 3-by-3 matrix of random numbers.
createTask(j, @rand, 1, {3,3});
createTask(j, @rand, 1, {3,3});
createTask(j, @rand, 1, {3,3});
createTask(j, @rand, 1, {3,3});
createTask(j, @rand, 1, {3,3});
The Tasks property of j is now a 5-by-1 matrix of task objects.
get(j,'Tasks')
ans =
distcomp.simpletask: 5-by-1
Alternatively, you can create the five tasks with one call to createTask by providing a cell array of five cell arrays defining the input arguments to each task.
T = createTask(j, @rand, 1, {{3,3} {3,3} {3,3} {3,3} {3,3}});
In this case, T is a 5-by-1 matrix of task objects.
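When the number of tasks is large or only known at run time, the cell array of argument lists can be built programmatically instead of typed out. The following is a sketch only. It assumes the job object j created earlier, and the variable names numTasks and taskArgs are illustrative. Use it in place of the individual createTask calls, not in addition to them.

```matlab
% Build one {3,3} argument list per task (numTasks and taskArgs are
% illustrative names, not part of the toolbox).
numTasks = 5;
taskArgs = repmat({{3,3}}, 1, numTasks);

% One call to createTask creates all the tasks; j is the job object
% returned earlier by createJob(sched).
T = createTask(j, @rand, 1, taskArgs);
```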
To run your job and have its tasks evaluated, you submit the job to the scheduler's job queue.
submit(j)
The scheduler distributes the tasks of job j to MATLAB workers for evaluation. For each task, the scheduler starts a MATLAB worker session on a worker node; this MATLAB worker session runs for only as long as it takes to evaluate the one task. If the same node evaluates another task in the same job, it does so with a different MATLAB worker session.
The job runs asynchronously with the MATLAB client. If you need to wait for the job to complete before you continue in your MATLAB client session, you can use the waitForState function.
waitForState(j)
The default state to wait for is finished. This function causes MATLAB to pause until the State property of j is 'finished'.
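If the client should not block indefinitely, one option is to pass a state and a timeout to waitForState and then check the job's State property. This is a sketch that assumes the three-argument form waitForState(obj, 'state', timeout) is available in your release. The 60-second value is arbitrary.

```matlab
% Wait at most 60 seconds for job j to reach the 'finished' state.
% The timeout argument is an assumption about your toolbox release.
waitForState(j, 'finished', 60);

% Check whether the job finished or the wait timed out.
state = get(j, 'State');
if strcmp(state, 'finished')
    disp('Job finished.');
else
    fprintf('Job is still in state ''%s''.\n', state);
end
```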
Note: When you use an LSF scheduler in a nonshared file system, the scheduler might report that a job is in the finished state even though the LSF scheduler might not yet have completed transferring the job's files.
The results of each task's evaluation are stored in that task object's OutputArguments property as a cell array. Use getAllOutputArguments to retrieve the results from all the tasks in the job.
results = getAllOutputArguments(j);
Display the results from each task.
results{1:5}
0.9501 0.4860 0.4565
0.2311 0.8913 0.0185
0.6068 0.7621 0.8214

0.4447 0.9218 0.4057
0.6154 0.7382 0.9355
0.7919 0.1763 0.9169

0.4103 0.3529 0.1389
0.8936 0.8132 0.2028
0.0579 0.0099 0.1987

0.6038 0.0153 0.9318
0.2722 0.7468 0.4660
0.1988 0.4451 0.4186

0.8462 0.6721 0.6813
0.5252 0.8381 0.3795
0.2026 0.0196 0.8318

Because different machines evaluate the tasks of a job, each machine must have access to all the files needed to evaluate its tasks. The following sections explain the basic mechanisms for sharing data:
If all the workers have access to the same drives on the network, they can access needed files that reside on these shared resources. This is the preferred method for sharing data, as it minimizes network traffic.
You must define each worker session's path so that it looks for files in the correct places. You can define the path by
Using the job's PathDependencies property. This is the preferred method for setting the path, because it is specific to the job.
Putting the path command in any of the appropriate startup files for the worker:
matlabroot\toolbox\local\startup.m
matlabroot\toolbox\distcomp\user\jobStartup.m
matlabroot\toolbox\distcomp\user\taskStartup.m
These files can be passed to the worker by the job's FileDependencies or PathDependencies property. Otherwise, the version of each of these files that is used is the one highest on the worker's path.
A number of properties on task and job objects are for passing code or data from client to scheduler or worker, and back. This information could include M-code necessary for task evaluation, or the input data for processing or output data resulting from task evaluation. All these properties are described in detail in their own reference pages:
InputArguments — This property of each task contains the input data provided to the task constructor. This data gets passed into the function when the worker performs its evaluation.
OutputArguments — This property of each task contains the results of the function's evaluation.
JobData — This property of the job object contains data that gets sent to every worker that evaluates tasks for that job. This property works efficiently because depending on file caching, the data might be passed to a worker node only once per job, saving time if that node is evaluating more than one task for the job.
FileDependencies — This property of the job object lists all the directories and files that get zipped and sent to the workers. At the worker, the data is unzipped, and the entries defined in the property are added to the path of the MATLAB worker session.
PathDependencies — This property of the job object provides pathnames that are added to the MATLAB workers' path, reducing the need for data transfers in a shared file system.
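The job-level properties in this list can be set on the job object with set. The sketch below assumes the job object j created earlier. The directory, the file name, and the JobData contents are all hypothetical, and the properties are set here before the job is submitted.

```matlab
% All paths and data below are hypothetical examples.

% Directory on a shared file system that workers add to their path.
set(j, 'PathDependencies', {'/apps/shared/project_code'});

% File on the client that is zipped and sent to the workers.
set(j, 'FileDependencies', {'/home/eng1/myTaskFunction.m'});

% Data made available to every worker that evaluates tasks for this job.
set(j, 'JobData', struct('tolerance', 1e-6));

% Confirm the settings.
get(j, 'PathDependencies')
get(j, 'FileDependencies')
```

PathDependencies suits code that already lives on a shared drive, while FileDependencies suits files that exist only on the client machine.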
As a session of MATLAB, a worker session executes its startup.m file each time it starts. You can place the startup.m file in any directory on the worker's MATLAB path, such as toolbox/distcomp/user.
Three additional M-files can initialize and clean a worker session as it begins or completes evaluations of tasks for a job:
jobStartup.m automatically executes on a worker when the worker runs its first task of a job.
taskStartup.m automatically executes on a worker each time the worker begins evaluation of a task.
taskFinish.m automatically executes on a worker each time the worker completes evaluation of a task.
Empty versions of these files are provided in the directory
matlabroot/toolbox/distcomp/user
You can edit these files to include whatever M-code you want the worker to execute at the indicated times.
Alternatively, you can create your own versions of these M-files and pass them to the job as part of the FileDependencies property, or include the pathnames to their locations in the PathDependencies property.
The worker gives precedence to the versions provided in the FileDependencies property, then to those pointed to in the PathDependencies property. If any of these files is not included in these properties, the worker uses the version of the file in the toolbox/distcomp/user directory of the worker's MATLAB installation.
For further details on these M-files, see the jobStartup, taskStartup, and taskFinish reference pages.
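As an illustration, a customized taskStartup.m might look like the following sketch. It assumes the file takes the task object as its single input, as the empty version shipped with the toolbox is understood to do. The displayed message is purely illustrative.

```matlab
function taskStartup(task)
% TASKSTARTUP Sketch of a task initialization file.
% Runs on the worker each time it begins evaluating a task.
% The message below is illustrative only (an assumption, not
% toolbox-provided behavior).

disp(['Starting task ' num2str(get(task, 'ID'))]);
```

A file like this could be passed to the job through FileDependencies so that it takes precedence over the version in the worker's MATLAB installation.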
Objects that the client session uses to interact with the scheduler are only references to data that is actually contained in the directory specified by the DataLocation property. After jobs and tasks are created, you can shut down your client session, restart it, and your job will still be stored in that remote location. You can find existing jobs using the Jobs property of the recreated scheduler object.
The following sections describe how to access these objects and how to permanently remove them:
When you close the client session of Parallel Computing Toolbox software, all of the objects in the workspace are cleared. However, job and task data remains in the directory identified by DataLocation. When the client session ends, only its local reference objects are lost, not the data of the scheduler.
Therefore, if you have submitted your job to the scheduler job queue for execution, you can quit your client session of MATLAB, and the job will be executed by the scheduler. The scheduler maintains its job and task data. You can retrieve the job results later in another client session.
A client session of Parallel Computing Toolbox software can access any of the objects in the DataLocation, whether the current client session or another client session created these objects.
You create scheduler objects in the client session by using the findResource function.
sched = findResource('scheduler', 'type', 'LSF');
set(sched, 'DataLocation', '/apps/data/project_88');
When you have access to the scheduler by the object sched, you can create objects that reference all the data contained in the specified location for that scheduler. All the job and task data contained in the scheduler data location are accessible in the scheduler object's Jobs property, which is an array of job objects.
all_jobs = get(sched, 'Jobs')
You can index through the array all_jobs to locate a specific job.
Alternatively, you can use the findJob function to search in a scheduler object for a particular job identified by any of its properties, such as its State.
finished_jobs = findJob(sched, 'State', 'finished')
This command returns an array of job objects that reference all finished jobs on the scheduler sched, whose data is found in the specified DataLocation.
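Combining these calls, a new client session could collect the results of previously submitted jobs roughly as follows. This is a sketch that reuses the LSF scheduler type and the DataLocation from the example above. The loop variable names are illustrative.

```matlab
% Recreate the scheduler object and point it at the existing job data.
sched = findResource('scheduler', 'type', 'lsf');
set(sched, 'DataLocation', '/apps/data/project_88');

% Locate every finished job stored in that location.
finished_jobs = findJob(sched, 'State', 'finished');

% Retrieve and summarize the results of each finished job.
for k = 1:numel(finished_jobs)
    job_k = finished_jobs(k);
    results = getAllOutputArguments(job_k);  % one row per task
    fprintf('Job %d (%s): %d task result(s)\n', ...
        get(job_k, 'ID'), get(job_k, 'Name'), size(results, 1));
end
```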
Jobs in the scheduler continue to exist even after they are finished. From the command line in the MATLAB client session, you can call the destroy function for any job object. If you destroy a job, you destroy all tasks contained in that job. The job and task data is deleted from the DataLocation directory.
For example, find and destroy all finished jobs in your scheduler whose data is stored in a specific directory.
sched = findResource('scheduler', 'type', 'LSF');
set(sched, 'DataLocation', '/apps/data/project_88');
finished_jobs = findJob(sched, 'State', 'finished');
destroy(finished_jobs);
clear finished_jobs
The destroy function in this example permanently removes from the scheduler data those finished jobs whose data is in /apps/data/project_88. The clear function removes the object references from the local MATLAB client workspace.
© 1984-2008 The MathWorks, Inc.