This section details the steps of a typical programming session with Parallel Computing Toolbox software using a supported job scheduler on a cluster. Supported schedulers include the MATLAB job scheduler (MJS), Platform LSF (Load Sharing Facility), Microsoft Windows HPC Server (including CCS), PBS Pro, or a TORQUE scheduler.
This section assumes you have an MJS, LSF, PBS Pro, TORQUE, or Windows HPC Server (including CCS and HPC Server 2008) scheduler installed and running on your network. For more information about LSF, see http://www.platform.com/Products/. For more information about Windows HPC Server, see http://www.microsoft.com/hpc. With all of these cluster types, the basic job programming sequence is the same: define and select a cluster profile, find the cluster, create a job, create its tasks, submit the job to the job queue, and retrieve the job's results.
Note that the objects that the client session uses to interact with the MJS are only references to data that is actually contained in the MJS, not in the client session. After jobs and tasks are created, you can close your client session and restart it, and your job is still stored in the MJS. You can find existing jobs using the findJob function or the Jobs property of the MJS cluster object.
A cluster profile identifies the type of cluster to use and its specific properties. In a profile, you define how many workers a job can access, where the job data is stored, where MATLAB is accessed, and many other cluster properties. The exact properties are determined by the type of cluster.
The steps in this section all assume that the profile named MyProfile identifies the cluster you want to use, with all necessary property settings. With the proper use of a profile, the rest of the programming is the same, regardless of cluster type. After you define or import your profile, you can set it as the default profile in the Profile Manager GUI, or with the command:
parallel.defaultClusterProfile('MyProfile')
A few notes regarding different cluster types and their properties:
Notes
In a shared file system, all nodes require access to the folder specified in the cluster object's JobStorageLocation property.
Because Windows HPC Server requires a shared file system, all nodes require access to the folder specified in the cluster object's JobStorageLocation property.
In a shared file system, MATLAB clients on many computers can access the same job data on the network. Properties of a particular job or task should be set from only one client computer at a time.
When you use an LSF scheduler in a nonshared file system, the scheduler might report that a job is in the finished state even though the LSF scheduler might not yet have completed transferring the job's files.
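Before changing the default, it can help to confirm that the profile exists on the client. The following is a minimal sketch, assuming MyProfile has been defined or imported as described above; parallel.clusterProfiles is a toolbox function not otherwise mentioned in this section.

```matlab
% List the profiles known to this client and the current default profile.
[allProfiles, defaultProfile] = parallel.clusterProfiles();
disp(allProfiles);
fprintf('Current default profile: %s\n', defaultProfile);

% Switch the default only if MyProfile is actually available.
if any(strcmp('MyProfile', allProfiles))
    previousDefault = parallel.defaultClusterProfile('MyProfile');
    fprintf('Default changed from %s to MyProfile\n', previousDefault);
else
    warning('MyProfile is not defined. Create or import it in the Profile Manager first.');
end
```

Checking first avoids an error from parallel.defaultClusterProfile when the named profile is missing.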
You use the parcluster function to identify a cluster and to create an object representing the cluster in your local MATLAB session.
To find a specific cluster, use the cluster profile to match the properties of the cluster you want to use. In this example, MyProfile is the name of the profile that defines the specific cluster.
c = parcluster('MyProfile')
c =
MJS Cluster Information
=======================
Profile: MyProfile
Modified: false
Host: node345
NumWorkers: 1
JobStorageLocation: Database on node345
ClusterMatlabRoot: C:\apps\matlab
OperatingSystem: windows
- Assigned Jobs
Number Pending: 0
Number Queued: 0
Number Running: 0
Number Finished: 0
- MJS Specific Properties
Name: my_mjs
AllHostAddresses: 0:0:0:0
NumBusyWorkers: 0
NumIdleWorkers: 1
Username: mylogin
SecurityLevel: 0 (No security)
IsUsingSecureCommunication: false
You can view all the accessible properties of the cluster object with the get function:
get(c)
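To read a single property rather than the full listing, you can name it directly. This is a small sketch that uses property names taken from the display above; the variable names are illustrative only.

```matlab
c = parcluster('MyProfile');

% Read one property with dot notation, another with get.
numWorkers = c.NumWorkers;
storage = get(c, 'JobStorageLocation');

fprintf('Cluster has %d worker(s)\n', numWorkers);
disp(storage);   % for an MJS this is a description such as "Database on node345"
```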
You create a job with the createJob function. Although this command executes in the client session, it actually creates the job on the cluster, c, and creates a job object, job1, in the client session.
job1 = createJob(c)
Job ID 91 Information
=====================
Type: Independent
Username: mylogin
State: pending
SubmitTime:
StartTime:
Running Duration: 0 days 0h 0m 0s
- Data Dependencies
AttachedFiles: {}
AdditionalPaths: {}
- Associated Task(s)
Number Pending: 0
Number Running: 0
Number Finished: 0
Task ID of Errors: []
Use get to see all the accessible properties of this job object.
get(job1)
Note that the job's State property is pending. This means the job has not been queued for running yet, so you can now add tasks to it.
The cluster's display now includes one pending job, as shown in this partial listing:
c
MJS Cluster Information
=======================
- Assigned Jobs
Number Pending: 1
Number Queued: 0
Number Running: 0
Number Finished: 0
You can transfer files to the worker by using the AttachedFiles property of the job object. For details, see Share Code.
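As a rough illustration of AttachedFiles, the sketch below attaches one file and one folder to a new job. The names myTaskFunction.m and C:\work\lookupTables are hypothetical placeholders; substitute files and folders that exist on your client machine.

```matlab
c = parcluster('MyProfile');
job2 = createJob(c);

% Hypothetical file and folder names. Both are sent to each worker
% that evaluates tasks for this job.
job2.AttachedFiles = {'myTaskFunction.m', 'C:\work\lookupTables'};

% Confirm what will be transferred.
disp(job2.AttachedFiles);
```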
After you have created your job, you can create tasks for the job using the createTask function. Tasks define the functions to be evaluated by the workers during the running of the job. Often, the tasks of a job are all identical. In this example, each task will generate a 3-by-3 matrix of random numbers.
createTask(job1, @rand, 1, {3,3});
createTask(job1, @rand, 1, {3,3});
createTask(job1, @rand, 1, {3,3});
createTask(job1, @rand, 1, {3,3});
createTask(job1, @rand, 1, {3,3});
The Tasks property of job1 is now a 5-by-1 matrix of task objects.
get(job1,'Tasks')
MJSTask: 5-by-1
================
# ID State FinishTime Function Error
-----------------------------------------------------
1 1 pending @rand
2 2 pending @rand
3 3 pending @rand
4 4 pending @rand
5 5 pending @rand
Alternatively, you can create the five tasks with one call to createTask by providing a cell array of five cell arrays defining the input arguments to each task.
T = createTask(job1, @rand, 1, {{3,3} {3,3} {3,3} {3,3} {3,3}});
In this case, T is a 5-by-1 matrix of task objects.
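The same single-call form also works when the tasks take different inputs. The sketch below, which uses a separate job (job3, an illustrative name) so that it does not add tasks to job1, builds the outer cell array in a loop so that each task generates a matrix of a different size.

```matlab
c = parcluster('MyProfile');
job3 = createJob(c);

% One inner cell array of input arguments per task: sizes 2, 3, 4, and 5.
sizes = 2:5;
argLists = cell(1, numel(sizes));
for k = 1:numel(sizes)
    argLists{k} = {sizes(k), sizes(k)};   % inputs to rand for task k
end

% A single call creates numel(sizes) tasks.
T3 = createTask(job3, @rand, 1, argLists);
```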
To run your job and have its tasks evaluated, you submit the job to the job queue with the submit function.
submit(job1)
The job manager distributes the tasks of job1 to its registered workers for evaluation.
Each worker performs the following steps for task evaluation:
Receive AttachedFiles and AdditionalPaths from the job. Place files and modify the path accordingly.
Run the jobStartup function the first time the worker evaluates a task for this job. You can specify this function in AttachedFiles or AdditionalPaths. When using an MJS, if the same worker evaluates subsequent tasks for this job, jobStartup does not run between tasks.
Run the taskStartup function. You can specify this function in AttachedFiles or AdditionalPaths. This runs before every task evaluation that the worker performs, so it could occur multiple times on a worker for each job.
If the worker is part of forming a new MATLAB pool, run the poolStartup function. (This occurs when executing matlabpool open or when running other types of jobs that form and use a MATLAB pool.)
Receive the task function and arguments for evaluation.
Evaluate the task function, placing the result in the task's OutputArguments property. Any error information goes in the task's Error property.
Run the taskFinish function.
The results of each task's evaluation are stored in that task object's OutputArguments property as a cell array. Use the function fetchOutputs to retrieve the results from all the tasks in the job.
wait(job1)
results = fetchOutputs(job1);
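Because error information is stored on each task, you might want to inspect the tasks before fetching outputs. The following is a sketch that continues with job1 from above; it assumes the task objects expose an ErrorMessage property alongside the Error property described earlier.

```matlab
wait(job1);                        % block until the job reaches the finished state

tasks = job1.Tasks;
hasError = false;
for k = 1:numel(tasks)
    msg = tasks(k).ErrorMessage;   % empty when the task completed without error
    if ~isempty(msg)
        hasError = true;
        fprintf('Task %d failed: %s\n', tasks(k).ID, msg);
    end
end

% Fetch the outputs only when every task succeeded.
if ~hasError
    results = fetchOutputs(job1);
end
```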
Display the results from each task.
results{1:5}
0.9501 0.4860 0.4565
0.2311 0.8913 0.0185
0.6068 0.7621 0.8214
0.4447 0.9218 0.4057
0.6154 0.7382 0.9355
0.7919 0.1763 0.9169
0.4103 0.3529 0.1389
0.8936 0.8132 0.2028
0.0579 0.0099 0.1987
0.6038 0.0153 0.9318
0.2722 0.7468 0.4660
0.1988 0.4451 0.4186
0.8462 0.6721 0.6813
0.5252 0.8381 0.3795
0.2026 0.0196 0.8318
Because the tasks of a job are evaluated on different machines, each machine must have access to all the files needed to evaluate its tasks. The basic mechanisms for sharing code are explained in the following sections:
If the workers all have access to the same drives on the network, they can access the necessary files that reside on these shared resources. This is the preferred method for sharing data, as it minimizes network traffic.
You must define each worker session's search path so that it looks for files in the right places. You can define the path:
By using the job's AdditionalPaths property. This is the preferred method for setting the path, because it is specific to the job.
By putting the path command in any of the appropriate startup files for the worker:
matlabroot\toolbox\local\startup.m
matlabroot\toolbox\distcomp\user\jobStartup.m
matlabroot\toolbox\distcomp\user\taskStartup.m
These files can be passed to the worker by the job's AttachedFiles or AdditionalPaths property. Otherwise, the version of each of these files that is used is the one highest on the worker's path.
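For the job-specific approach, a minimal sketch looks like the following. The UNC path \\fileserver\projects\shared_code is a hypothetical location that every worker is assumed to be able to read.

```matlab
c = parcluster('MyProfile');
job4 = createJob(c);

% Hypothetical shared folder. Workers add it to their search path
% for this job only; no files are copied.
job4.AdditionalPaths = {'\\fileserver\projects\shared_code'};
```

Because only the path is passed, this avoids transferring the code itself when the workers share a file system with the client.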
Access to files among shared resources can depend upon permissions based on the user name. You can set the user name with which the MJS and worker services of MATLAB Distributed Computing Server software run by setting the MDCEUSER value in the mdce_def file before starting the services. For Microsoft Windows operating systems, there is also MDCEPASS for providing the account password for the specified user. For an explanation of service default settings and the mdce_def file, see Define Script Defaults in the MATLAB Distributed Computing Server System Administrator's Guide.
A number of properties on task and job objects are designed for passing code or data from client to MJS to worker, and back. This information could include MATLAB code necessary for task evaluation, or the input data for processing or output data resulting from task evaluation. The following properties facilitate this communication:
InputArguments — This property of each task contains the input data provided to the task constructor. This data gets passed into the function when the worker performs its evaluation.
OutputArguments — This property of each task contains the results of the function's evaluation.
JobData — This property of the job object contains data that gets sent to every worker that evaluates tasks for that job. This property works efficiently because the data is passed to a worker only once per job, saving time if that worker is evaluating more than one task for the job.
AttachedFiles — This property of the job object lists all the folders and files that get zipped and sent to the workers. On the worker, the data is unzipped, and the entries defined in the property are added to the search path of the MATLAB worker session.
AdditionalPaths — This property of the job object provides paths that are added to the MATLAB workers' search path, reducing the need for data transfers in a shared file system.
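As an illustration of JobData, the sketch below shares one scale factor with every task. It assumes the task function can reach the job through getCurrentJob, a toolbox function not mentioned above; the field name scale and the variable names are made up for this example.

```matlab
c = parcluster('MyProfile');
job5 = createJob(c);

% Data sent to each worker once per job, however many tasks it evaluates.
job5.JobData = struct('scale', 10);

% Each task multiplies its own input by the shared scale factor.
scaleFcn = @(x) x * getfield(get(getCurrentJob(), 'JobData'), 'scale');
createTask(job5, scaleFcn, 1, {{1} {2} {3}});

submit(job5);
wait(job5);
scaled = fetchOutputs(job5);
```

Per-task values travel through InputArguments, while the value common to all tasks travels through JobData.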
There is a default maximum amount of data that can be sent in a single call for setting properties. This limit applies to the OutputArguments property as well as to data passed into a job as input arguments or AttachedFiles. If the limit is exceeded, you get an error message. For more information about this data transfer size limit, see Object Data Size Limitations.
As a session of MATLAB, a worker session executes its startup.m file each time it starts. You can place the startup.m file in any folder on the worker's MATLAB search path, such as toolbox/distcomp/user.
These additional files can initialize and clean up a worker session as it begins or completes evaluations of tasks for a job:
jobStartup.m automatically executes on a worker when the worker runs its first task of a job.
taskStartup.m automatically executes on a worker each time the worker begins evaluation of a task.
poolStartup.m automatically executes on a worker each time the worker is included in a newly started MATLAB pool.
taskFinish.m automatically executes on a worker each time the worker completes evaluation of a task.
Empty versions of these files are provided in the folder:
matlabroot/toolbox/distcomp/user
You can edit these files to include whatever MATLAB code you want the worker to execute at the indicated times.
Alternatively, you can create your own versions of these files and pass them to the job as part of the AttachedFiles property, or include the path names to their locations in the AdditionalPaths property.
The worker gives precedence to the versions provided in the AttachedFiles property, then to those pointed to in the AdditionalPaths property. If any of these files is not included in these properties, the worker uses the version of the file in the toolbox/distcomp/user folder of the worker's MATLAB installation.
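A custom version of one of these files can be very short. The following is a sketch of a taskStartup.m, assuming the function receives the task object as its input in the same way as the empty version shipped in toolbox/distcomp/user.

```matlab
function taskStartup(task)
% Sketch of a custom taskStartup.m.
% Runs on the worker before each task evaluation.
disp(['Worker starting task with ID ' num2str(task.ID)]);
end
```

To use it for a single job, save it as taskStartup.m and list it in that job's AttachedFiles property, or put its folder in AdditionalPaths.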
Because all the data of jobs and tasks resides in the cluster job storage location, these objects continue to exist even if the client session that created them has ended. The following sections describe how to access these objects and how to permanently remove them:
When you close the client session of Parallel Computing Toolbox software, all of the objects in the workspace are cleared. However, the objects in MATLAB Distributed Computing Server software or other cluster resources remain in place. When the client session ends, only the local reference objects are lost, not the actual objects in the cluster.
Therefore, if you have submitted your job to the job queue for execution, you can quit your client session of MATLAB, and the job will be executed by the cluster. You can retrieve the job results later in another client session.
A client session of Parallel Computing Toolbox software can access any of the objects in MATLAB Distributed Computing Server software, whether the current client session or another client session created these objects.
You create cluster objects in the client session by using the parcluster function.
c = parcluster('MyProfile');
When you have access to the MJS cluster through the object c, you can create objects that reference all the objects contained in that MJS. All the jobs contained in the MJS are accessible in the cluster object's Jobs property, which is an array of job objects:
all_jobs = get(c,'Jobs')
You can index through the array all_jobs to locate a specific job.
Alternatively, you can use the findJob function to search in a cluster for any jobs or a particular job identified by any of its properties, such as its State.
all_jobs = findJob(c);
finished_jobs = findJob(c,'State','finished')
This command returns an array of job objects that reference all finished jobs on the MJS cluster c.
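In a new client session, you can use the same function to pick up one particular job and collect its results. This sketch assumes the job with ID 91 created earlier is still stored on the cluster.

```matlab
c = parcluster('MyProfile');

% Look up the job by its ID property.
oldJob = findJob(c, 'ID', 91);

if isempty(oldJob)
    disp('No job with that ID is stored on the cluster.');
elseif strcmp(oldJob.State, 'finished')
    results = fetchOutputs(oldJob);
else
    fprintf('Job %d is still %s\n', oldJob.ID, oldJob.State);
end
```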
When restarting a client session, you lose the settings of any callback properties (for example, the FinishedFcn property) on jobs or tasks. These properties are commonly used to get notifications in the client session of state changes in their objects. When you create objects in a new client session that reference existing jobs or tasks, you must reset these callback properties if you intend to use them.
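Resetting a callback might look like the sketch below, which applies to an MJS cluster. It assumes the callback is invoked with the job object and an event data argument; the message text is illustrative.

```matlab
c = parcluster('MyProfile');

% Jobs that have not finished yet can still trigger a FinishedFcn callback.
activeJobs = [findJob(c, 'State', 'queued'); findJob(c, 'State', 'running')];

for k = 1:numel(activeJobs)
    % Assumed callback signature: (job object, event data).
    activeJobs(k).FinishedFcn = @(job, eventData) ...
        fprintf('Job %d finished\n', job.ID);
end
```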
Jobs in the cluster continue to exist even after they are finished, and after the MJS is stopped and restarted. The ways to permanently remove jobs from the job manager are explained in the following sections:
Delete Selected Objects. From the command line in the MATLAB client session, you can call the delete function for any job or task object. If you delete a job, you also remove all tasks contained in that job.
For example, find and delete all finished jobs in your cluster that belong to the user joep.
c = parcluster('MyProfile')
finished_jobs = findJob(c,'State','finished','Username','joep')
delete(finished_jobs)
clear finished_jobs
The delete function permanently removes these jobs from the cluster. The clear function removes the object references from the local MATLAB workspace.
Start an MJS from a Clean State. When an MJS starts, by default it starts so that it resumes its former session with all jobs intact. Alternatively, an MJS can start from a clean state with all its former history deleted. Starting from a clean state permanently removes all job and task data from the MJS of the specified name on a particular host.
As a network administration feature, the -clean flag of the startjobmanager script is described in Start in a Clean State in the MATLAB Distributed Computing Server System Administrator's Guide.
© 1984-2012 The MathWorks, Inc.