

Further Notes on Parallel Jobs

Number of Tasks in a Parallel Job

Although you create only one task for a parallel job, the system copies this task for each worker that runs the job. For example, if a parallel job runs on four workers (labs), the Tasks property of the job contains four task objects. The first task in the job's Tasks property corresponds to the task run by the lab whose labindex is 1, and so on, so that the ID property for the task object and labindex for the lab that ran that task have the same value. Therefore, the sequence of results returned by the getAllOutputArguments function corresponds to the value of labindex and to the order of tasks in the job's Tasks property.
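The correspondence between tasks, labindex, and results can be checked with a small job in which each lab returns its own labindex. This is a minimal sketch, not a definitive recipe: it assumes a local scheduler that can start four workers, and the scheduler lookup line is an assumption that you would replace with your own scheduler or job manager.

```matlab
% Assumption: a local scheduler with four available workers.
sched = findResource('scheduler', 'type', 'local');

pjob = createParallelJob(sched);
set(pjob, 'MinimumNumberOfWorkers', 4);
set(pjob, 'MaximumNumberOfWorkers', 4);

% Only one task is created; each lab runs its own copy of it
% and returns its labindex as the single output argument.
createTask(pjob, @labindex, 1, {});

submit(pjob);
waitForState(pjob);

% After the job runs, the Tasks property holds one task object per lab.
numTasks = numel(get(pjob, 'Tasks'));

% One row per task, in the same order as the Tasks property,
% so row k holds the value returned by the lab whose labindex is k.
results = getAllOutputArguments(pjob);

destroy(pjob);
```

Because the task function is labindex itself, each row of results should hold its own row number, which makes the ordering easy to confirm.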

Avoiding Deadlock and Other Dependency Errors

Because code running in one lab of a parallel job can block execution until corresponding code executes on another lab, parallel jobs can deadlock. This is most likely to occur when transferring data between labs or when an if statement makes code dependent on labindex. The following examples illustrate common pitfalls.

Suppose you have a codistributed array D, and you want to use the gather function to assemble the entire array in the workspace of a single lab. A first attempt might restrict the call to that lab:

if labindex == 1
    assembled = gather(D);
end

This fails because the gather function requires communication between all the labs across which the array is distributed. When the if statement limits execution to a single lab, the other labs never execute the call, so the communication that gather depends on cannot complete. Instead, call gather on every lab and let its second argument select the lab that receives the data: assembled = gather(D, 1).
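The corrected pattern can be sketched as follows. The sketch assumes it runs inside a parallel job where the codistributed array D from the text already exists; the variable total and the sum are hypothetical, included only to show where labindex-dependent code is safe.

```matlab
% Every lab executes this line, so the communication that gather
% needs can complete. Only lab 1 receives the full array.
assembled = gather(D, 1);

% Code that depends on labindex is safe here because it involves
% no communication between labs.
if labindex == 1
    total = sum(assembled(:));   % hypothetical use of the gathered data
end
```

The design point is to keep communicating calls outside labindex conditionals and to place only purely local work inside them.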

In another example, suppose you want to transfer data from every lab to the next lab on the right (defined as the lab with the next higher labindex, wrapping around from the last lab to the first). First, define for each lab which labs are on its left and right.

from_lab_left = mod(labindex - 2, numlabs) + 1;
to_lab_right  = mod(labindex, numlabs) + 1;
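To see how the modulo arithmetic wraps around the ring, the same two expressions can be evaluated for every lab at once in an ordinary MATLAB session. The names numlabs_demo, idx, from_left, and to_right are hypothetical stand-ins used only for this illustration, and the lab count of four is an assumption.

```matlab
numlabs_demo = 4;            % hypothetical number of labs
idx = 1:numlabs_demo;        % stand-in for labindex on each lab

% Same formulas as above, applied to all lab indices at once.
from_left = mod(idx - 2, numlabs_demo) + 1;   % [4 1 2 3]
to_right  = mod(idx, numlabs_demo) + 1;       % [2 3 4 1]

% Rows: lab index, lab on its left, lab on its right.
disp([idx; from_left; to_right]);
```

Lab 1 receives from lab 4 and lab 4 sends to lab 1, which closes the ring.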

Then try to pass data around the ring.

labSend(outdata, to_lab_right);
indata = labReceive(from_lab_left);

This code might fail because, depending on the size of the data being transferred, the labSend function can block execution in a lab until the corresponding receiving lab executes its labReceive function. Here all the labs attempt to send at the same time, so none of them reach their labReceive statements: each one is blocked at labSend, waiting for a receiver that is itself blocked. To avoid this particular problem, use the labSendReceive function, which sends and receives in a single call.
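A minimal sketch of the ring transfer using labSendReceive follows. It assumes the code runs on every lab of a parallel job; the payload assigned to outdata is hypothetical and chosen so that the received value is easy to check.

```matlab
% Neighbors in the ring, as defined earlier in this section.
from_lab_left = mod(labindex - 2, numlabs) + 1;
to_lab_right  = mod(labindex, numlabs) + 1;

% Hypothetical payload: each lab sends its own index.
outdata = labindex;

% Send to the right and receive from the left in one call, so no lab
% sits in a blocking send while its neighbor is also stuck sending.
indata = labSendReceive(to_lab_right, from_lab_left, outdata);
```

With this payload, each lab should end up with indata equal to the index of the lab on its left.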

  



© 1984-2009 The MathWorks, Inc.