Parallel Computing Toolbox™
If you already have a coarse-grained application to run but do not want the overhead of defining jobs and tasks, you can take advantage of the ease of use that interactive parallel mode (pmode) provides. Where an existing program might take hours or days to process all of its independent data sets, you can shorten that time by distributing these independent computations over your cluster.
For example, suppose you have the following serial code:
results = zeros(1, numDataSets);
for i = 1:numDataSets
    load(['\\central\myData\dataSet' int2str(i) '.mat'])
    results(i) = processDataSet(i);
end
plot(1:numDataSets, results);
save \\central\myResults\today.mat results

The following changes make this code operate in parallel, either interactively in pmode or in a parallel job:
results = zeros(1, numDataSets, distributor());
for i = drange(1:numDataSets)
    load(['\\central\myData\dataSet' int2str(i) '.mat'])
    results(i) = processDataSet(i);
end
res = gather(results, 1);
if labindex == 1
    plot(1:numDataSets, res);
    print -dtiff -r300 fig.tiff;
    save \\central\myResults\today.mat res
end

Note that the number of loop iterations and the length of the distributed array results must match so that you can index into results within a for-drange loop. This way, no communication is required between the labs. If results were simply a replicated array, as it would have been when running the original code in parallel, each lab would have assigned into its own part of results, leaving the remaining parts of results 0. At the end, results would have been a variant, and without explicitly calling labSend and labReceive or gcat, there would be no way to get the total results back to one (or all) labs.
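To make the replicated-array problem concrete, here is a serial sketch in plain MATLAB with no toolbox calls. It assumes four labs, eight data sets, and a contiguous split of the loop iterations across labs. The names perLab and owner, and the i^2 stand-in for processDataSet(i), are hypothetical and chosen only for illustration.

```matlab
% Serial sketch only: mimics what each lab would hold if results were
% replicated instead of distributed. Nothing here runs in parallel.
numLabs = 4;                               % assumed number of labs
numDataSets = 8;                           % assumed number of data sets
itersPerLab = numDataSets / numLabs;       % assumes an even, contiguous split
owner = ceil((1:numDataSets) / itersPerLab);   % owner(i): lab that runs iteration i

perLab = zeros(numLabs, numDataSets);      % row k stands in for lab k's private copy
for i = 1:numDataSets
    perLab(owner(i), i) = i^2;             % i^2 is a stand-in for processDataSet(i)
end
disp(perLab)                               % each row is nonzero only in its own iterations

% Merging the copies is the step that would require communication between labs.
combined = sum(perLab, 1);
```

Each row of perLab is a different, mostly zero copy of results, which is what the text above calls a variant. Only the merge in the last line recovers the full vector, and in a real parallel run that merge is exactly the communication the distributed array and gather handle for you.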
When using the load function, make sure that the data files are accessible to every lab that needs them. The best practice is to use explicit paths to files on a shared file system.
Correspondingly, when using the save function, make sure that only one lab at a time saves to a particular file on a shared file system. Wrapping the code in if labindex == 1 is therefore recommended.
Because results is distributed across the labs, this example uses gather to collect the data onto lab 1.
A lab cannot plot a visible figure, so the print function creates a viewable file of the plot.
When a for-loop over a distributed range is executed in a parallel job, each lab performs its portion of the loop, so that the labs are all working simultaneously. Because of this, no communication is allowed between the labs while executing a for-drange loop. In particular, a lab has access only to its partition of a distributed array. Any calculations in such a loop that require a lab to access portions of a distributed array from another lab will generate an error.
To illustrate this characteristic, you can try the following example, in which one for loop works, but the other does not.
At the pmode prompt, create two distributed arrays, one an identity matrix, the other set to zeros, distributed across four labs.
D = eye(8, 8, distributor())
E = zeros(8, 8, distributor())
By default, these arrays are distributed by columns; that is, each of the four labs contains two columns of each array. If you use these arrays in a for-drange loop, any calculations must be self-contained within each lab. In other words, you can only perform calculations that are limited within each lab to the two columns of the arrays that the labs contain.
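As a rough illustration of that layout, the following serial sketch (plain MATLAB, no toolbox calls) lists which columns each lab would hold when eight columns are split evenly over four labs. The names numLabs, numCols, colsPerLab, and owner are made up for this illustration, and the even split is an assumption that holds here because 8 divides evenly by 4.

```matlab
% Serial sketch only: mimics an even column split. It does not query a
% real distributed array.
numLabs = 4;                               % assumed number of labs, as in the example
numCols = 8;                               % number of columns in D and E
colsPerLab = numCols / numLabs;            % 2 columns per lab (assumes an even split)
owner = ceil((1:numCols) / colsPerLab);    % owner(j): lab that would hold column j

for lab = 1:numLabs
    fprintf('lab %d holds columns %s\n', lab, mat2str(find(owner == lab)));
end
```

Under these assumptions, a loop iteration for column j is self-contained only if every column it reads or writes has the same owner as column j.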
For example, suppose you want to set each column of array E to some multiple of the corresponding column of array D:
for j = drange(1:size(D,2)); E(:,j) = j*D(:,j); end
This statement sets the j-th column of E to j times the j-th column of D. In effect, while D is an identity matrix with 1s down the main diagonal, E has the sequence 1, 2, 3, etc., down its main diagonal.
This works because each lab has access to the entire column of D and the entire column of E necessary to perform the calculation, as each lab works independently and simultaneously on two of the eight columns.
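If you want to check the expected result without a cluster, a serial equivalent of the same loop can be run in ordinary MATLAB. This is only a sketch with regular (non-distributed) arrays and a regular for-loop, so it shows the values E should end up with, not the parallel behavior.

```matlab
% Serial equivalent of the working loop: ordinary arrays, ordinary for-loop.
D = eye(8);
E = zeros(8);
for j = 1:size(D,2)
    E(:,j) = j*D(:,j);     % column j of E depends only on column j of D
end

% E should have 1 through 8 on its main diagonal and zeros elsewhere.
matches = isequal(E, diag(1:8));
disp(matches)
```

Because each iteration touches only column j of both arrays, the iterations are independent, which is the property the for-drange loop relies on.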
Suppose, however, that you attempt to set the values of the columns of E according to different columns of D:
for j = drange(1:size(D,2)); E(:,j) = j*D(:,j+1); end
This method fails, because when j is 2, you are trying to set the second column of E using the third column of D. These columns are stored in different labs, so an error occurs, indicating that communication between the labs is not allowed.
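The same reasoning can be applied to every iteration. The serial sketch below (plain MATLAB, no toolbox calls) assumes the even split of eight columns over four labs described earlier and reports which values of j would need a column held by a different lab. The helper owner and the variable crossing are hypothetical names used only for this illustration.

```matlab
% Serial sketch only: finds the iterations of the failing loop that would
% read a column of D held by a different lab than the column of E they write.
numLabs = 4;                               % assumed number of labs
numCols = 8;                               % number of columns in D and E
colsPerLab = numCols / numLabs;            % assumes an even split
owner = @(c) ceil(c / colsPerLab);         % hypothetical helper: lab holding column c

j = 1:numCols-1;                           % j = 8 is skipped: D has no column 9
crossing = j(owner(j) ~= owner(j+1));      % iterations that cross a lab boundary
disp(crossing)
```

Under these assumptions the crossing iterations are the last column on each lab, so j = 2 is simply the first one encountered.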
© 1984-2008 The MathWorks, Inc.