Nested parfor and for-Loops and Other parfor Requirements
You cannot use a
parfor-loop inside another
parfor-loop. As an example, the following nesting of
parfor-loops is not allowed:
    parfor i = 1:10
        parfor j = 1:5
            ...
        end
    end
You cannot nest parfor directly within another parfor-loop. A parfor-loop can call a function that contains a parfor-loop, but you do not get any additional parallelism.
Code Analyzer in the MATLAB® Editor flags the use of parfor inside another parfor-loop.
You cannot nest parfor-loops because parallelization can be performed at only one level. Therefore, choose which loop to run in parallel, and convert the other loop to a for-loop.
Consider the following performance issues when dealing with nested loops:
Parallel processing incurs overhead. Generally, you should run the
outer loop in parallel, because overhead only occurs once. If you run
the inner loop in parallel, then each of the multiple
parfor executions incurs an overhead. See Convert Nested for-Loops to parfor-Loops
for an example of how to measure parallel overhead.
Make sure that the number of iterations exceeds the number of workers. Otherwise, you do not use all available workers.
Try to balance the parfor-loop iteration times. parfor tries to compensate for some load imbalance.
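The worker-count consideration above can be checked before you decide which loop to parallelize. The following is a minimal sketch, not a prescribed method: nOuter is a hypothetical iteration count, and the check only reports what it finds.

```matlab
% Compare the outer-loop iteration count with the size of the current pool.
nOuter = 100;                   % hypothetical number of outer-loop iterations
pool = gcp('nocreate');         % returns an empty array if no pool is running
if isempty(pool)
    fprintf('No parallel pool is running.\n');
elseif nOuter < pool.NumWorkers
    fprintf('Only %d of %d workers would receive work.\n', nOuter, pool.NumWorkers);
else
    fprintf('%d iterations are available for %d workers.\n', nOuter, pool.NumWorkers);
end
```

Using gcp('nocreate') avoids starting a pool as a side effect of the check.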
Always run the outermost loop in parallel, because you reduce parallel overhead.
You can also use a function that uses
parfor and embed it in a
parfor-loop. Parallelization occurs only at the outer level.
In the following example, call a function MyFun.m inside the outer parfor-loop. The inner parfor-loop embedded in MyFun.m runs sequentially, not in parallel.
    parfor i = 1:10
        MyFun(i)
    end

    function MyFun(i)
        parfor j = 1:5
            ...
        end
    end
Nested parfor-loops generally give you no computational benefit.
A typical use of nested loops is to step through an array using one loop variable to index one dimension, and a nested loop variable to index another dimension. The basic form is:
    X = zeros(n,m);
    for a = 1:n
        for b = 1:m
            X(a,b) = fun(a,b);
        end
    end
The following code shows a simple example. Use tic and toc to measure the computing time needed.
    A = 100;
    tic
    for i = 1:100
        for j = 1:100
            a(i,j) = max(abs(eig(rand(A))));
        end
    end
    toc

    Elapsed time is 49.376732 seconds.
You can parallelize either of the nested loops, but you cannot run both in parallel. The reason is that the workers in a parallel pool cannot start or access further parallel pools.
If the loop counted by i is converted to a parfor-loop, then each worker in the pool executes the nested loops using the j loop counter. The j loops themselves cannot run as a parfor on each worker.
Because parallel processing incurs overhead, you must choose carefully whether you want to convert the inner or the outer for-loop to a parfor-loop. The following example shows how to measure the parallel overhead.
First convert only the outer for-loop to a parfor-loop. Use tic and toc to measure the computing time needed. Use ticBytes and tocBytes to measure how much data is transferred to and from the workers in the parallel pool.
Run the new code, and run it again. The first run is slower than subsequent runs, because the parallel pool takes some time to start and make the code available to the workers.
    A = 100;
    tic
    ticBytes(gcp);
    parfor i = 1:100
        for j = 1:100
            a(i,j) = max(abs(eig(rand(A))));
        end
    end
    tocBytes(gcp)
    toc

             BytesSentToWorkers    BytesReceivedFromWorkers
             __________________    ________________________
    1              32984                   24512
    2              33784                   25312
    3              33784                   25312
    4              34584                   26112
    Total     1.3514e+05              1.0125e+05

    Elapsed time is 14.130674 seconds.
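Because the first timed run includes pool startup, one option is to start the pool before calling tic so that startup time stays out of the measurement. The following is a minimal sketch, assuming the default cluster profile is acceptable. The loop size is reduced from 100 to 20 purely to keep the sketch quick to run.

```matlab
% Start a parallel pool ahead of time so the timed region excludes pool startup.
pool = gcp('nocreate');      % query the current pool without creating one
if isempty(pool)
    pool = parpool;          % start a pool using the default profile
end

A = 100;
a = zeros(20);               % preallocate; 20-by-20 is an assumed, smaller size
tic
parfor i = 1:20
    for j = 1:20
        a(i,j) = max(abs(eig(rand(A))));
    end
end
toc
```

The measured time still includes sending the code to the workers on the first run, so a second run remains the more representative measurement.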
Next convert only the inner loop to a parfor-loop. Measure the time needed and data transferred as in the previous case.
    A = 100;
    tic
    ticBytes(gcp);
    for i = 1:100
        parfor j = 1:100
            a(i,j) = max(abs(eig(rand(A))));
        end
    end
    tocBytes(gcp)
    toc

             BytesSentToWorkers    BytesReceivedFromWorkers
             __________________    ________________________
    1         1.3496e+06               5.487e+05
    2         1.3496e+06              5.4858e+05
    3         1.3677e+06              5.6034e+05
    4         1.3476e+06              5.4717e+05
    Total     5.4144e+06              2.2048e+06

    Elapsed time is 48.631737 seconds.
If you convert the inner loop to a parfor-loop, both the time and amount of data transferred are much greater than in the parallel outer loop. In this case, the elapsed time is almost the same as in the nested for-loop example. The speedup is smaller than running the outer loop in parallel, because you have more data transfer and thus more parallel overhead. Therefore, if you execute the inner loop in parallel, you get no computational benefit compared to running the serial for-loop.
If you want to reduce parallel overhead and speed up your computation, run the outer loop in parallel.
If you convert the inner loop instead, then each iteration of the outer loop initiates a separate parfor-loop. That is, the inner loop conversion creates 100 parfor-loops. Each of the multiple parfor executions incurs overhead.
If you want to reduce parallel overhead, you should run the outer
loop in parallel instead, because overhead only occurs once.
If you want to speed up your code, always run the outer loop in parallel, because you reduce parallel overhead.
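When the outer loop alone has too few iterations to occupy every worker, one alternative that is not part of the examples above is to collapse both loops into a single parfor-loop over a linear index. The following is a minimal sketch under assumed sizes n and m, with a placeholder expression standing in for the real computation.

```matlab
% Collapse two nested loops into one parfor-loop over a linear index.
n = 4;                              % assumed number of rows
m = 5;                              % assumed number of columns
X = zeros(n, m);
parfor k = 1:n*m
    [a, b] = ind2sub([n, m], k);    % recover row and column subscripts from k
    X(k) = a + 10*b;                % placeholder for the real computation
end
```

This gives parfor n*m iterations to distribute instead of n, at the cost of an ind2sub call per iteration. Whether that is faster depends on how expensive each iteration is.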
Nested for-Loops: Requirements and Limitations
If you want to convert a nested for-loop to a parfor-loop, you must ensure that your loop variables are properly classified; see Troubleshoot Variables in parfor-Loops. If your code does not adhere to the guidelines and restrictions labeled as Required, you get an error. MATLAB catches some of these errors at the time it reads the code, and others when it executes the code. These errors are labeled as Required (static) or Required (dynamic), respectively.
|Required (static): You must define the range of a for-loop nested in a parfor-loop by constant numbers or broadcast variables.|
In the following example, the code on the left does not work because you define the upper limit of the for-loop by a function call. The code on the right provides a workaround by first defining a broadcast or constant variable outside the parfor-loop:
Invalid (left):
    A = zeros(100, 200);
    parfor i = 1:size(A, 1)
        for j = 1:size(A, 2)
            A(i, j) = i + j;
        end
    end
Valid (right):
    A = zeros(100, 200);
    n = size(A, 2);
    parfor i = 1:size(A,1)
        for j = 1:n
            A(i, j) = i + j;
        end
    end
|Required (static): The index variable for the nested for-loop must never be explicitly assigned other than by its for statement.|
This restriction is required, because changing the nested for-loop variable in the loop body cannot guarantee that the region indexed by the for-loop variable is available at each worker.
The code on the left is not valid because it tries to modify the value of the nested for-loop variable j in the body of the loop. The code on the right provides a workaround by assigning the nested for-loop variable to a temporary variable t, and then updating t.
Invalid (left):
    A = zeros(10);
    parfor i = 1:10
        for j = 1:10
            A(i, j) = 1;
            j = j+1;
        end
    end
Valid (right):
    A = zeros(10);
    parfor i = 1:10
        for j = 1:10
            A(i, j) = 1;
            t = j;
            t = t + 1;
        end
    end
|Required (static): You cannot index or subscript a nested for-loop variable.|
This restriction is required, because indexing a loop variable cannot guarantee the independence of iterations.
The example on the left is invalid because it attempts to index the nested for-loop variable j. The example on the right removes this indexing.
Invalid (left):
    A = zeros(10);
    parfor i = 1:10
        for j = 1:10
            j(1);
        end
    end
Valid (right):
    A = zeros(10);
    parfor i = 1:10
        for j = 1:10
            j;
        end
    end
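|Required (static): When using the nested for-loop variable for indexing the sliced array, you must use the variable in plain form, not as part of an expression.|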
For example, the following code on the left does not work, but the code on the right does:
Invalid (left):
    A = zeros(4, 11);
    parfor i = 1:4
        for j = 1:10
            A(i, j + 1) = i + j;
        end
    end
Valid (right):
    A = zeros(4, 11);
    parfor i = 1:4
        for j = 2:11
            A(i, j) = i + j - 1;
        end
    end
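|Required (static): If you use a nested for-loop to index into a sliced array, you cannot use that array elsewhere in the parfor-loop.|
In the following example, the code on the left does not work because A is sliced and indexed inside the nested for-loop. The code on the right works because v is assigned to A outside of the nested loop: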
Invalid (left):
    A = zeros(4, 10);
    parfor i = 1:4
        for j = 1:10
            A(i, j) = i + j;
        end
        disp(A(i, j))
    end
Valid (right):
    A = zeros(4, 10);
    parfor i = 1:4
        v = zeros(1, 10);
        for j = 1:10
            v(j) = i + j;
        end
        disp(v(j))
        A(i, :) = v;
    end
|Required (static): A sliced output variable can be used in only one nested for-loop.|
Suppose that you use multiple for-loops (not nested inside each other) inside a parfor-loop, to index into a single sliced array. In this case, the for-loops must loop over the same range of values. In the following example, the code on the left does not work because j and k loop over different values. The code on the right works to index different portions of the sliced array A:
Invalid (left):
    A = zeros(4, 10);
    parfor i = 1:4
        for j = 1:5
            A(i, j) = i + j;
        end
        for k = 6:10
            A(i, k) = pi;
        end
    end
Valid (right):
    A = zeros(4, 10);
    parfor i = 1:4
        for j = 1:10
            if j < 6
                A(i, j) = i + j;
            else
                A(i, j) = pi;
            end
        end
    end
The body of a parfor-loop cannot reference a nested function. However, it can call a nested function by a function handle. Try the following example. Note that A(idx) = nfcn(idx) in the parfor-loop does not work. You must use feval to invoke the fcn handle in the parfor-loop body.
    function A = pfeg
        function out = nfcn(in)
            out = 1 + in;
        end
        fcn = @nfcn;
        parfor idx = 1:10
            A(idx) = feval(fcn, idx);
        end
    end

    >> pfeg
    Starting parallel pool (parpool) using the 'local' profile ...
    connected to 4 workers.

    ans =

         2     3     4     5     6     7     8     9    10    11
If you use function handles that refer to nested functions inside a
parfor-loop, then the values of externally scoped
variables are not synchronized among the workers.
The body of a
parfor-loop cannot contain a
parfor-loop. For more information, see Nested parfor-Loops.
The body of a parfor-loop cannot contain an spmd statement, and an spmd statement cannot contain a parfor-loop. The reason is that workers cannot start or access further parallel pools.
The body of a parfor-loop cannot contain global or persistent variable declarations. The reason is that these variables are not synchronized between workers. You can use global or persistent variables within functions, but their value is visible only to the worker that creates them. Instead of global variables, it is a better practice to use function arguments to share values.
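The recommendation to share values through function arguments can be sketched as follows. The names scale and scaleValue are hypothetical and chosen only for illustration. The value that might otherwise be declared global is defined before the loop and passed to the function explicitly, so parfor treats it as a broadcast variable.

```matlab
% Share a value with the workers through a function argument
% instead of a global variable.
scale = 2.5;                        % hypothetical shared parameter
n = 8;
y = zeros(1, n);
parfor i = 1:n
    y(i) = scaleValue(i, scale);    % scale is sent to each worker as a broadcast variable
end
disp(y)

function out = scaleValue(in, scale)
    % Hypothetical helper: it receives the shared value as an input argument.
    out = scale * in;
end
```

Because the value travels with the call, every worker sees the same number regardless of which worker runs which iteration.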
To learn more about variable requirements, see Troubleshoot Variables in parfor-Loops.
If a script introduces a variable, you cannot call this script from within a parfor-loop or spmd statement. The reason is that this script would cause a transparency violation. For more details, see Ensure Transparency in parfor-Loops or spmd Statements.
You can define an anonymous function inside the body of a parfor-loop. However, sliced output variables inside anonymous functions are not supported. You can work around this by using a temporary variable for the sliced variable, as shown in the following example.
    x = 1:10;
    parfor i=1:10
        temp = x(i);
        anonymousFunction = @() 2*temp;
        x(i) = anonymousFunction() + i;
    end
    disp(x);
For more information on sliced variables, see Sliced Variables.
Using inputname to return the workspace variable name corresponding to an argument number is not supported inside parfor-loops. The reason is that parfor workers do not have access to the workspace of the MATLAB desktop. To work around this, call inputname before parfor, as shown in the following example.
    a = 'a';
    myFunction(a)

    function X = myFunction(a)
        name = inputname(1);
        X = [];
        parfor i=1:2
            X = strcat(X,name);
        end
    end
The syntaxes of load that do not assign to an output structure are not supported inside parfor-loops. Inside parfor, always assign the output of load to a structure.
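The supported form of load can be sketched as follows. The file name and the variable offset are placeholders created only for this illustration, and the sketch assumes that the workers can read the client's temporary folder, which holds for a local pool but not necessarily for a remote cluster.

```matlab
% Create a small MAT-file so that the sketch is self-contained.
dataFile = [tempname '.mat'];       % placeholder file in the temporary folder
offset = 10;                        % placeholder variable to store
save(dataFile, 'offset');

result = zeros(1, 4);
parfor i = 1:4
    s = load(dataFile);             % assign the output of load to a structure
    result(i) = i + s.offset;       % read the loaded value through the structure
end
delete(dataFile);                   % remove the placeholder file
disp(result)
```

Calling load(dataFile) without an output argument inside the loop would try to create variables directly in the loop workspace, which is the unsupported form.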
The following uses are not supported inside parfor-loops:
Using nargin or nargout without a function argument
Using narginchk or nargoutchk to validate the number of input or output arguments in a call to the function that is currently executing
The reason is that workers do not have access to the workspace of the MATLAB desktop. To work around this, call these functions before parfor, as shown in the following example.
    myFunction('a','b')

    function X = myFunction(a,b)
        nin = nargin;
        parfor i=1:2
            X(i) = i*nin;
        end
    end
You can call P-code script files from within a parfor-loop, but P-code scripts cannot contain a parfor-loop. To work around this, use a P-code function instead of a P-code script.