| Parallel Computing Toolbox™ | ![]() |
| On this page… |
|---|
How MATLAB® Software Distributes Arrays Obtaining Information About the Array Changing the Dimension of Distribution |
When you distribute an array to a number of labs, MATLAB® software partitions the array into segments and assigns one segment of the array to each lab. You can partition a two-dimensional array horizontally, assigning columns of the original array to the different labs, or vertically, by assigning rows. An array with N dimensions can be partitioned along any of its N dimensions. You choose which dimension of the array is to be partitioned by specifying it in the array constructor command.
For example, to distribute an 80-by-1000 array to four labs, you can partition it either by columns, giving each lab an 80-by-250 segment, or by rows, with each lab getting a 20-by-1000 segment. If the array dimension does not divide evenly over the number of labs, MATLAB partitions it as evenly as possible.
The following example creates an 80-by-1000 replicated array and assigns it to variable A. In doing so, each lab creates an identical array in its own workspace and assigns it to variable A, where A is local to that lab. The second command distributes A, creating a single 80-by-1000 array D that spans all four labs. lab 1 stores columns 1 through 250, lab 2 stores columns 251 through 500, and so on. The default distribution is by the last nonsingleton dimension, thus, columns in this case of a 2-dimesional array.
A = zeros(80, 1000); D = distributed(A, 'convert') 1: localPart(D) is 80-by-250 2: localPart(D) is 80-by-250 3: localPart(D) is 80-by-250 4: localPart(D) is 80-by-250
Each lab has access to all segments of the array. Access to the local segment is faster than to a remote segment, because the latter requires sending and receiving data between labs and thus takes more time.
The MATLAB Parallel Command Window displays the local segments of a distributed array in tabbed or tiled windows for each lab, or combined into one display as shown below. Each lab displays that part of the array that is stored in its workspace. This part of the array is said to be local to that lab. The lab index appears at the left.
1: localPart(D) = 1: 11 12 1: 21 22 1: 31 32 1: 41 42 2: localPart(D) = 2: 13 14 2: 23 24 2: 33 34 2: 43 44
When displaying larger distributed arrays, MATLAB prints out only the sizes of the local segments.
1: localPart(D) is 4-by-250 2: localPart(D) is 4-by-250 3: localPart(D) is 4-by-250 4: localPart(D) is 4-by-250
Note When displayed, a distributed array can look the same as a smaller variant array. For example, on a configuration with four labs, a 4-by-20 distributed array might appear to be the same size as a 4-by-5 variant array because both are displayed as 4-by-5 in each lab window. You can tell the difference either by finding the size of the array or by using the isdistributed function. |
In distributing an array of N rows, if N is evenly divisible by the number of labs, MATLAB stores the same number of rows (N/numlabs) on each lab. When this number is not evenly divisible by the number of labs, MATLAB partitions the array as evenly as possible.
MATLAB provides a functions called distributionDimension and distributionPartition that you can use to determine the exact distribution of an array. See dcolon Indexing Function for more information on dcolon.
You can distribute arrays of any MATLAB built-in data type, and also numeric arrays that are complex or sparse, but not arrays of function handles or object types.
You can create a distributed array in any of the following ways:
Partitioning a Larger Array — Start with a large array that is replicated on all labs, and partition it so that the pieces are distributed across the labs. This is most useful when you have sufficient memory to store the initial replicated array.
Building from Smaller Arrays — Start with smaller variant or replicated arrays stored on each lab, and combine them so that each array becomes a segment of a larger distributed array. This method saves on memory as it lets you build a distributed array from smaller pieces.
Using MATLAB® Constructor Functions — Use any of the MATLAB constructor functions like rand or zeros with the distributor() argument. These functions offer a quick means of constructing a distributed array of any size in just one step.
If you have a large array already in memory that you want MATLAB to process more quickly, you can partition it into smaller segments and distribute these segments to all of the labs using the distributed function. Each lab then has an array that is a fraction the size of the original, thus reducing the time required to access the data that is local to each lab.
As a simple example, the following line of code creates a 4-by-8 replicated matrix on each lab assigned to the variable A:
P>> A = [11:18; 21:28; 31:38; 41:48]
A =
11 12 13 14 15 16 17 18
21 22 23 24 25 26 27 28
31 32 33 34 35 36 37 38
41 42 43 44 45 46 47 48The next line uses the distributed function to construct a single 4-by-8 matrix D that is distributed along the second dimension of the array:
P>> D = distributed(A, 'convert')
1: localPart(D)| 2: localPart(D)| 3: localPart(D)| 4: localPart(D)
11 12 | 13 14 | 15 16 | 17 18
21 22 | 23 24 | 25 26 | 27 28
31 32 | 33 34 | 35 36 | 37 38
41 42 | 43 44 | 45 46 | 47 48Note that arrays A and D are the same size (4-by-8). Array A exists in its full size on each lab, while only a segment of array D exists on each lab.
P>> whos Name Size Bytes Class A 4x8 256 double D 4x8 460 distributed
See the distributed function reference page for syntax and usage information.
The distributed function is less useful for reducing the amount of memory required to store data when you first construct the full array in one workspace and then partition it into distributed segments. To save on memory, you can construct the smaller pieces (local part) on each lab first, and then combine them into a single array that is distributed across the labs.
This example creates a 4-by-250 variant array A on each of four labs and then uses distributor to distribute these segments across four labs, creating a 4-by-1000 distributed array. Here is the variant array, A:
P>> A = [1:250; 251:500; 501:750; 751:1000] + 250 * (labindex - 1);
LAB 1 | LAB 2 LAB 3
1 2 ... 250 | 251 252 ... 500 | 501 502 ... 750 | etc.
251 252 ... 500 | 501 502 ... 750 | 751 752 ...1000 | etc.
501 502 ... 750 | 751 752 ...1000 | 1001 1002 ...1250 | etc.
751 752 ...1000 | 1001 1002 ...1250 | 1251 1252 ...1500 | etc.
| | |Now combine these segments into an array that is distributed across the first (or vertical) dimension. The array is now 16-by-250, with a 4-by-250 segment residing on each lab:
P>> D = distributed(A, distributor('1d',1)
1: localPart(D) is 4-by-250
2: localPart(D) is 4-by-250
3: localPart(D) is 4-by-250
4: localPart(D) is 4-by-250
P>> whos
Name Size Bytes Class
A 4x250 8000 double
D 16x250 8396 distributedYou could also use replicated arrays in the same fashion, if you wanted to create a distributed array whose segments were all identical to start with. See the distributed function reference page for syntax and usage information.
MATLAB provides several array constructor functions that you can use to build distributed arrays of specific values, sizes, and classes. These functions operate in the same way as their nondistributed counterparts in the MATLAB language, except that they distribute the resultant array across the labs using the specified distributor, dist.
Constructor Functions. The distributed constructor functions are listed here. Use the dist argument (created by the distributor function: dist=distributor()) to specify over which dimension to distribute the array. See the individual reference pages for these functions for further syntax and usage information.
cell(m, n, ..., dist) eye(m, ..., classname, dist) false(m, n, ..., dist) Inf(m, n, ..., classname, dist) NaN(m, n, ..., classname, dist) ones(m, n, ..., classname, dist) rand(m, n, ..., dist) randn(m, n, ..., dist) sparse(m, n, dist) speye(m, ..., dist) sprand(m, n, density, dist) sprandn(m, n, density, dist) true(m, n, ..., dist) zeros(m, n, ..., classname, dist)
That part of a distributed array that resides on each lab is a piece of a larger array. Each lab can work on its own segment of the common array, or it can make a copy of that segment in a variant or private array of its own. This local copy of a distributed array segment is called a local array.
The localPart function copies the segments of a distributed array to a separate variant array. This example makes a local copy L of each segment of distributed array D. The size of L shows that it contains only the local part of D for each lab. Suppose you distribute an array across four labs:
P>> A = [1:80; 81:160; 161:240];
P>> D = distributed(A, 'convert');
P>> size(D)
ans =
3 80
P>> L = localPart(D);
P>> size(L)
ans =
3 20Each lab recognizes that the distributed array D is 3-by-80. However, notice that the size of the local part, L, is 3-by-20 on each lab, because the 80 columns of D are distributed over four labs.
Use the distributed function to perform the reverse operation. This function, described in Building from Smaller Arrays, combines the local variant arrays into a single array distributed along the specified dimension.
Continuing the previous example, take the local variant arrays L and put them together as segments of a new distributed array X.
P>> X = distributed(L);
P>> size(X)
ans =
3 80MATLAB offers several functions that provide information on any particular array. In addition to these standard functions, there are also two functions that are useful solely with distributed arrays.
The isdistributed function returns a logical 1 (true) if the input array is distributed, and logical 0 (false) otherwise. The syntax is
P>> TF = isdistributed(D)
where D is any MATLAB array.
The distributionDimension function returns a number that represents the dimension of distribution of a distributed array, and the distributionPartition function returns a vector that describes how the array is partitioned along its dimension of distribution.
The syntax is
P>> distributionDimension(distributor(D)) P>> distributionPartition(distributor(D))
where D is any distributed array. For a 250-by-10 matrix distributed across four labs by columns,
P>> D = ones(250, 10, distributor()) 1: localPart(D) is 250-by-3 2: localPart(D) is 250-by-3 3: localPart(D) is 250-by-2 4: localPart(D) is 250-by-2 P>> dim = distributionDimension(distributor(D)) 1: dim = 2 P>> part = distributionPartition(distributor(D)) 1: part = [3 3 2 2]
The distributionDimension(distributor(D)) value of 2 means the array is distributed by columns (dimension 2); and the distributionPartition(distributor(D)) value of [3 3 2 2] means that the first three columns reside in the lab 1, the next three columns in lab 2, the next two columns in lab 3, and the final two columns in lab 4.
Other functions that provide information about standard arrays also work on distributed arrays and use the same syntax.
ndims — Returns the number of dimensions.
size — Returns the size of each dimension.
length — Returns the length of a specific dimension.
isa — Returns information about a number of array characteristics.
is* — All functions that have names beginning with 'is', such as ischar and issparse.
numel Not Supported on Distributed Arrays. For a distributed array, the numel function does not return the number of elements, but instead always returns a value of 1.
When constructing an array, you distribute the parts of the array along one of the array's dimensions. You can change the direction of this distribution on an existing array using the distributed function.
Construct an 8-by-16 distributed array D of random values having distributed columns:
P>> D = rand(8, 16, distributor());
P>> size(localPart(D))
ans =
8 4Create a new array from this one that has distributed rows:
P>> X = redistribute(D, distributor('1d', 1));
P>> size(localPart(X))
ans =
2 16You can restore a distributed array to its undistributed form using the gather function. gather takes the segments of an array that reside on different labs and combines them into a replicated array on all labs, or into a single array on one lab.
Distribute a 4-by-10 array to four labs along the second dimension:
P>> A = [11:20; 21:30; 31:40; 41:50]
A =
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
P>> D = distributed(A, 'convert')
Lab 1 | Lab 2 | Lab 3 | Lab 4
11 12 13 | 14 15 16 | 17 18 | 19 20
21 22 23 | 24 25 26 | 27 28 | 29 30
31 32 33 | 34 35 36 | 37 38 | 39 40
41 42 43 | 44 45 46 | 47 48 | 49 50
| | |
P>> size(localPart(D))
1: ans =
1: 4 3
2: ans =
2: 4 3
3: ans =
3: 4 2
4: ans =
4: 4 2Restore the undistributed segments to the full array form by gathering the segments:
P>> X = gather(D)
X =
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
P>> size(X)
ans =
4 10While indexing into a nondistributed array is fairly straightforward, distributed arrays require additional considerations. Each dimension of a nondistributed array is indexed within a range of 1 to the final subscript, which is represented in MATLAB by the end keyword. The length of any dimension can be easily determined using either the size or length function.
With distributed arrays, these values are not so easily obtained. For example, the second segment of an array (that which resides in the workspace of lab 2) has a starting index that depends on the array distribution. For a 200-by-1000 array with a default distribution by columns over four labs, the starting index on lab 2 is 251. For a 1000-by-200 array also distributed by columns, that same index would be 51. As for the ending index, this is not given by using the end keyword, as end in this case refers to the end of the entire array; that is, the last subscript of the final segment. The length of each segment is also not given by using the length or size functions, as they only return the length of the entire array.
Note When using arrays to index into distributed arrays, you can use only replicated or distributed arrays for indexing. The toolbox does not check to ensure that the index is replicated, as that would require global communications. Therefore, the use of unsupported variants (such as labindex) to index into distributed arrays might create unexpected results. |
The MATLAB colon operator and end keyword are two of the basic tools for indexing into nondistributed arrays. For distributed arrays, MATLAB provides a distributed version of the colon operator, called dcolon. This actually is a function, not a symbolic operator like colon.
The dcolon function returns a distributed vector of length L that maps the subscripts of an equivalent array residing on the same lab configuration. An equivalent array is an array for which the distributed dimension is also of length L. For example, the subscripts of a 50-element dcolon vector are as follows:
[1:13] for Lab 1 [14:26] for Lab 2 [27:38] for Lab 3 [39:50] for Lab 4
This vector shows how MATLAB would distribute 50 rows, columns, or any dimension of an array in a configuration having the same number of labs (four in this case). A 50-row, 10-column array, for example, with the rows distributed over four labs
P>> D = rand(50, 10, distributor('1d',1));will have rows 1 through 13 stored on lab 1, rows 14 through 26 on lab 2, rows 27 through 38 on lab 3, and rows 39 through 50 on lab 4.
The syntax for dcolon is as follows. The step input argument is optional:
P>> V = dcolon(first, step, last)
Inputs to dcolon are shown below. Each input must be a real scalar integer value.
| Input Argument | Description |
|---|---|
| first | Number of the first subscript in this dimension. |
| step | Size of the interval between numbers in the generated sequence. Optional; the default is 1. |
| last | Number of the last subscript in this dimension. |
To use dcolon to index into the 50-by-10 distributed array in the previous example, first generate the vector V that shows how the 50-row dimension is partitioned. Then you can use the elements of this vector to derive the range of rows that apply to particular segments of the array. For example:
P>> V = dcolon(1, length(D))
As an alternative to distributing by a single dimension of rows or columns, you can distribute a matrix by blocks using '2d' or two-dimensional distribution. Instead of segments that comprise a number of complete rows or columns of the matrix, the segments of the distributed array are 2-dimensional square blocks.
For example, consider a simple 8-by-8 matrix with ascending element values. You can create this array in a parallel job or in pmode:
P>> A = reshape(1:64, 8, 8)
The result is the replicated array:
1 9 17 25 33 41 49 57
2 10 18 26 34 42 50 58
3 11 19 27 35 43 51 59
4 12 20 28 36 44 52 60
5 13 21 29 37 45 53 61
6 14 22 30 38 46 54 62
7 15 23 31 39 47 55 63
8 16 24 32 40 48 56 64Suppose you want to distribute this array among four labs, with a 4-by-4 block as the local part on each lab. In this case, the lab grid is a 2-by-2 arrangement of the labs, and the block size is a square of four elements on a side (i.e., each block is a 4-by-4 square). With this information, you can define the distributor object:
P>> DIST = distributor('2d', [2 2], 4)Now you can use this distributor object to distribute the original matrix:
P>> AA = distributed(A, DIST, 'convert')
This distributes the array among the labs according to this scheme:

If the lab grid does not perfectly overlay the dimensions of the distributed array, you can still use '2d' distribution, which is block cyclic. In this case, you can imagine the lab grid being repeatedly overlaid in both dimensions until all the original matrix elements are included.
Using the same original 8-by-8 matrix and 2-by-2 lab grid, consider a block size of 3 instead of 4, so that 3-by-3 square blocks are distributed among the labs. The code looks like this:
P>> DIST = distributor('2d', [2 2], 3)
P>> AA = distributed(A, DIST, 'convert')The first "row" of the lab grid is distributed to lab 1 and lab 2, but that contains only six of the eight columns of the original matrix. Therefore, the next two columns are distributed to lab 1. This process continues until all columns in the first rows are distributed. Then a similar process applies to the rows as you proceed down the matrix, as shown in the following distribution scheme:

The diagram above shows a scheme that requires four overlays of the lab grid to accommodate the entire original matrix. The following pmode session shows the code and resulting distribution of data to each of the labs:

The following points are worth noting:
'2d' distribution might not offer any performance enhancement unless the block size is at least a few dozen. The default block size is 64.
The lab grid should be as close to a square as possible.
Not all functions that are enhanced to work on '1d' distributed arrays work on '2d' distributed arrays.
![]() | Array Types | Using a for-Loop Over a Distributed Range (for-drange) | ![]() |
| © 1984-2008- The MathWorks, Inc. - Site Help - Patents - Trademarks - Privacy Policy - Preventing Piracy - RSS |