
You can use any of the Statistics Toolbox™ functions with Parallel Computing Toolbox constructs
such as `parfor` and `spmd`. However, some functions, such
as those with interactive displays, can lose functionality in parallel.
In particular, displays and interactive usage are not effective on
workers (see Vocabulary for Parallel Computation).
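As a rough illustration of calling a toolbox function inside a Parallel Computing Toolbox construct, the following sketch draws several independent samples with `normrnd` inside a `parfor` loop. The variable names and the sample sizes are assumptions chosen for this example only, and the loop body deliberately contains no plotting or interactive calls.

```matlab
% Sketch: call a Statistics Toolbox function inside parfor.
nRep = 8;                      % number of independent replicates (arbitrary choice)
sampleMeans = zeros(nRep,1);   % sliced output variable, one entry per iteration

parfor i = 1:nRep
    sample = normrnd(0,1,1000,1);    % 1000 draws from a standard normal
    sampleMeans(i) = mean(sample);   % each iteration writes only its own entry
end

% Display results on the client, after the loop, rather than on the workers.
disp(sampleMeans')
```

If no parallel pool is available, `parfor` runs the iterations on the client instead, so the same code still produces a result.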

Additionally, some functions, including `crossval`, `bootstrp`, and
`TreeBagger`, are enhanced to use `parfor` internally
to parallelize calculations.

This chapter gives the simplest way to use these enhanced functions
in parallel. For more advanced topics, including the issues of reproducibility
and nested `parfor` loops, see the other sections
in this chapter.

For information on parallel statistical computing at the command line, enter

help parallelstats

To have a function compute in parallel:

To run a statistical computation in parallel, first set up a parallel environment.

For a multicore machine, enter the following at the MATLAB® command
line:

mypool = parpool(n)

*n* is the number of workers you want to use.

Create an options structure with the `statset` function.
To run in parallel, set the `UseParallel` option
to `true`:

paroptions = statset('UseParallel',true);

Call your function with syntax that uses the options structure. For example:

% Run crossval in parallel
cvMse = crossval('mse',x,y,'predfun',regf,'Options',paroptions);
% Run bootstrp in parallel
sts = bootstrp(100,@(x)[mean(x) std(x)],y,'Options',paroptions);
% Run TreeBagger in parallel
b = TreeBagger(50,meas,spec,'OOBPred','on','Options',paroptions);

For more complete examples of parallel statistical functions, see Parallel Treebagger and Examples of Parallel Statistical Functions.

After you have finished computing in parallel, close the parallel environment by deleting the pool object. For example, if `mypool` holds the object returned by `parpool`:

delete(mypool)
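The steps above can be put together in one short script. This is a minimal sketch, not the toolbox's reference example: it assumes the `fisheriris` sample data set, a pool size of two workers, and a hypothetical prediction function `regf` built on `regress`. It also assumes no pool is already open, because `parpool` reports an error if one is.

```matlab
% Assumed data: Fisher iris measurements shipped with the toolbox.
load fisheriris                        % provides meas (150x4) and species
y = meas(:,1);                         % response: first measurement column
X = [ones(size(y)) meas(:,2:4)];       % remaining columns plus an intercept

% Hypothetical prediction function for crossval:
% fit by least squares on the training fold, predict on the test fold.
regf = @(Xtrain,ytrain,Xtest) Xtest*regress(ytrain,Xtrain);

mypool = parpool(2);                        % 1. open a pool (two workers assumed)
paroptions = statset('UseParallel',true);   % 2. request parallel execution

% 3. Pass the options structure to the enhanced functions.
cvMse = crossval('mse',X,y,'predfun',regf,'Options',paroptions);
sts = bootstrp(100,@(v)[mean(v) std(v)],y,'Options',paroptions);

delete(mypool)                              % 4. close the pool when finished
```

Here `cvMse` is a scalar cross-validated mean squared error, and `sts` has one row per bootstrap replicate with the mean and standard deviation in its two columns. The exact values vary from run to run because the partitions and resamples are random.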

To run the example Regression of Insurance Risk Rating for Car Imports Using TreeBagger in parallel:

Set up the parallel environment to use two cores:

mypool = parpool(2)
Starting parpool using the 'local' profile ... connected to 2 workers.
mypool =
  Pool with properties:
    AttachedFiles: {0x1 cell}
       NumWorkers: 2
          Cluster: [1x1 parallel.cluster.Local]
      SpmdEnabled: 1

Set the options to use parallel processing:

paroptions = statset('UseParallel',true);

Load the problem data and separate it into input and response:

load imports-85;
Y = X(:,1);
X = X(:,2:end);

Estimate feature importance using leaf size `1` and `1000` trees in parallel. Time the function for comparison purposes:

tic
b = TreeBagger(1000,X,Y,'Method','r','OOBVarImp','on',...
    'cat',16:25,'MinLeaf',1,'Options',paroptions);
toc
Elapsed time is 16.696336 seconds.

Perform the same computation in serial for timing comparison:

tic
b = TreeBagger(1000,X,Y,'Method','r','OOBVarImp','on',...
    'cat',16:25,'MinLeaf',1); % No options gives serial
toc
Elapsed time is 21.747950 seconds.

Computing in parallel on two workers took about 77% of the time of computing serially (16.7 seconds versus 21.7 seconds), a speedup of roughly 1.3.
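The comparison can be checked directly from the two elapsed times reported above. The following sketch only does the arithmetic; the variable names are chosen for this illustration, and the efficiency figure assumes that both workers were busy for the whole run.

```matlab
tParallel = 16.696336;   % seconds, parallel run reported above
tSerial   = 21.747950;   % seconds, serial run reported above
nWorkers  = 2;           % pool size used in the example

timeRatio  = tParallel/tSerial;   % fraction of the serial time
speedup    = tSerial/tParallel;   % how many times faster the parallel run was
efficiency = speedup/nWorkers;    % speedup per worker

fprintf('time ratio %.2f, speedup %.2f, efficiency %.2f\n', ...
    timeRatio, speedup, efficiency);
% prints: time ratio 0.77, speedup 1.30, efficiency 0.65
```

Timings like these depend on the machine and on pool startup and communication overhead, so the ratio should be read as one observation rather than a general figure.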
