- Parallel `for`-loops (`parfor`) for running task-parallel algorithms on multiple processors
- Support for CUDA-enabled NVIDIA GPUs
- Full use of multicore processors on the desktop via workers that run locally
- Computer cluster and grid support (with MATLAB Distributed Computing Server)
- Interactive and batch execution of parallel applications
- Distributed arrays and the single program multiple data (`spmd`) construct for large dataset handling and data-parallel algorithms

Parallel Computing Toolbox provides several high-level programming constructs that let you convert your applications to take advantage of computers equipped with multicore processors and GPUs. Constructs such as parallel `for`-loops (`parfor`) and special array types for distributed processing and for GPU computing simplify parallel code development by abstracting away the complexity of managing computations and data between your MATLAB session and the computing resource you are using.

You can run the same application on a variety of computing resources without reprogramming it. The parallel constructs function in the same way regardless of the resource on which your application runs, whether that is a multicore desktop (using the toolbox) or a larger resource such as a computer cluster (using the toolbox with MATLAB Distributed Computing Server™).

You can speed up some applications by organizing them into independent *tasks* (units of work) and executing multiple tasks concurrently. This class of task-parallel applications includes simulations for design optimization, BER testing, Monte Carlo simulations, and repetitive analysis on a large number of data files.

The toolbox offers `parfor`, a parallel `for`-loop construct that can automatically distribute independent tasks to multiple MATLAB *workers* (MATLAB computational engines running independently of your desktop MATLAB session). This construct automatically detects the presence of workers and reverts to serial behavior if none are present. You can also set up task execution using other methods, such as manipulating `task` objects in the toolbox.
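As a rough illustration of the task-parallel pattern described above, the sketch below runs a small Monte Carlo estimate of pi with `parfor`. The variable names and the trial and sample counts are illustrative choices, not values from the toolbox documentation.

```matlab
% Minimal sketch: independent Monte Carlo trials distributed with parfor.
% numTrials and samplesPerTrial are arbitrary illustrative values.
numTrials = 8;
samplesPerTrial = 1e5;
estimates = zeros(1, numTrials);

parfor k = 1:numTrials
    % Each iteration depends only on its own random draws, so the
    % iterations can run on any worker in any order.
    xy = rand(samplesPerTrial, 2);
    inside = sum(sum(xy.^2, 2) <= 1);             % points inside the unit quarter circle
    estimates(k) = 4 * inside / samplesPerTrial;  % sliced output variable
end

piEstimate = mean(estimates);
fprintf('Estimated pi: %.4f\n', piEstimate);
```

Because every iteration writes only to its own element of `estimates`, the loop body has no dependence between iterations, which is the property `parfor` requires.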

Parallel Computing Toolbox provides GPUArray, a special array type with several associated functions that lets you perform computations on CUDA-enabled NVIDIA GPUs directly from MATLAB. Functions include `fft`, element-wise operations, and several linear algebra operations such as `lu` and `mldivide`, also known as the backslash operator (\). The toolbox also provides a mechanism that lets you use your existing CUDA-based GPU kernels directly from MATLAB.
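A minimal sketch of the GPU workflow might look like the following. It uses the `gpuArray` function to move data to the device and `gather` to bring a result back. The problem size and the fallback to the CPU when no supported GPU is found are assumptions made for illustration.

```matlab
% Minimal sketch: solve a linear system and take an FFT on the GPU.
% n is an arbitrary illustrative size.
n = 200;
A = rand(n) + n * eye(n);   % diagonally dominant, so the system is well conditioned
b = rand(n, 1);

% Assumption for this sketch: fall back to the CPU if no supported GPU is present.
if gpuDeviceCount > 0
    A = gpuArray(A);        % copy the data to GPU memory
    b = gpuArray(b);
end

x = A \ b;                  % mldivide runs on the GPU when its inputs are GPU arrays
spectrum = fft(b);          % so does fft

% gather copies a result back to host memory (and is a no-op for host arrays).
relResidual = gather(norm(A * x - b) / norm(b));
fprintf('Relative residual: %.2e\n', relResidual);
```

The same lines run unchanged on host arrays, which is the point of the array-type approach: the code selects where it executes by the type of its inputs.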


Parallel Computing Toolbox lets you run MATLAB workers locally on your multicore desktop to execute your parallel applications, allowing you to fully use the computational power of your desktop. Using the toolbox in conjunction with MATLAB Distributed Computing Server, you can run your applications on large-scale computing resources such as computer clusters or grid and cloud computing resources.

This session describes how Cornell University Bioacoustics Research Program data scientists use MATLAB® to develop high-performance computing software to process and analyze terabytes of acoustic data.

With Parallel Computing Toolbox and MATLAB Distributed Computing Server, you can analyze big data sets in parallel using `distributed` arrays, `tall` arrays, or `mapreduce`, on Apache Spark™ and Hadoop® clusters.

**Distributed Arrays**

Distributed arrays in Parallel Computing Toolbox support partitioning large matrices and multidimensional arrays across the combined memory of the nodes in a computer cluster. Using these distributed arrays, you can run big data applications that require simultaneous access to all elements of large matrices that are too large to fit into a single computer’s memory.

More than 400 existing MATLAB functions are enhanced or overloaded to work with sparse and dense distributed arrays. Enhanced functions include linear algebra operations, such as `mldivide` (\), `lu`, and `chol`, and iterative solvers such as `gmres`, `lsqr`, `cgs`, and `pcg`. Using the enhanced functions, you can develop algorithms and interact with these arrays the same way you would with any MATLAB array, and manipulate data available remotely on MATLAB workers* without low-level MPI programming.

You can load data into distributed arrays in parallel from a single file or a collection of files using datastore or directly construct distributed arrays on the MATLAB workers.
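The following sketch constructs distributed arrays directly on the workers and solves a linear system with the overloaded backslash operator. It assumes a parallel pool is open or can be started automatically, and the matrix size is an arbitrary illustrative value, far smaller than a real big data problem.

```matlab
% Minimal sketch: build distributed arrays on the workers and solve A*x = b.
% Assumes a parallel pool is available; n is an illustrative size.
n = 500;
A = rand(n, 'distributed') + n * eye(n, 'distributed');  % partitioned across workers
b = rand(n, 1, 'distributed');

x = A \ b;                       % overloaded mldivide operates on the distributed data

% Bring only the small scalar result back to the client session.
relResidual = gather(norm(A * x - b) / norm(b));
fprintf('Relative residual: %.2e\n', relResidual);
```

The code reads like ordinary MATLAB. The partitioning and the communication between workers are handled by the overloaded functions.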

For fine-grained control over your parallelization scheme, the toolbox provides the single program multiple data (`spmd`) construct and several message-passing routines based on an MPI standard library (MPICH2). The `spmd` construct lets you designate sections of your code to run concurrently across the workers participating in a parallel computation.
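As a small sketch of the `spmd` style, each worker below sums its own strided slice of the integers 1 to 100, and a collective operation combines the partial sums. The example assumes a parallel pool is available. It uses `labindex`, `numlabs`, and `gplus`, which newer MATLAB releases also expose under the names `spmdIndex`, `spmdSize`, and `spmdPlus`.

```matlab
% Minimal sketch: the same block runs on every worker, each on its own slice.
% Assumes a parallel pool is available.
spmd
    % labindex is this worker's index; numlabs is the number of workers.
    localSum = sum(labindex:numlabs:100);  % strided slice of 1:100 for this worker
    total = gplus(localSum);               % global sum, returned on every worker
end

% Outside the spmd block, total is a Composite with one value per worker.
result = total{1};
fprintf('Sum of 1:100 computed across workers: %d\n', result);
```

Every integer from 1 to 100 falls into exactly one worker's slice, so the combined result equals the serial sum regardless of how many workers take part.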

**Parallel Computing Support for Tall Arrays and MapReduce**

Tall arrays, built into MATLAB, are used to work with out-of-memory data backed by a datastore that can have millions or billions of rows. Unlike distributed arrays, tall arrays are loaded into memory one small chunk at a time, and MATLAB handles all of the data chunking and processing in the background. Parallel Computing Toolbox extends tall arrays by running big data applications with tall arrays in parallel, using multiple local workers on your desktop computer. You can scale up tall array calculations further on a cluster, including Spark-enabled Hadoop clusters, using MATLAB Distributed Computing Server.
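A minimal tall-array sketch, assuming the `airlinesmall.csv` sample file that ships with MATLAB is on the path, might look like this. The choice of file and of the `ArrDelay` column is illustrative only. When a parallel pool is open, the evaluation triggered by `gather` can use its workers.

```matlab
% Minimal sketch: compute a mean over a datastore-backed tall array.
% Assumes the airlinesmall.csv sample data set shipped with MATLAB.
ds = datastore('airlinesmall.csv');
ds.TreatAsMissing = 'NA';                 % read 'NA' entries as missing values
ds.SelectedVariableNames = {'ArrDelay'};  % only read the column we need

tt = tall(ds);                            % no data is loaded yet

% Operations on tall arrays are deferred until gather is called.
meanDelayTall = mean(tt.ArrDelay, 'omitnan');
meanDelay = gather(meanDelayTall);        % reads the data chunk by chunk

fprintf('Mean arrival delay: %.2f minutes\n', meanDelay);
```

Deferring evaluation until `gather` lets MATLAB combine several queued operations into as few passes through the data as possible.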

Parallel Computing Toolbox also extends the MapReduce capabilities built into MATLAB so that you can run `mapreduce`-based applications on local workers for improved performance. By pairing the toolbox with MATLAB Distributed Computing Server, you can scale these applications further by running `mapreduce` in parallel on a Hadoop cluster.
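The sketch below shows one way a `mapreduce` job could be run on local workers. It assumes the `airlinesmall.csv` sample file that ships with MATLAB and a release that allows local functions in scripts. The function names `maxDelayMapper` and `maxDelayReducer` and the key `'MaxArrDelay'` are hypothetical names chosen for this example.

```matlab
% Minimal sketch: find the maximum arrival delay with mapreduce on a parallel pool.
% Assumes the airlinesmall.csv sample data set shipped with MATLAB.
ds = datastore('airlinesmall.csv');
ds.TreatAsMissing = 'NA';
ds.SelectedVariableNames = 'ArrDelay';

mapreducer(gcp);    % use the current parallel pool as the execution environment
outds = mapreduce(ds, @maxDelayMapper, @maxDelayReducer);
resultTable = readall(outds);   % table with Key and Value columns

function maxDelayMapper(data, ~, intermKVStore)
    % Called once per chunk of the datastore: emit the chunk's maximum.
    add(intermKVStore, 'MaxArrDelay', max(data.ArrDelay));
end

function maxDelayReducer(~, intermValIter, outKVStore)
    % Called once per key: combine the per-chunk maxima into one value.
    maxVal = -Inf;
    while hasnext(intermValIter)
        maxVal = max(maxVal, getnext(intermValIter));
    end
    add(outKVStore, 'MaxArrDelay', maxVal);
end
```

Calling `mapreducer(0)` instead would run the same job serially in the client session, which can be convenient for debugging the map and reduce functions.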

* A worker is a MATLAB process that runs in the background. You start and control workers from a MATLAB session running Parallel Computing Toolbox.

You can execute parallel applications interactively and in batch using Parallel Computing Toolbox. Using the `parpool` command, you can connect your MATLAB session to a pool of MATLAB workers that can run either locally on your desktop (using the toolbox) or on a computer cluster (using MATLAB Distributed Computing Server) to set up a dedicated interactive parallel execution environment. You can execute parallel applications from the MATLAB prompt on these workers and retrieve results immediately as computations finish, just as you would in any MATLAB session.
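An interactive session along these lines might be sketched as follows. It assumes the default cluster profile, which typically starts local workers, and uses `parfeval` with the built-in `magic` function as an arbitrary stand-in for real work.

```matlab
% Minimal sketch: open a pool, run a function on a worker, fetch the result.
% Assumes the default cluster profile (typically local workers).
pool = gcp('nocreate');           % reuse an existing pool if there is one
if isempty(pool)
    pool = parpool;               % otherwise start a pool with the default profile
end
fprintf('Connected to %d workers\n', pool.NumWorkers);

% Request one output from magic(4), evaluated asynchronously on a worker.
f = parfeval(pool, @magic, 1, 4);
M = fetchOutputs(f);              % blocks until the result is ready
disp(M);

delete(pool);                     % shut down the workers when finished
```

Checking `gcp('nocreate')` first avoids the error that `parpool` raises when a pool is already open.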

Running applications interactively is suitable when execution time is relatively short. When your applications need to run for a long time, you can use the toolbox to set them up to run as batch jobs. This enables you to free your MATLAB session for other activities while you execute large MATLAB and Simulink applications.

While your application executes in batch, you can shut down your MATLAB session and retrieve results later. The toolbox provides several mechanisms to manage offline execution of parallel programs, such as the `batch` function and `job` and `task` objects. Both the `batch` function and the `job` and `task` objects can be used to offload the execution of serial MATLAB and Simulink applications from a desktop MATLAB session.
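As a minimal sketch of batch execution, the code below submits a trivial function call as a job and collects its output. It assumes the default cluster profile and uses `sum` over `1:10` as a placeholder for a long-running computation.

```matlab
% Minimal sketch: offload a function call as a batch job and fetch its output.
% Assumes the default cluster profile; sum(1:10) stands in for real work.
c = parcluster;                     % cluster object for the default profile
job = batch(c, @sum, 1, {1:10});    % request 1 output from sum(1:10)

% The client session is free here. In this sketch we simply wait.
wait(job);

outputs = fetchOutputs(job);        % cell array, one cell per requested output
total = outputs{1};
fprintf('Batch result: %d\n', total);

delete(job);                        % remove the job and its data from the cluster
```

In a real workflow the `wait` call would usually be replaced by doing other work, or by closing MATLAB and looking the job up again later through the cluster object.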