## Key Features

• Parallel for-loops (parfor) for running task-parallel algorithms on multiple processors
• Support for CUDA-enabled NVIDIA GPUs
• Full use of multicore processors on the desktop via workers that run locally
• Computer cluster and grid support (with MATLAB Distributed Computing Server)
• Interactive and batch execution of parallel applications
• Distributed arrays and single program multiple data (spmd) construct for large dataset handling and data-parallel algorithms

Parallel computing with MATLAB. You can use Parallel Computing Toolbox to run applications on a multicore desktop with local workers available in the toolbox, take advantage of GPUs, and scale up to a cluster (with MATLAB Distributed Computing Server).

## Programming Parallel Applications

Parallel Computing Toolbox provides several high-level programming constructs that let you convert your applications to take advantage of computers equipped with multicore processors and GPUs. Constructs such as parallel for-loops (parfor) and special array types for distributed processing and for GPU computing simplify parallel code development by abstracting away the complexity of managing computations and data between your MATLAB session and the computing resource you are using.

You can run the same application on a variety of computing resources without reprogramming it. The parallel constructs function in the same way regardless of the resource on which your application runs, whether that is a multicore desktop (using the toolbox) or a larger resource such as a computer cluster (using the toolbox with MATLAB Distributed Computing Server).

## Using Built-In Parallel Algorithms in Other MathWorks Products

Key functions in several MathWorks products have built-in parallel algorithms. When Parallel Computing Toolbox is present, these functions can distribute computations across available parallel computing resources, allowing you to speed up not just your MATLAB- and Simulink-based analysis or simulation tasks but also code generation for large Simulink models. You do not have to write any parallel code to take advantage of these functions.

Using built-in parallel algorithms in MathWorks products. Built-in parallel algorithms can speed up MATLAB and Simulink computations as well as code generation from Simulink models.

You can speed up some applications by organizing them into independent tasks (units of work) and executing multiple tasks concurrently. This class of task-parallel applications includes simulations for design optimization, bit error rate (BER) testing, Monte Carlo simulations, and repetitive analysis on a large number of data files.

The toolbox offers parfor, a parallel for-loop construct that can automatically distribute independent tasks to multiple MATLAB workers (MATLAB computational engines running independently of your desktop MATLAB session). This construct automatically detects the presence of workers and reverts to serial behavior if none are present. You can also set up task execution using other methods, such as manipulating task objects in the toolbox.

Using parallel for-loops for a task-parallel application. You can use parallel for-loops in MATLAB scripts and functions and execute them both interactively and offline.
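As a rough illustration of the task-parallel pattern described above, the following sketch runs a Monte Carlo estimate of pi with parfor. The trial counts and variable names are illustrative assumptions and do not come from the toolbox documentation. Each iteration is independent of the others, which is what lets parfor hand iterations to workers in any order.

```matlab
% Monte Carlo estimate of pi using parfor (illustrative sketch).
numTrials = 200;          % number of independent tasks (assumed value)
pointsPerTrial = 1e5;     % random points drawn per task (assumed value)
hits = zeros(1, numTrials);   % sliced output: one entry per iteration

parfor k = 1:numTrials
    xy = rand(pointsPerTrial, 2);          % uniform points in the unit square
    hits(k) = sum(sum(xy.^2, 2) <= 1);     % count points inside the quarter circle
end

% Fraction of points inside the quarter circle approximates pi/4.
piEstimate = 4 * sum(hits) / (numTrials * pointsPerTrial);
fprintf('Estimated pi: %.4f\n', piEstimate);
```

If no workers are available, the same loop runs serially, as the text notes, so the script does not need a separate serial code path.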

## Speeding Up MATLAB Computations with GPUs

Parallel Computing Toolbox provides GPUArray, a special array type with several associated functions that lets you perform computations on CUDA-enabled NVIDIA GPUs directly from MATLAB. Functions include fft, element-wise operations, and several linear algebra operations such as lu and mldivide, also known as the backslash operator (\). The toolbox also provides a mechanism that lets you use your existing CUDA-based GPU kernels directly from MATLAB.

GPU computing with MATLAB. Using GPUArrays and GPU-enabled MATLAB functions helps speed up MATLAB operations without low-level CUDA programming.
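A minimal sketch of the GPU workflow described above, assuming a supported GPU may or may not be present: data is moved to the device with the gpuArray function (which creates the array type the text calls GPUArray), the backslash operator and fft run on the device, and gather copies results back. The matrix size and the fallback branch are illustrative choices, not part of the product description.

```matlab
% Solve a linear system and compute an FFT on the GPU when one is available.
n = 500;                          % problem size (assumed value)
A = rand(n) + n * eye(n);         % diagonally dominant, so well conditioned
b = rand(n, 1);

if gpuDeviceCount > 0
    Ag = gpuArray(A);             % copy data to GPU memory
    bg = gpuArray(b);
    xg = Ag \ bg;                 % mldivide runs on the GPU
    Fg = fft(bg);                 % fft runs on the GPU
    x = gather(xg);               % copy results back to the client
    F = gather(Fg);
else
    % Fallback so the sketch also runs on machines without a supported GPU.
    x = A \ b;
    F = fft(b);
end

relResidual = norm(A * x - b) / norm(b);
```

The calls on the GPU branch are the same operators used on the CPU branch; only the array type differs.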

## Scaling Up to Clusters, Grids, and Clouds Using MATLAB Distributed Computing Server

Parallel Computing Toolbox lets you run MATLAB workers locally on your multicore desktop to execute your parallel applications, allowing you to fully use the computational power of your desktop. Using the toolbox in conjunction with MATLAB Distributed Computing Server, you can run your applications on large-scale computing resources such as computer clusters or grid and cloud computing resources.

Running a gene regulation model on a cluster using MATLAB Distributed Computing Server. The server enables applications developed using Parallel Computing Toolbox to harness computer clusters for large problems.

## Big Data Applications Using Parallel Computing Toolbox and MATLAB Distributed Computing Server

With Parallel Computing Toolbox and MATLAB Distributed Computing Server, you can analyze big data sets in parallel using distributed arrays, tall arrays, or mapreduce, on Apache Spark and Hadoop® clusters.

### Distributed Arrays

Distributed arrays in Parallel Computing Toolbox support partitioning large matrices and multidimensional arrays across the combined memory of the nodes in a computer cluster. Using these distributed arrays, you can run big data applications that require simultaneous access to all elements of matrices that are too large to fit into a single computer’s memory.

More than 400 existing MATLAB functions are enhanced or overloaded to work with sparse and dense distributed arrays. Enhanced functions include linear algebra operations, such as mldivide (\), lu, and chol, and iterative solvers such as gmres, lsqr, cgs, and pcg. Using the enhanced functions, you can develop algorithms and interact with these arrays the same way you would with any MATLAB array, and manipulate data available remotely on MATLAB workers* without low-level MPI programming.

You can load data into distributed arrays in parallel from a single file or a collection of files using datastore or directly construct distributed arrays on the MATLAB workers.
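The sketch below shows one way the distributed array workflow can look, assuming a parallel pool can be started with the default profile. The matrix is built on the client and then distributed only to keep the example small; the size and variable names are illustrative assumptions.

```matlab
% Solve a linear system with distributed arrays (illustrative sketch).
pool = gcp;                       % get the current pool, starting one if needed

n = 1000;                         % problem size (assumed value)
Aclient = rand(n) + n * eye(n);   % well-conditioned test matrix on the client
A = distributed(Aclient);         % partition the matrix across the workers

b = sum(A, 2);                    % distributed right-hand side; exact solution is all ones
x = A \ b;                        % overloaded mldivide operates on distributed data

xClient = gather(x);              % bring the solution back to the client
maxError = max(abs(xClient - 1));
```

The solve is written exactly as it would be for an ordinary matrix; the distribution of data across workers is handled by the overloaded functions.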

For fine-grained control over your parallelization scheme, the toolbox provides the single program multiple data (spmd) construct and several message-passing routines based on an MPI standard library (MPICH2). The spmd construct lets you designate sections of your code to run concurrently across workers participating in a parallel computation.
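As a small sketch of the spmd construct, the example below has each worker sum its own share of the integers 1 to 1000 and then combines the partial sums with a global reduction. It uses labindex, numlabs, and gplus; newer MATLAB releases provide these under the names spmdIndex, spmdSize, and spmdPlus, so the exact names to use depend on your release. The work split is an illustrative assumption.

```matlab
% Sum the integers 1..1000 cooperatively across workers (illustrative sketch).
pool = gcp;                                   % ensure a parallel pool is running

spmd
    % Each worker takes every numlabs-th integer, starting at its own index.
    localValues = labindex : numlabs : 1000;
    partialSum = sum(localValues);

    % Global reduction: every worker receives the sum of all partial sums.
    total = gplus(partialSum);
end

% Variables assigned inside spmd appear on the client as Composite objects;
% indexing with {1} retrieves the value held by worker 1.
result = total{1};
```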

### Parallel Computing Support for Tall Arrays and MapReduce

Tall arrays, built into MATLAB, let you work with out-of-memory data that is backed by a datastore and can have millions or billions of rows. Unlike distributed arrays, tall arrays are loaded into memory one small chunk at a time, and MATLAB handles the data chunking and processing in the background. Parallel Computing Toolbox extends tall arrays by letting you run tall array calculations in parallel on multiple local workers on your desktop computer. You can scale up these calculations further on a cluster, including Spark-enabled Hadoop clusters, using MATLAB Distributed Computing Server.
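The following sketch illustrates the tall array workflow on a tiny file so that it stays self-contained. The temporary CSV file, the column name Value, and the row count are illustrative assumptions; in practice the datastore would point at data too large for memory.

```matlab
% Compute a mean over a datastore-backed tall array (illustrative sketch).
dataFile = [tempname '.csv'];                       % temporary file for the example
T = table((1:1000)', 'VariableNames', {'Value'});
writetable(T, dataFile);

ds = datastore(dataFile);        % tabular text datastore over the CSV file
tt = tall(ds);                   % tall table backed by the datastore

% Operations on tall arrays are deferred until gather is called.
% With Parallel Computing Toolbox, gather can use workers in the parallel pool.
meanValue = gather(mean(tt.Value));

delete(dataFile);                % clean up the temporary file
```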

Parallel Computing Toolbox also extends the MapReduce capabilities built into MATLAB so that you can run mapreduce-based applications on local workers for improved performance. By pairing the toolbox with MATLAB Distributed Computing Server, you can scale these applications further by running mapreduce in parallel on a Hadoop cluster.
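A minimal sketch of running mapreduce on a parallel pool is shown below, written as a function file so that the mapper and reducer can be local functions. The file name sumWithMapReduce.m, the helper names sumMapper and sumReducer, and the temporary CSV data are all hypothetical and exist only for this illustration.

```matlab
function total = sumWithMapReduce()
% Sum a column of a CSV file with mapreduce on a parallel pool (illustrative sketch).
    dataFile = [tempname '.csv'];                   % temporary example data
    writetable(table((1:1000)', 'VariableNames', {'Value'}), dataFile);
    ds = datastore(dataFile);

    mr = mapreducer(gcp);                           % run on the current parallel pool
    result = mapreduce(ds, @sumMapper, @sumReducer, mr);

    out = readall(result);                          % table with Key and Value variables
    total = out.Value{1};
    delete(dataFile);
end

function sumMapper(data, ~, intermKVStore)
% Emit one partial sum per chunk of the datastore.
    add(intermKVStore, 'PartialSum', sum(data.Value));
end

function sumReducer(~, intermValIter, outKVStore)
% Combine all partial sums into a single total.
    total = 0;
    while hasnext(intermValIter)
        total = total + getnext(intermValIter);
    end
    add(outKVStore, 'Sum', total);
end
```

Calling `total = sumWithMapReduce()` runs the map phase on the pool workers; passing a different mapreducer object is how the execution environment would be changed.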

* A worker is a MATLAB process that runs in the background. You start and control workers from a MATLAB session running Parallel Computing Toolbox.

## Running Parallel Applications Interactively and as Batch Jobs

You can execute parallel applications interactively and in batch using Parallel Computing Toolbox. Using the parpool command, you can connect your MATLAB session to a pool of MATLAB workers to set up a dedicated interactive parallel execution environment. The workers can run either locally on your desktop (using the toolbox) or on a computer cluster (using MATLAB Distributed Computing Server). You can execute parallel applications from the MATLAB prompt on these workers and retrieve results immediately as computations finish, just as you would in any MATLAB session.
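The interactive workflow described above might look like the following sketch, which assumes the default cluster profile is usable on your machine. It reuses an existing pool if there is one, runs a small parfor loop, and then shuts the pool down; the loop body is an illustrative placeholder.

```matlab
% Open a parallel pool, use it interactively, then release it (illustrative sketch).
pool = gcp('nocreate');            % returns empty if no pool is running
if isempty(pool)
    pool = parpool;                % start a pool using the default profile
end
fprintf('Pool has %d workers\n', pool.NumWorkers);

squares = zeros(1, 10);
parfor k = 1:10
    squares(k) = k^2;              % placeholder for real per-iteration work
end

delete(pool);                      % shut down the workers when finished
```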

Running applications interactively is suitable when execution time is relatively short. When your applications need to run for a long time, you can use the toolbox to set them up to run as batch jobs. This enables you to free your MATLAB session for other activities while you execute large MATLAB and Simulink applications.

While your application executes in batch, you can shut down your MATLAB session and retrieve results later. The toolbox provides several mechanisms to manage offline execution of parallel programs, such as the batch function and job and task objects. Both the batch function and the job and task objects can be used to offload the execution of serial MATLAB and Simulink applications from a desktop MATLAB session.
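As a sketch of the batch workflow, the example below offloads a single function call with the batch function and retrieves its output. The choice of sum as the offloaded function is purely illustrative, and the default cluster profile is assumed. Here the script waits for the job for simplicity, but a job submitted this way keeps running independently of the client session.

```matlab
% Offload a function call as a batch job and fetch the result (illustrative sketch).
job = batch(@sum, 1, {1:100});     % function handle, number of outputs, input arguments

wait(job);                         % block until the job finishes
outputs = fetchOutputs(job);       % cell array with one entry per output
result = outputs{1};

delete(job);                       % remove the job and its data from the cluster storage
```

To retrieve results in a later session instead of waiting, one approach is to look the job up again through the cluster object (for example with parcluster and findJob) before calling fetchOutputs.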

Running parallel applications interactively and as batch jobs. You can run applications on your workstation using local workers available with the toolbox, or on a computer cluster using more workers available with MATLAB Distributed Computing Server.