Neural network training and simulation involve many parallel calculations. Multicore CPUs, graphical processing units (GPUs), and clusters of computers with multiple CPUs and GPUs can all take advantage of parallel calculations.
Together, Neural Network Toolbox™ and Parallel Computing Toolbox™ enable the multiple CPU cores and GPUs of a single computer to speed up training and simulation of large problems.
The following is a standard single-threaded training and simulation session. (While the benefits of parallelism are most visible for large problems, this example uses a small dataset that ships with Neural Network Toolbox.)
[x,t] = house_dataset;
net1 = feedforwardnet(10);
net2 = train(net1,x,t);
y = net2(x);
Intel® processors ship with as many as eight cores. Workstations with two processors can have as many as 16 cores, with even more possible in the future. Using multiple CPU cores in parallel can dramatically speed up calculations.
Start or get the current parallel pool and view the number of workers in the pool.
pool = gcp;
pool.NumWorkers
An error occurs if you do not have a license for Parallel Computing Toolbox.
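If you want to degrade gracefully instead, you can test for the license before opening a pool. A minimal sketch, assuming the standard Parallel Computing Toolbox license feature name 'Distrib_Computing_Toolbox' (check your installation if this name differs):

```matlab
% Guard against a missing Parallel Computing Toolbox license.
% 'Distrib_Computing_Toolbox' is the usual feature name (an assumption;
% verify with license('inuse') on your system).
if license('test','Distrib_Computing_Toolbox')
    pool = gcp;
    fprintf('Parallel pool open with %d workers.\n', pool.NumWorkers);
else
    disp('No Parallel Computing Toolbox license; continuing single-threaded.');
end
```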
When a parallel pool is open, set the train function's 'useParallel' option to 'yes' to specify that training and simulation be performed across the pool.
net2 = train(net1,x,t,'useParallel','yes');
y = net2(x,'useParallel','yes');
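To see whether the pool actually helps on your problem, you can time both modes. A rough sketch (timings depend on your hardware, and a small dataset like house_dataset may show no speedup because of parallel overhead):

```matlab
% Compare serial and parallel training time on the same data.
[x,t] = house_dataset;
net1 = feedforwardnet(10);

tic; train(net1,x,t);                     tSerial   = toc;
tic; train(net1,x,t,'useParallel','yes'); tParallel = toc;
fprintf('Serial: %.1f s, parallel: %.1f s\n', tSerial, tParallel);
```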
GPUs can have as many as 3072 cores on a single card, and possibly more in the future. These cards are highly efficient on parallel algorithms like neural networks.
Use gpuDeviceCount to check whether a supported GPU card is available in your system. Use the function gpuDevice to review the currently selected GPU information or to select a different GPU.

gpuDeviceCount
gpuDevice
gpuDevice(2) % Select device 2, if available
An "Undefined function or variable" error appears if you do not have a license for Parallel Computing Toolbox.
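When the toolbox is installed, a short guard lets code fall back to the CPU on machines without a supported card. A sketch using the two functions above:

```matlab
% Fall back gracefully when no supported GPU is available.
if gpuDeviceCount > 0
    g = gpuDevice;                      % currently selected device
    fprintf('Using GPU: %s\n', g.Name);
else
    disp('No supported GPU found; training will use the CPU.');
end
```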
When you have selected the GPU device, set the train or sim function's 'useGPU' option to 'yes' to perform training and simulation on the GPU.

net2 = train(net1,x,t,'useGPU','yes');
y = net2(x,'useGPU','yes');
You can use multiple GPUs for higher levels of parallelism.
After opening a parallel pool, set both 'useParallel' and 'useGPU' to 'yes' to harness all the GPUs and CPU cores on a single computer. Each worker associated with a unique GPU uses that GPU. The rest of the workers perform calculations on their CPU cores.
net2 = train(net1,x,t,'useParallel','yes','useGPU','yes');
y = net2(x,'useParallel','yes','useGPU','yes');
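To confirm which workers and devices actually performed the computation, train also accepts a 'showResources' option that prints the resources used for the call:

```matlab
% Print the computing resources used for this call (GPU or CPU per worker).
net2 = train(net1,x,t, ...
    'useParallel','yes','useGPU','yes','showResources','yes');
```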
For some problems, using GPUs and CPUs together can result in the highest computing speed. For other problems, the CPUs might not keep up with the GPUs, so using only GPUs is faster. Set 'useGPU' to 'only' to restrict the parallel computing to workers with unique GPUs.
net2 = train(net1,x,t,'useParallel','yes','useGPU','only');
y = net2(x,'useParallel','yes','useGPU','only');
MATLAB® Distributed Computing Server™ allows you to harness all the CPUs and GPUs on a network cluster of computers. To take advantage of a cluster, open a parallel pool with a cluster profile. Use the Parallel menu in the Environment area of the MATLAB Home tab to manage and select profiles.
After opening a parallel pool, train and simulate the network with the same options as before; the calculations now run across the cluster workers.

net2 = train(net1,x,t,'useParallel','yes');
y = net2(x,'useParallel','yes');
net2 = train(net1,x,t,'useParallel','yes','useGPU','only');
y = net2(x,'useParallel','yes','useGPU','only');
For more information on parallel computing with Neural Network Toolbox, see Neural Networks with Parallel and GPU Computing, which introduces other topics, such as how to manually distribute data sets across CPU and GPU workers to best take advantage of differences in machine speed and memory.
Distributing data manually also allows worker data to load sequentially, so that data sets are limited in size only by the total RAM of a cluster instead of the RAM of a single computer. This lets you apply neural networks to very large problems.
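As a flavor of what manual distribution looks like, Parallel Computing Toolbox Composite values hold one element per pool worker, and train accepts Composite inputs. A sketch that splits dataset columns round-robin across the pool (see the linked topic for the authoritative treatment):

```matlab
% Distribute dataset columns across pool workers with Composite values,
% then train on the already-distributed data.
[x,t] = house_dataset;
net1 = feedforwardnet(10);

pool = gcp;
numWorkers = pool.NumWorkers;
xc = Composite;                       % one element per worker
tc = Composite;
for i = 1:numWorkers
    xc{i} = x(:, i:numWorkers:end);   % every numWorkers-th sample
    tc{i} = t(:, i:numWorkers:end);
end
net2 = train(net1,xc,tc);             % Composite inputs run in parallel
```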