Key Features

  • Deep learning with convolutional neural networks (CNNs), long short-term memory (LSTM) networks (for time-series classification), and autoencoders (for feature learning)
  • Directed acyclic graph (DAG) networks for deep learning with complex architectures
  • Transfer learning with pretrained CNN models (GoogLeNet, AlexNet, VGG-16, VGG-19) and models from the Caffe Model Zoo
  • Training and inference with CPUs or multiple GPUs on desktops, clusters, and clouds (including Amazon EC2® P2 instances)
  • Unsupervised learning algorithms, including self-organizing maps and competitive layers
  • Supervised learning algorithms, including multilayer, radial basis, learning vector quantization (LVQ), time-delay, nonlinear autoregressive (NARX), and recurrent neural networks (RNNs)
  • Apps for data fitting, pattern recognition, and clustering

Deep Learning Networks and Algorithms

Deep learning algorithms can learn discriminative features directly from data such as images, text, and signals. These algorithms can be used to build highly accurate classifiers when trained on large labeled training data sets. Neural Network Toolbox™ supports training convolutional neural network and autoencoder deep learning algorithms for image classification, regression, and feature learning tasks.

Convolutional Neural Networks

Convolutional neural networks (CNNs) eliminate the need for manual feature extraction by learning features directly from images. This automated feature extraction with CNN models simplifies computer vision tasks such as object classification. Neural Network Toolbox provides functions for constructing and training CNNs, as well as making predictions with a trained CNN model.

Directed Acyclic Graph (DAG) Networks

Directed acyclic graph (DAG) networks allow more complex deep learning topologies that are typically much deeper and more accurate than series network topologies. You can use DAG networks to create deep learning network models with skipped layers or layers connected in parallel.
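
As a rough illustration of what a skip connection means in practice, the following Python sketch evaluates a tiny graph of layers in dependency order. It is not toolbox code: the layer functions, the `LAYERS` table, and `run_dag` are hypothetical names chosen for this example, and the "layers" are simple list operations rather than trained layers.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def relu(v):
    """Element-wise rectifier."""
    return [max(0.0, x) for x in v]


def halve(v):
    """Stand-in for a learned layer: scales every element by 0.5."""
    return [0.5 * x for x in v]


def add(a, b):
    """Element-wise addition layer that merges two branches."""
    return [x + y for x, y in zip(a, b)]


# Hypothetical layer table: node name -> (function, names of its input nodes).
# The "add" node takes the raw input as well, which is the skip connection.
LAYERS = {
    "scale": (halve, ["input"]),
    "relu": (relu, ["scale"]),
    "add": (add, ["relu", "input"]),
}


def run_dag(layers, x):
    """Evaluate every node once, after all of its inputs are available."""
    graph = {name: set(inputs) for name, (_, inputs) in layers.items()}
    values = {"input": x}
    for name in TopologicalSorter(graph).static_order():
        if name in values:  # the input node has no function to run
            continue
        fn, inputs = layers[name]
        values[name] = fn(*(values[i] for i in inputs))
    return values
```

A series network is the special case in which every node has exactly one input, namely the node before it.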

Create and train a directed acyclic graph (DAG) network to classify images of digits.

Row A depicts a deep learning network with layers connected in series. DAG networks are more general and support more complex topologies. These can include ResNet (row B), where layers are skipped, and GoogLeNet (row C), where layers are connected in parallel.

Long Short-Term Memory (LSTM) Networks

An LSTM network is a type of recurrent neural network (RNN) that can learn long-term dependencies between the time steps of sequence data. You can use LSTM networks for classifying and predicting time-series data, as well as for predicting text data using Text Analytics Toolbox™.

Train an LSTM network to recognize the speaker given time-series data representing an utterance of two Japanese vowels in succession.

Stacked Autoencoders

Autoencoders can be used for unsupervised feature transformation by extracting low-dimensional features from your data set. You can also use autoencoders for supervised learning by training several encoders and stacking them as a deep network to increase classification accuracy.

Deep Learning Training, Pretrained Models, and Visualization

You can create deep neural networks by defining the layer architecture and learning options. There are two ways to train networks. Training from scratch can be more accurate but requires large amounts of labeled data. Alternatively, you can use transfer learning to fine-tune a pretrained network model to work with your data. Transfer learning is typically faster and easier than training a deep learning model from scratch.

Transfer Learning

Transfer learning is commonly used in deep learning applications. With Neural Network Toolbox, you can use a pretrained neural network model as a starting point to learn a new task. Fine-tuning a network with transfer learning is usually faster and easier than training from scratch, because you can transfer the learned features to the new task using a smaller number of training images.

Fine-tune a pretrained AlexNet convolutional neural network to perform classification on a new collection of images.

Pretrained Models

You can use pretrained deep neural network models to quickly apply deep learning to your problems by performing transfer learning or feature extraction. Available models include GoogLeNet, AlexNet, VGG-16, and VGG-19, as well as Caffe models (e.g., from Caffe Model Zoo).


A CNN model learns feature representations from your data during the training process. You can visualize what the learned features look like by using the deepDreamImage function to generate images that strongly activate a particular channel of the network layers.

Accelerated Training with GPUs and Large Data Sets

You can speed up neural network training and simulation of large data sets by using Neural Network Toolbox with Parallel Computing Toolbox™. Training and simulation involve many parallel computations, which can be accelerated with multicore processors, CUDA-enabled NVIDIA graphics processing units (GPUs), and computer clusters with multiple processors and GPUs.

GPU Computing

For deep neural networks, Neural Network Toolbox in conjunction with Parallel Computing Toolbox offers built-in GPU support to reduce training time. Training deep networks is computationally intensive, and you can usually accelerate training by using high-performance GPUs. You can train a convolutional neural network on a single GPU, on multiple GPUs, or in parallel on a GPU cluster. MATLAB® supports most CUDA-enabled NVIDIA GPUs that have compute capability 3.0 or higher for training deep neural networks. You can also speed up deep learning by using the Cloud Center to run MATLAB on Amazon EC2® machines.

Learn how the parallel performance of GPU instances scales with the number of workers.

For shallow neural networks, Parallel Computing Toolbox enables Neural Network Toolbox simulation and training to be parallelized across the multiprocessors and cores of a general-purpose GPU. GPUs are highly efficient on parallel algorithms such as neural networks. You can achieve higher levels of parallelism by using multiple GPUs or GPUs and processors together. With MATLAB Distributed Computing Server™, you can harness all the processors and GPUs on a network cluster of computers for neural network training and simulation.

Distributed Computing

Parallel Computing Toolbox lets shallow neural network training and simulation run across multiple processor cores on a single PC, or across multiple processors on multiple computers on a network using MATLAB Distributed Computing Server. Using multiple cores can speed up calculations. Using multiple computers enables you to solve problems using data sets too big to fit within the system memory of any single computer. The only limit to problem size is the total system memory available across all computers.

Classification, Regression, and Clustering with Shallow Networks

Neural Network Toolbox includes command-line functions and apps for creating, training, and simulating shallow neural networks. The apps make it easy to develop neural networks for tasks such as classification, regression (including time-series regression), and clustering. After creating your networks in these tools, you can automatically generate MATLAB code to capture your work and automate tasks.

Identify the winery that particular wines came from based on chemical attributes of the wine.
Cluster iris flowers based on petal and sepal size.

Shallow Network Architectures

You can use neural networks with a variety of supervised and unsupervised shallow network architectures. With the toolbox’s modular approach to building networks, you can also develop custom network architectures for your specific problem. You can view the network architecture including all inputs, layers, outputs, and interconnections.

Supervised Networks

Supervised neural networks are trained to produce desired outputs in response to sample inputs, making them particularly well suited for modeling and controlling dynamic systems, classifying noisy data, and predicting future events. Neural Network Toolbox includes four types of supervised networks: feedforward, radial basis, dynamic, and learning vector quantization.

Feedforward networks have one-way connections from input to output layers. They are most commonly used for prediction, pattern recognition, and nonlinear function fitting. Supported feedforward networks include feedforward backpropagation, cascade-forward backpropagation, feedforward input-delay backpropagation, linear, and perceptron networks.

Radial basis networks provide an alternative, fast method for designing nonlinear feedforward networks. Supported variations include generalized regression and probabilistic neural networks.

Dynamic networks use memory and recurrent feedback connections to recognize spatial and temporal patterns in data. They are commonly used for time-series prediction, nonlinear dynamic system modeling, and control systems applications. Prebuilt dynamic networks in the toolbox include focused and distributed time-delay, nonlinear autoregressive (NARX), layer-recurrent, Elman, and Hopfield networks. The toolbox also supports dynamic training of custom networks with arbitrary connections.
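
To make the idea of a NARX-style dynamic network more concrete, the sketch below builds the training pairs such a model would see: each target value is paired with a window of past outputs and past inputs. This is a plain Python illustration under assumed conventions (open-loop form, oldest sample first); `narx_regressors` and its parameters are hypothetical names, not toolbox functions.

```python
def narx_regressors(u, y, input_delays=2, output_delays=2):
    """Pair each target y[t] with its delayed outputs and delayed inputs.

    u: exogenous input sequence, y: measured output sequence (same length).
    Returns a list of (features, target) tuples, where features are
    [y[t-output_delays], ..., y[t-1], u[t-input_delays], ..., u[t-1]].
    """
    if len(u) != len(y):
        raise ValueError("u and y must have the same length")
    start = max(input_delays, output_delays)  # first t with a full window
    rows = []
    for t in range(start, len(y)):
        past_outputs = list(y[t - output_delays:t])
        past_inputs = list(u[t - input_delays:t])
        rows.append((past_outputs + past_inputs, y[t]))
    return rows
```

Any static regressor, including a feedforward network, can then be fitted to these pairs; feeding its own predictions back in place of the measured outputs gives the closed-loop form used for multistep prediction.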

Learning vector quantization (LVQ) networks use a method for classifying patterns that are not linearly separable. LVQ lets you specify class boundaries and the granularity of classification.
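
The basic LVQ idea can be sketched in a few lines of Python. The version below is the textbook LVQ1 rule, assumed here for illustration and not taken from the toolbox: the prototype nearest to a sample moves toward it when their classes match and away from it otherwise. The function and parameter names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def lvq1_step(prototypes, labels, x, target, rate=0.5):
    """Update the winning prototype in place and return its index."""
    winner = min(range(len(prototypes)),
                 key=lambda i: squared_distance(prototypes[i], x))
    # Attract the winner if it has the right class, repel it otherwise.
    direction = 1.0 if labels[winner] == target else -1.0
    prototypes[winner] = [p + direction * rate * (xi - p)
                          for p, xi in zip(prototypes[winner], x)]
    return winner
```

The number of prototypes per class controls the granularity of the classification: more prototypes allow more finely shaped class boundaries.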

Model the position of a levitated magnet as current passes through an electromagnet beneath it.

Unsupervised Networks

Unsupervised neural networks are trained by letting the network continually adjust itself to new inputs. They find relationships within data and can automatically define classification schemes. Neural Network Toolbox includes two types of self-organizing, unsupervised networks: competitive layers and self-organizing maps.

Competitive layers recognize and group similar input vectors, enabling them to automatically sort inputs into categories. Competitive layers are commonly used for classification and pattern recognition.

Self-organizing maps learn to classify input vectors according to similarity. Like competitive layers, they are used for classification and pattern recognition tasks; however, they differ from competitive layers because they are able to preserve the topology of the input vectors, assigning nearby inputs to nearby categories.
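
The difference between the two network types can be shown with one small update rule. The Python sketch below assumes a one-dimensional map and a fixed learning rate, which is a simplification for illustration only; `som_step` is a hypothetical name. With `radius=0` only the winning unit learns, as in a competitive layer; with a larger radius the winner's neighbors on the map learn too, which is what preserves topology.

```python
def som_step(weights, x, rate=0.5, radius=1):
    """One update on a 1-D map; modifies weights in place, returns the winner."""
    def squared_distance(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))

    # The winner is the unit whose weight vector is closest to the input.
    winner = min(range(len(weights)), key=lambda i: squared_distance(weights[i]))

    # Move the winner and its map neighbors toward the input.
    for i in range(len(weights)):
        if abs(i - winner) <= radius:
            weights[i] = [wi + rate * (xi - wi) for wi, xi in zip(weights[i], x)]
    return winner
```

Practical implementations typically shrink both the rate and the radius as training proceeds; that schedule is omitted here.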

Training Algorithms

Training and learning functions are mathematical procedures used to automatically adjust the network's weights and biases. The training function dictates a global algorithm that affects all the weights and biases of a given network. The learning function can be applied to individual weights and biases within a network.

Neural Network Toolbox supports a variety of training algorithms for shallow neural networks, including several gradient descent methods, conjugate gradient methods, the Levenberg-Marquardt algorithm (LM), and the resilient backpropagation algorithm (Rprop). The toolbox’s modular framework lets you quickly develop custom training algorithms that can be integrated with built-in algorithms. While training your neural network, you can use error weights to define the relative importance of desired outputs, which can be prioritized in terms of sample, time step (for time-series problems), output element, or any combination of these. You can access training algorithms from the command line or via apps that show diagrams of the network being trained and provide network performance plots and status information to help you monitor the training process.

A suite of learning functions, including gradient descent, Hebbian learning, LVQ, Widrow-Hoff, and Kohonen, is also provided.
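
As one example of a learning function that acts on individual weights and biases, the Widrow-Hoff (least mean squares) rule for a single linear neuron can be sketched as follows. This is the textbook form of the rule written in plain Python, with hypothetical names and no factor of 2 in the step size; it is not the toolbox implementation.

```python
def widrow_hoff_step(weights, bias, x, target, rate=0.1):
    """Return updated (weights, bias) and the error for one sample.

    The neuron is linear: output = weights . x + bias.
    Each weight moves in proportion to the error and to its own input.
    """
    output = sum(w * xi for w, xi in zip(weights, x)) + bias
    error = target - output
    new_weights = [w + rate * error * xi for w, xi in zip(weights, x)]
    new_bias = bias + rate * error
    return new_weights, new_bias, error
```

Applying this step repeatedly over the samples reduces the squared error of the linear neuron, provided the learning rate is small enough.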

Neural network apps that automate training your neural network to fit input and target data (left), monitor training progress (right), and calculate statistical results and plots to assess training quality.

Preprocessing, Postprocessing, and Improving Generalization

Preprocessing the network inputs and targets improves the efficiency of shallow neural network training. Postprocessing enables detailed analysis of network performance. Neural Network Toolbox provides preprocessing and postprocessing functions and Simulink® blocks that enable you to:

  • Reduce the dimensions of the input vectors using principal component analysis
  • Perform regression analysis between the network response and the corresponding targets
  • Scale inputs and targets so they fall in the range [-1,1]
  • Normalize the mean and standard deviation of the training set
  • Use automated data preprocessing and data division when creating your networks
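
Two of the preprocessing steps listed above, scaling to [-1,1] and normalizing the mean and standard deviation, reduce to short formulas. The Python sketch below shows them for a single feature; the function names are hypothetical, and the use of the sample standard deviation and the handling of constant features are assumptions of this example.

```python
from statistics import mean, stdev


def scale_to_range(values):
    """Linearly map values so the minimum becomes -1 and the maximum +1."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant feature: no spread to scale (assumed convention)
        return [0.0 for _ in values]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit sample standard deviation.

    Requires at least two values.
    """
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:  # constant feature (assumed convention)
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]
```

The scaling parameters (minimum and maximum, or mean and standard deviation) should be computed on the training set and then reused unchanged for new inputs, so that the network always sees data on the scale it was trained with.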

Improving the network’s ability to generalize helps prevent overfitting, a common problem in neural network design. Overfitting occurs when a network has memorized the training set but has not learned to generalize to new inputs. Overfitting produces a relatively small error on the training set but a much larger error when new data is presented to the network.

Neural Network Toolbox provides two solutions to improve generalization:

  • Regularization modifies the network’s performance function (the measure of error that the training process minimizes). By including the sizes of the weights and biases, regularization produces a network that performs well with the training data and exhibits smoother behavior when presented with new data.
  • Early stopping uses two different data sets: the training set, to update the weights and biases, and the validation set, to stop training when the network begins to overfit the data.

Postprocessing plots to analyze network performance, including mean squared error validation performance for successive training epochs (top left), error histogram (top right), and confusion matrices (bottom) for training, validation, and test phases.
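
The two generalization techniques described above, regularization and early stopping, can be sketched in Python as follows. Both functions are illustrative assumptions: the weighting between error and weight size (`ratio`) and the stopping criterion (`patience`, the number of consecutive epochs without improvement) are common conventions, not necessarily the ones the toolbox uses.

```python
def regularized_error(errors, weights, ratio=0.1):
    """Blend mean squared error with the mean squared weight size.

    ratio=0 gives plain mean squared error; larger values penalize
    large weights more strongly, which favors smoother networks.
    """
    mse = sum(e * e for e in errors) / len(errors)
    msw = sum(w * w for w in weights) / len(weights)
    return (1.0 - ratio) * mse + ratio * msw


def early_stopping(validation_errors, patience=3):
    """Return (best_epoch, stop_epoch) for a sequence of validation errors.

    Training stops once the validation error has failed to improve
    for `patience` consecutive epochs; the weights from best_epoch are kept.
    """
    best_epoch, best_error, misses = 0, float("inf"), 0
    for epoch, err in enumerate(validation_errors):
        if err < best_error:
            best_epoch, best_error, misses = epoch, err, 0
        else:
            misses += 1
            if misses >= patience:
                return best_epoch, epoch
    return best_epoch, len(validation_errors) - 1
```

In the early-stopping case the training set still drives every weight update; the validation set is used only to decide when to stop.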

Code Generation and Deployment

By using Neural Network Toolbox with MATLAB Coder™, GPU Coder™, and MATLAB Compiler™ products, you can deploy trained networks to embedded systems or integrate them with a wide range of production environments.

Learn how to make joint use of the signal processing and machine learning techniques available in MATLAB to develop data analytics for time series and sensor processing systems.

GPU Coder Support

You can use GPU Coder to deploy trained deep learning networks on NVIDIA GPUs such as the Tesla and Tegra. GPU Coder generates code for preprocessing and postprocessing along with the deep learning network, so you can develop your complete application more easily.

Run generated CUDA code on NVIDIA GPUs such as the Tesla and Tegra.

MATLAB Coder Support

You can use MATLAB Coder to generate C and C++ code for your trained network, allowing you to simulate a trained network on PC hardware and then deploy the network to embedded systems.

MATLAB Compiler Support

You can use MATLAB Compiler and MATLAB Compiler SDK™ products to deploy trained networks as C/C++ shared libraries, Microsoft® .NET assemblies, Java® classes, and Python® packages from MATLAB programs. You can also train a network model in the deployed application or component.

Simulink Support

Neural Network Toolbox provides a set of blocks for building shallow neural networks in Simulink. All blocks are compatible with Simulink Coder™. These blocks are divided into four libraries:

  • Transfer function blocks, which take a net input vector and generate a corresponding output vector
  • Net input function blocks, which take any number of weighted input vectors, weight-layer output vectors, and bias vectors, and return a net input vector
  • Weight function blocks, which apply a neuron's weight vector to an input vector (or a layer output vector) to get a weighted input value for a neuron
  • Data preprocessing blocks, which map input and output data into the ranges best suited for the neural network to handle directly
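
The first three block types in the list above chain together to compute a neuron's output. The Python sketch below mirrors that data flow with hypothetical function names; the dot-product weight function, summing net input function, and hyperbolic tangent transfer function are common choices assumed here for illustration, not a description of any particular block.

```python
import math


def weight_function(weights, inputs):
    """Apply one weight vector to one input vector (dot product)."""
    return sum(w * p for w, p in zip(weights, inputs))


def net_input_function(weighted_inputs, bias):
    """Combine all weighted inputs and the bias into a single net input."""
    return sum(weighted_inputs) + bias


def transfer_function(net_input):
    """Squash the net input into the range (-1, 1)."""
    return math.tanh(net_input)


def neuron_output(weight_vectors, input_vectors, bias):
    """Weight each input vector, sum the results with the bias, then transfer."""
    weighted = [weight_function(w, p)
                for w, p in zip(weight_vectors, input_vectors)]
    return transfer_function(net_input_function(weighted, bias))
```

For example, `neuron_output([[1.0, 2.0]], [[0.5, -0.25]], 0.0)` weights a single two-element input, obtains a net input of zero, and therefore returns 0.0.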

Alternatively, you can create and train your networks in the MATLAB environment and automatically generate network simulation blocks for use with Simulink. This approach also enables you to view your networks graphically.

Control Systems Applications

You can apply shallow neural networks to the identification and control of nonlinear systems. The toolbox includes descriptions, examples, and Simulink blocks for three popular control applications:

  • Model predictive control, which uses a neural network model to predict future plant responses to potential control signals. An optimization algorithm then computes the control signals that optimize future plant performance. The neural network plant model is trained offline and in batch form.
  • Feedback linearization, which uses a rearrangement of the neural network plant model and is trained offline. This controller requires the least computation of these three architectures; however, the plant must either be in companion form or capable of approximation by a companion form model.
  • Model reference adaptive control, which requires that a separate neural network controller be trained offline, in addition to the neural network plant model. While the controller training is computationally expensive, model reference control applies to a larger class of plants than feedback linearization.

You can incorporate neural network predictive control blocks included in the toolbox into your Simulink models. By changing the parameters of these blocks, you can tailor the network's performance to your application.