- Deep learning with convolutional neural networks (CNNs), long short-term memory (LSTM) networks (for time-series classification), and autoencoders (for feature learning)
- Directed acyclic graph (DAG) networks for deep learning with complex architectures
- Transfer learning with pretrained CNN models (GoogLeNet, AlexNet, VGG-16, VGG-19) and models from the Caffe Model Zoo
- Training and inference with CPUs or multiple GPUs on desktops, clusters, and clouds (including Amazon EC2^{®} P2 instances)
- Unsupervised learning algorithms, including self-organizing maps and competitive layers
- Supervised learning algorithms, including multilayer, radial basis, learning vector quantization (LVQ), time-delay, nonlinear autoregressive (NARX), and recurrent neural networks (RNNs)
- Apps for data fitting, pattern recognition, and clustering

Convolutional neural networks (CNNs) eliminate the need for manual feature extraction by learning features directly from images. This automated feature extraction with CNN models simplifies computer vision tasks such as object classification. Neural Network Toolbox provides functions for constructing and training CNNs, as well as making predictions with a trained CNN model.
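To make "learning features directly from images" concrete, the sketch below shows what a single convolutional filter computes. It is plain Python for illustration, not toolbox code: `conv2d_valid`, `EDGE_KERNEL`, and `IMAGE` are hypothetical names, and the kernel is hand-picked, whereas a CNN would learn such weights during training. As in most deep learning frameworks, the operation shown is technically a cross-correlation (the kernel is not flipped).

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the response map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hand-picked vertical-edge filter; a trained CNN would learn weights like these.
EDGE_KERNEL = [[-1, 0, 1],
               [-1, 0, 1],
               [-1, 0, 1]]

# 4x5 image: dark on the left (0), bright on the right (1).
IMAGE = [[0, 0, 0, 1, 1] for _ in range(4)]

response = conv2d_valid(IMAGE, EDGE_KERNEL)
# The response is zero over the flat dark region and positive where the edge lies.
```

The response map is itself an image-like array, which is why convolutional layers can be stacked so that later layers detect combinations of earlier features.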

Directed acyclic graph (DAG) networks allow more complex deep learning topologies that are typically much deeper and more accurate than series network topologies. You can use DAG networks to create deep learning network models with skipped layers or layers connected in parallel.

An LSTM network is a type of recurrent neural network (RNN) that can learn long-term dependencies between the time steps of sequence data. You can use LSTM networks for classifying and predicting time-series data, as well as for predicting text data using Text Analytics Toolbox™.
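The way an LSTM retains information across time steps can be sketched with a single scalar unit. This is an illustration of the standard LSTM equations in plain Python, not the toolbox's implementation; the function `lstm_step`, the gate names, and the parameter values are all assumptions chosen for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """Advance a single-unit LSTM by one time step.

    `p` maps each gate name to (input weight, recurrent weight, bias).
    """
    def pre(gate):
        w_x, w_h, b = p[gate]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre("input"))        # how much new information to write
    f = sigmoid(pre("forget"))       # how much of the old cell state to keep
    o = sigmoid(pre("output"))       # how much of the cell state to expose
    g = math.tanh(pre("candidate"))  # proposed new content
    c = f * c_prev + i * g           # cell state carries the long-term memory
    h = o * math.tanh(c)             # hidden state is the per-step output
    return h, c


# Hypothetical parameters: a strongly positive forget bias preserves memory.
params = {
    "input": (1.0, 0.0, 0.0),
    "forget": (0.0, 0.0, 4.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 0.0, 0.0),
}

h, c = 0.0, 0.0
cell_states = []
for x in [1.0, 0.0, 0.0, 0.0]:  # one input pulse followed by silence
    h, c = lstm_step(x, h, c, params)
    cell_states.append(c)
# The cell state set by the first input decays only slowly over later steps.
```

In a trained network the gate weights are learned, so the network itself decides what to remember and for how long.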

Autoencoders can be used for unsupervised feature transformation by extracting low-dimensional features from your data set. You can also use autoencoders for supervised learning by training several encoders and stacking them as a deep network to increase classification accuracy.

Transfer learning is commonly used in deep learning applications. With Neural Network Toolbox, you can use a pretrained neural network model as a starting point to learn a new task. Fine-tuning a network with transfer learning is usually faster and easier than training from scratch. You can quickly transfer learned features to a new task using a smaller number of training images.

You can use pretrained deep neural network models to quickly apply deep learning to your problems by performing transfer learning or feature extraction. Available models include GoogLeNet, AlexNet, VGG-16, and VGG-19, as well as Caffe models (e.g., from Caffe Model Zoo).

A CNN model learns feature representations from your data during the training process. You can visualize what the learned features look like by using the `deepDreamImage` function to generate images that strongly activate a particular channel of the network layers.

For deep neural networks, Neural Network Toolbox, in conjunction with Parallel Computing Toolbox, offers built-in GPU support to minimize training time. Training deep networks is computationally intensive, and you can usually accelerate training by using high-performance GPUs. You can train a convolutional neural network on a single GPU, on multiple GPUs, or in parallel on a GPU cluster. MATLAB^{®} supports most CUDA-enabled NVIDIA GPUs that have compute capability 3.0 or higher for training deep neural networks. You can speed up deep learning by using the Cloud Center to run MATLAB on Amazon EC2^{®} machines.

For shallow neural networks, Parallel Computing Toolbox enables Neural Network Toolbox simulation and training to be parallelized across the multiprocessors and cores of a general-purpose GPU. GPUs are highly efficient at parallel algorithms such as neural networks. You can achieve higher levels of parallelism by using multiple GPUs or GPUs and processors together. With MATLAB Distributed Computing Server™, you can harness all the processors and GPUs on a network cluster of computers for neural network training and simulation.

Parallel Computing Toolbox lets shallow neural network training and simulation run across multiple processor cores on a single PC, or across multiple processors on multiple computers on a network using MATLAB Distributed Computing Server. Using multiple cores can speed up calculations. Using multiple computers enables you to solve problems using data sets too big to fit within the system memory of any single computer. The only limit to problem size is the total system memory available across all computers.

Supervised neural networks are trained to produce desired outputs in response to sample inputs, making them particularly well suited for modeling and controlling dynamic systems, classifying noisy data, and predicting future events. Neural Network Toolbox includes four types of supervised networks: feedforward, radial basis, dynamic, and learning vector quantization.

**Feedforward networks** have one-way connections from input to output layers. They are most commonly used for prediction, pattern recognition, and nonlinear function fitting. Supported feedforward networks include feedforward backpropagation, cascade-forward backpropagation, feedforward input-delay backpropagation, linear, and perceptron networks.

**Radial basis networks** provide an alternative, fast method for designing nonlinear feedforward networks. Supported variations include generalized regression and probabilistic neural networks.

**Dynamic networks** use memory and recurrent feedback connections to recognize spatial and temporal patterns in data. They are commonly used for time-series prediction, nonlinear dynamic system modeling, and control systems applications. Prebuilt dynamic networks in the toolbox include focused and distributed time-delay, nonlinear autoregressive (NARX), layer-recurrent, Elman, and Hopfield networks. The toolbox also supports dynamic training of custom networks with arbitrary connections.

**Learning vector quantization (LVQ) networks** use a method for classifying patterns that are not linearly separable. LVQ lets you specify class boundaries and the granularity of classification.
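As a rough illustration of how LVQ shapes class boundaries, the sketch below implements one step of the classic LVQ1 rule in plain Python: the nearest prototype moves toward a sample of its own class and away from a sample of a different class. The function name `lvq1_step`, the learning rate, and the data are assumptions for the example, and the toolbox's own learning functions may differ in detail. The number of prototypes per class sets the granularity of the classification.

```python
def lvq1_step(prototypes, labels, x, x_label, lr=0.1):
    """Apply one LVQ1 update and return (winner index, updated prototypes)."""
    dists = [sum((p_i - x_i) ** 2 for p_i, x_i in zip(p, x)) for p in prototypes]
    win = dists.index(min(dists))
    # Attract the winner if its class matches the sample, repel it otherwise.
    sign = 1.0 if labels[win] == x_label else -1.0
    updated = [list(p) for p in prototypes]
    updated[win] = [p_i + sign * lr * (x_i - p_i)
                    for p_i, x_i in zip(prototypes[win], x)]
    return win, updated


# Two prototypes, one per class (hypothetical data).
prototypes = [[0.0, 0.0], [1.0, 1.0]]
labels = ["a", "b"]

# A class "b" sample lands nearest the class "a" prototype, so that prototype is pushed away.
winner, moved = lvq1_step(prototypes, labels, [0.2, 0.0], "b")

# A class "a" sample at the same location pulls the prototype closer instead.
_, pulled = lvq1_step(prototypes, labels, [0.2, 0.0], "a")
```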

Unsupervised neural networks are trained by letting the network continually adjust itself to new inputs. They find relationships within data and can automatically define classification schemes. Neural Network Toolbox includes two types of self-organizing, unsupervised networks: competitive layers and self-organizing maps.

**Competitive layers** recognize and group similar input vectors, enabling them to automatically sort inputs into categories. Competitive layers are commonly used for classification and pattern recognition.

**Self-organizing maps** learn to classify input vectors according to similarity. Like competitive layers, they are used for classification and pattern recognition tasks; however, they differ from competitive layers because they are able to preserve the topology of the input vectors, assigning nearby inputs to nearby categories.
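The difference between the two network types can be sketched in a few lines of plain Python. In the hypothetical `som_step` below, the unit closest to the input wins and is moved toward it; with `radius=0` only the winner moves (competitive-layer behavior), while a larger radius also moves the winner's neighbors on the map, which is what preserves topology. The names, the 1-D map layout, and the learning rate are assumptions for illustration rather than the toolbox's implementation.

```python
def nearest_unit(weights, x):
    """Return the index of the unit whose weight vector is closest to x."""
    def sq_dist(w):
        return sum((w_i - x_i) ** 2 for w_i, x_i in zip(w, x))
    return min(range(len(weights)), key=lambda k: sq_dist(weights[k]))


def som_step(weights, x, lr=0.5, radius=1):
    """One Kohonen update on a 1-D map: move the winner and nearby units toward x."""
    win = nearest_unit(weights, x)
    updated = []
    for k, w in enumerate(weights):
        if abs(k - win) <= radius:  # unit k is within the winner's neighborhood
            w = [w_i + lr * (x_i - w_i) for w_i, x_i in zip(w, x)]
        updated.append(list(w))
    return win, updated


# Four units arranged in a line (hypothetical initial weights).
weights = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
x = [0.8, 1.0]

win, som_weights = som_step(weights, x, radius=1)          # winner and neighbors move
_, competitive_weights = som_step(weights, x, radius=0)    # only the winner moves
```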

Training and learning functions are mathematical procedures used to automatically adjust the network's weights and biases. The training function dictates a global algorithm that affects all the weights and biases of a given network. The learning function can be applied to individual weights and biases within a network.

Neural Network Toolbox supports a variety of training algorithms for shallow neural networks, including several gradient descent methods, conjugate gradient methods, the Levenberg-Marquardt algorithm (LM), and the resilient backpropagation algorithm (Rprop). The toolbox’s modular framework lets you quickly develop custom training algorithms that can be integrated with built-in algorithms. While training your neural network, you can use error weights to define the relative importance of desired outputs, which can be prioritized in terms of sample, time step (for time-series problems), output element, or any combination of these. You can access training algorithms from the command line or via apps that show diagrams of the network being trained and provide network performance plots and status information to help you monitor the training process.

A suite of learning functions, including gradient descent, Hebbian learning, LVQ, Widrow-Hoff, and Kohonen, is also provided.
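As one example of a learning function that acts on individual weights and biases, the Widrow-Hoff (least mean squares) rule can be sketched as follows. This is a plain Python illustration of the textbook rule for a single linear neuron; `widrow_hoff_step`, the learning rate, and the sample data are assumptions, not the toolbox's code.

```python
def widrow_hoff_step(w, b, x, target, lr=0.1):
    """One Widrow-Hoff (LMS) update for a single linear neuron."""
    output = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    error = target - output
    # Each weight moves in proportion to the error and to its own input.
    w = [w_i + lr * error * x_i for w_i, x_i in zip(w, x)]
    b = b + lr * error
    return w, b, error


# Hypothetical samples generated by y = 2*x1 - x2 + 0.5.
samples = [([0.0, 0.0], 0.5), ([1.0, 0.0], 2.5),
           ([0.0, 1.0], -0.5), ([1.0, 1.0], 1.5)]

w, b = [0.0, 0.0], 0.0
for _ in range(500):                 # repeated passes over the data
    for x, t in samples:
        w, b, _err = widrow_hoff_step(w, b, x, t)
# w approaches [2, -1] and b approaches 0.5.
```

Because each update uses only one sample and local quantities, this kind of rule suits incremental (sample-by-sample) learning, in contrast to a training function that coordinates updates across the whole network.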

Preprocessing the network inputs and targets improves the efficiency of shallow neural network training. Postprocessing enables detailed analysis of network performance. Neural Network Toolbox provides preprocessing and postprocessing functions and Simulink^{®} blocks that enable you to:

- Reduce the dimensions of the input vectors using principal component analysis
- Perform regression analysis between the network response and the corresponding targets
- Scale inputs and targets so they fall in the range [-1,1]
- Normalize the mean and standard deviation of the training set
- Use automated data preprocessing and data division when creating your networks
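The two scaling steps in this list amount to simple linear maps, sketched below in plain Python. The functions are hypothetical stand-ins written for illustration; they are similar in spirit to the toolbox's `mapminmax` and `mapstd` functions but are not those implementations. Using the sample standard deviation and returning zeros for constant inputs are choices made for this sketch.

```python
from statistics import mean, stdev


def scale_to_unit_range(values):
    """Linearly map values so the minimum becomes -1 and the maximum becomes +1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant input: nothing to scale
    return [2 * (v - lo) / (hi - lo) - 1 for v in values]


def standardize(values):
    """Shift and scale values to zero mean and unit (sample) standard deviation."""
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


scaled = scale_to_unit_range([10, 15, 20])
standardized = standardize([2, 4, 6])
```

Whichever mapping is used, the minimum, maximum, mean, and standard deviation should be computed on the training set and then reused unchanged for validation data, test data, and new inputs.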

Improving the network’s ability to generalize helps prevent overfitting, a common problem in neural network design. Overfitting occurs when a network has memorized the training set but has not learned to generalize to new inputs. Overfitting produces a relatively small error on the training set but a much larger error when new data is presented to the network.

Neural Network Toolbox provides two solutions to improve generalization:

- **Regularization** modifies the network’s performance function (the measure of error that the training process minimizes). By including the sizes of the weights and biases, regularization produces a network that performs well with the training data and exhibits smoother behavior when presented with new data.
- **Early stopping** uses two different data sets: the training set, to update the weights and biases, and the validation set, to stop training when the network begins to overfit the data.
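Both ideas can be sketched in a few lines of plain Python. The code is illustrative only: `regularized_performance` assumes one common form of the penalty (a weighted blend of mean squared error and mean squared weight, with a hypothetical `ratio` parameter), and `early_stopping_epoch` assumes training stops after a hypothetical `max_fail` consecutive epochs without a new best validation error. The toolbox's exact formulas and defaults may differ.

```python
def regularized_performance(errors, weights, ratio=0.1):
    """Blend mean squared error with mean squared weight size.

    ratio = 0 gives plain MSE; larger values penalize large weights more.
    """
    mse = sum(e * e for e in errors) / len(errors)
    msw = sum(w * w for w in weights) / len(weights)
    return (1 - ratio) * mse + ratio * msw


def early_stopping_epoch(validation_errors, max_fail=3):
    """Return the epoch whose weights would be kept under early stopping."""
    best_epoch, best_err, fails = 0, float("inf"), 0
    for epoch, err in enumerate(validation_errors):
        if err < best_err:
            best_epoch, best_err, fails = epoch, err, 0
        else:
            fails += 1
            if fails >= max_fail:
                break  # validation error has stopped improving
    return best_epoch


perf = regularized_performance([1.0, -1.0], [2.0, 0.0], ratio=0.5)

# Hypothetical validation curve: improves, then rises as the network overfits.
val_curve = [0.9, 0.6, 0.4, 0.35, 0.37, 0.40, 0.45, 0.30]
kept_epoch = early_stopping_epoch(val_curve)
```

In this example training halts at epoch 6, so the later value of 0.30 is never reached; a larger `max_fail` trades longer training for a lower risk of stopping too early.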

You can use GPU Coder to deploy trained deep learning networks on NVIDIA GPUs such as the Tesla and Tegra. GPU Coder generates code for preprocessing and postprocessing along with the deep learning network, so you can develop your complete application more easily.

You can use MATLAB Coder to generate C and C++ code for your trained network, allowing you to simulate a trained network on PC hardware and then deploy the network to embedded systems.

You can use MATLAB Compiler and MATLAB Compiler SDK™ products to deploy trained networks as C/C++ shared libraries, Microsoft^{®} .NET assemblies, Java^{®} classes, and Python^{®} packages from MATLAB programs. You can also train a network model in the deployed application or component.

Neural Network Toolbox provides a set of blocks for building shallow neural networks in Simulink. All blocks are compatible with Simulink Coder™. These blocks are divided into four libraries:

- **Transfer function blocks**, which take a net input vector and generate a corresponding output vector
- **Net input function blocks**, which take any number of weighted input vectors, weight-layer output vectors, and bias vectors, and return a net input vector
- **Weight function blocks**, which apply a neuron's weight vector to an input vector (or a layer output vector) to get a weighted input value for a neuron
- **Data preprocessing blocks**, which map input and output data into the ranges best suited for the neural network to handle directly
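The first three block types correspond to the three stages of computing one neuron's output, which the following plain Python sketch chains together. The function names, the choice of a dot product, a sum, and a hyperbolic tangent, and the numbers are assumptions picked for illustration; the Simulink libraries offer several alternatives for each stage.

```python
import math


def dot_weight(weights, inputs):
    """Weight function: apply one neuron's weight vector to an input vector."""
    return sum(w * x for w, x in zip(weights, inputs))


def sum_net_input(weighted_inputs, bias):
    """Net input function: combine weighted inputs and the bias into one value."""
    return sum(weighted_inputs) + bias


def tanh_transfer(net_input):
    """Transfer function: squash the net input into the range (-1, 1)."""
    return math.tanh(net_input)


# Hypothetical neuron with two inputs.
inputs = [0.5, -1.0]
weights = [2.0, 1.0]
bias = 0.5

weighted = dot_weight(weights, inputs)      # 2.0*0.5 + 1.0*(-1.0)
net = sum_net_input([weighted], bias)       # a single weighted input plus the bias
output = tanh_transfer(net)
```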

Alternatively, you can create and train your networks in the MATLAB environment and automatically generate network simulation blocks for use with Simulink. This approach also enables you to view your networks graphically.

You can apply shallow neural networks to the identification and control of nonlinear systems. The toolbox includes descriptions, examples, and Simulink blocks for three popular control applications:

- **Model predictive control**, which uses a neural network model to predict future plant responses to potential control signals. An optimization algorithm then computes the control signals that optimize future plant performance. The neural network plant model is trained offline and in batch form.
- **Feedback linearization**, which uses a rearrangement of the neural network plant model and is trained offline. This controller requires the least computation of these three architectures; however, the plant must either be in companion form or capable of approximation by a companion form model.
- **Model reference adaptive control**, which requires that a separate neural network controller be trained offline, in addition to the neural network plant model. While the controller training is computationally expensive, the model reference control applies to a larger class of plant than feedback linearization.

You can incorporate neural network predictive control blocks included in the toolbox into your Simulink models. By changing the parameters of these blocks, you can tailor the network's performance to your application.