Deep learning is a branch of machine learning that teaches computers to do what comes naturally to humans: learn from experience. Machine learning algorithms use computational methods to “learn” information directly from data without relying on a predetermined equation as a model. Deep learning is especially suited for image recognition, which is important for solving problems such as facial recognition, motion detection, and many advanced driver assistance technologies such as autonomous driving, lane detection, pedestrian detection, and autonomous parking.
Neural Network Toolbox™ provides simple MATLAB® commands for creating and interconnecting the layers of a deep neural network. Examples and pretrained networks make it easy to use MATLAB for deep learning, even without knowledge of advanced computer vision algorithms or neural networks.
|What do you want to do?||Learn More|
|Perform transfer learning to fine-tune a network with your data||Start Deep Learning Faster Using Transfer Learning|
Fine-tuning a pretrained network to learn a new task is typically much faster and easier than training a new network.
|Get pretrained networks to explore and use to classify images||Pretrained Convolutional Neural Networks|
|Create a new deep neural network to perform classification or regression|
|Resize, rotate, or preprocess images for training or prediction||Preprocess Images for Deep Learning|
|Label your image data automatically based on folder names, or interactively using an app|
|Train a network to classify each pixel of an image (for example, road, car, pedestrian)||Semantic Segmentation Basics (Computer Vision System Toolbox)|
|Train a network for object detection||Deep Learning, Object Detection and Recognition (Computer Vision System Toolbox)|
|Visualize what features networks have learned|
|Train on CPU, GPU, multiple GPUs, in parallel on your desktop or on clusters in the cloud, and work with data sets too large to fit in memory||Deep Learning with Big Data on GPUs and in Parallel|
To choose whether to use a pretrained network or create a new deep network, consider the scenarios in this table.
| ||Use a Pretrained Network for Transfer Learning||Create a New Deep Network|
|Training Data||Hundreds to thousands of labeled images (small)||Thousands to millions of labeled images|
|Computation||Moderate computation (GPU optional)||Compute intensive (requires GPU for speed)|
|Training Time||Seconds to minutes||Days to weeks for real problems|
|Model Accuracy||Good, depends on the pretrained model||High, but can overfit to small data sets|
Deep learning uses neural networks to learn useful representations of features directly from data. Neural networks combine multiple nonlinear processing layers, using simple elements operating in parallel and inspired by biological nervous systems. Deep learning models can achieve state-of-the-art accuracy in object classification, sometimes exceeding human-level performance.
You train models using a large set of labeled data and neural network architectures that contain many layers, usually including some convolutional layers. Training these models is computationally intensive and you can usually accelerate training by using a high performance GPU. This diagram shows how convolutional neural networks combine layers that automatically learn features from many images in order to classify new images.
Many deep learning applications use image files, and sometimes millions of image files. To efficiently access many image files for deep learning, MATLAB provides a dedicated function.
Use this function to:
Automatically read batches of images for faster processing in machine learning and computer vision applications
Import data from image collections that are too large to fit in memory
Label your image data automatically based on folder names
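The points above can be sketched as follows. This is a minimal illustration that assumes the function meant here is `imageDatastore` (the text does not name it). The folder location and the class names `cats` and `dogs` are hypothetical, and the images are random placeholders written only so that the sketch runs without external data.

```matlab
% Build a tiny throwaway image collection: one subfolder per class.
rootDir = fullfile(tempdir,'demoImages');      % hypothetical location
classNames = {'cats','dogs'};                  % hypothetical class names
for c = 1:numel(classNames)
    classDir = fullfile(rootDir,classNames{c});
    if ~exist(classDir,'dir')
        mkdir(classDir);
    end
    for k = 1:3
        img = uint8(randi([0 255],32,32,3));   % random placeholder image
        imwrite(img,fullfile(classDir,sprintf('img%d.png',k)));
    end
end

% Create the datastore; labels are taken from the subfolder names.
imds = imageDatastore(rootDir, ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames');

disp(countEachLabel(imds));    % number of images per label

% Read images one at a time instead of loading the whole collection.
numRead = 0;
reset(imds);
while hasdata(imds)
    img = read(imds);          % one image per call by default
    numRead = numRead + 1;     % replace with real processing
end
```

Because the datastore stores only file locations and labels, the collection itself never has to fit in memory; images are loaded when `read` is called.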
This example shows how to use deep learning to identify objects on a live webcam using only 10 lines of MATLAB code. Try the example to see how simple it is to get started with deep learning in MATLAB.
Run these commands to get the downloads if needed, connect to the webcam, and get a pretrained neural network.
camera = webcam; % Connect to the camera
net = alexnet; % Load the neural network
If you need to install the add-ons, a message from each function appears with a link to help you download the free add-ons using Add-On Explorer. Alternatively, see Neural Network Toolbox Model for AlexNet Network and MATLAB Support Package for USB Webcams.
After you install Neural Network Toolbox Model for AlexNet Network, you can use it to classify images. AlexNet is a pretrained convolutional neural network (CNN) that has been trained on more than a million images and can classify images into 1000 object categories (for example, keyboard, mouse, coffee mug, pencil, and many animals).
Run the following code to show and classify live images.
Point the webcam at an object, and the neural network reports what class of object it thinks the webcam is showing. It keeps classifying images until you press Ctrl+C. The code resizes the image for the network using imresize.
while true
    im = snapshot(camera); % Take a picture
    image(im); % Show the picture
    im = imresize(im,[227 227]); % Resize the picture for alexnet
    label = classify(net,im); % Classify the picture
    title(char(label)); % Show the class label
    drawnow
end
In this example, the network correctly classifies a coffee mug. Experiment with objects in your surroundings to see how accurate the network is.
To watch a video of this example, see Deep Learning in 11 Lines of MATLAB Code.
To get the code to extend this example to show the probability scores of classes, see Classify Webcam Images Using Deep Learning.
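As a rough idea of what such an extension can look like, the sketch below requests the second output of `classify`, which holds the class scores, and shows the top score next to the label. It is not the code from the linked example. It assumes the webcam and AlexNet add-ons are installed, and it uses a bounded loop of 50 frames (an arbitrary choice) instead of running until Ctrl+C.

```matlab
camera = webcam;                             % Connect to the camera
net = alexnet;                               % Load the neural network
inputSize = net.Layers(1).InputSize(1:2);    % Height and width the network expects

for frame = 1:50                             % Arbitrary number of frames
    im = snapshot(camera);                   % Take a picture
    image(im);                               % Show the picture
    im = imresize(im,inputSize);             % Resize the picture for the network
    [label,scores] = classify(net,im);       % Label plus one score per class
    title(sprintf('%s (%.2f)',char(label),max(scores)), ...
        'Interpreter','none');               % Show the label and its score
    drawnow
end
```

Reading the input size from the first layer avoids hard-coding 227, which helps if you later swap in a pretrained network with a different input size.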
For next steps in deep learning, you can use the pretrained network for other tasks. Solve new classification problems on your image data with transfer learning or feature extraction. For examples, see Start Deep Learning Faster Using Transfer Learning and Train Classifiers Using Features Extracted from Pretrained Networks. To try other pretrained networks, see Pretrained Convolutional Neural Networks.
Transfer learning is commonly used in deep learning applications. You can take a pretrained network and use it as a starting point to learn a new task. Fine-tuning a network with transfer learning is much faster and easier than training from scratch. You can quickly make the network learn a new task using a smaller number of training images. The advantage of transfer learning is that the pretrained network has already learned a rich set of features that can be applied to a wide range of other similar tasks.
For example, if you take a network trained on thousands or millions of images, you can retrain it for new object detection using only hundreds of images. You can effectively fine-tune a pretrained network with much smaller data sets than the original training data. Transfer learning might not be faster than training a new network if you have a very large dataset.
Transfer learning enables you to:
Transfer the learned features of a pretrained network to a new problem
Train faster and more easily than when training a new network
Reduce training time and dataset size
Perform deep learning without needing to learn how to create a whole new network
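A minimal sketch of this workflow with AlexNet is shown below. The folder name `myImages`, the 70/30 split, and all training option values are assumptions chosen for illustration, not recommended settings. The sketch also assumes RGB images stored in one subfolder per class.

```matlab
% Hypothetical folder with one subfolder of RGB images per class.
imds = imageDatastore('myImages', ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames');
[imdsTrain,imdsVal] = splitEachLabel(imds,0.7,'randomized');

net = alexnet;                               % Pretrained network
inputSize = net.Layers(1).InputSize;         % [227 227 3] for AlexNet
numClasses = numel(categories(imdsTrain.Labels));

% Keep every layer except the last three, which are specific to the
% original 1000-class task, and append new layers for the new classes.
layers = [
    net.Layers(1:end-3)
    fullyConnectedLayer(numClasses, ...
        'WeightLearnRateFactor',20,'BiasLearnRateFactor',20)
    softmaxLayer
    classificationLayer];

% Resize images on the fly to the size the network expects.
augTrain = augmentedImageDatastore(inputSize(1:2),imdsTrain);
augVal   = augmentedImageDatastore(inputSize(1:2),imdsVal);

% Small learning rate so the transferred layers change slowly.
options = trainingOptions('sgdm', ...
    'MiniBatchSize',10, ...
    'MaxEpochs',6, ...
    'InitialLearnRate',1e-4, ...
    'ValidationData',augVal, ...
    'Verbose',false);

netTransfer = trainNetwork(augTrain,layers,options);

% Estimate accuracy on the held-out images.
predicted = classify(netTransfer,augVal);
accuracy = mean(predicted == imdsVal.Labels);
```

The higher learn rate factors on the new fully connected layer let it adapt quickly while the pretrained layers, which already hold useful features, change only slightly.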
Feature extraction allows you to use the power of pretrained networks without investing time and effort into training. Feature extraction can be the fastest way to use deep learning. You extract learned features from a pretrained network and use those features to train a classifier, for example, a support vector machine (SVM; requires Statistics and Machine Learning Toolbox™). For example, if an SVM trained on features extracted from alexnet can achieve >90% accuracy on your training and validation set, then fine-tuning with transfer learning might not be worth the effort to gain some extra accuracy. You also risk overfitting the training data if you perform fine-tuning on a small dataset. If the SVM cannot achieve good enough accuracy for your application, then fine-tuning is worth the effort to seek higher accuracy.
For an example, see Feature Extraction Using AlexNet.
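The idea can be sketched in a few lines. This is an illustration under assumptions rather than the code from the linked example: the folder `myImages` is hypothetical, the images are assumed to be RGB, the layer `fc7` is one reasonable choice of feature layer in AlexNet, and `fitcecoc` (from Statistics and Machine Learning Toolbox) is used because it trains a multiclass model built from SVM learners by default.

```matlab
% Hypothetical folder with one subfolder of RGB images per class.
imds = imageDatastore('myImages', ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames');
[imdsTrain,imdsTest] = splitEachLabel(imds,0.7,'randomized');

net = alexnet;                               % Pretrained network
inputSize = net.Layers(1).InputSize;

% Resize images on the fly to the size the network expects.
augTrain = augmentedImageDatastore(inputSize(1:2),imdsTrain);
augTest  = augmentedImageDatastore(inputSize(1:2),imdsTest);

% Extract features from a late fully connected layer, one row per image.
layer = 'fc7';
featuresTrain = activations(net,augTrain,layer,'OutputAs','rows');
featuresTest  = activations(net,augTest,layer,'OutputAs','rows');

% Train a multiclass SVM model on the extracted features.
classifier = fitcecoc(featuresTrain,imdsTrain.Labels);

% Evaluate on the held-out images.
predicted = predict(classifier,featuresTest);
accuracy = mean(predicted == imdsTest.Labels);
```

No network training happens here; the only model being fitted is the SVM, which is why this route is usually much quicker than fine-tuning.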
Neural networks are inherently parallel algorithms. You can take advantage of this parallelism by using Parallel Computing Toolbox™ to distribute training across multicore CPUs, graphics processing units (GPUs), and clusters of computers with multiple CPUs and GPUs.
Training deep networks is extremely computationally intensive and you can usually accelerate training by using a high performance GPU. If you do not have a suitable GPU, you can train on one or more CPU cores instead. You can train a convolutional neural network on a single GPU or CPU, or on multiple GPUs or CPU cores, or in parallel on a cluster. Using GPU or parallel options requires Parallel Computing Toolbox.
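In code, the hardware choice is typically made through the `'ExecutionEnvironment'` option of `trainingOptions`. The sketch below only constructs option objects for the different environments; the solver `'sgdm'` and the variable names are arbitrary choices for illustration.

```matlab
% 'auto' uses a GPU if a suitable one is available, otherwise the CPU.
optsAuto = trainingOptions('sgdm','ExecutionEnvironment','auto');

% Force training on the CPU.
optsCPU = trainingOptions('sgdm','ExecutionEnvironment','cpu');

% Use a single GPU (requires Parallel Computing Toolbox).
optsGPU = trainingOptions('sgdm','ExecutionEnvironment','gpu');

% Use multiple GPUs on one machine (requires Parallel Computing Toolbox).
optsMultiGPU = trainingOptions('sgdm','ExecutionEnvironment','multi-gpu');

% Use a local or remote parallel pool, for example a cluster
% (requires Parallel Computing Toolbox).
optsParallel = trainingOptions('sgdm','ExecutionEnvironment','parallel');
```

You would pass one of these objects as the options argument of `trainNetwork`; the rest of the training code stays the same regardless of the hardware.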
You do not need multiple computers to solve problems using data sets too big to fit in memory. You can work with batches of data instead, without needing a cluster of machines. However, if you have a cluster available, it can be helpful to take your code to the data repository rather than moving large amounts of data around.
To learn more about deep learning hardware and memory settings, see Deep Learning with Big Data on GPUs and in Parallel.