A convolutional neural network (CNN or ConvNet) is an architecture commonly used for deep learning. CNNs are often used to recognize objects and scenes and to perform object detection and segmentation. They learn directly from image data, eliminating the need for manual feature extraction. The use of CNNs for deep learning has become increasingly popular due to three important factors:
- CNNs eliminate the need for manual feature extraction: the features are learned directly by the network.
- CNNs produce state-of-the-art recognition results.
- CNNs can be retrained for new recognition tasks, which lets you build on pre-existing networks.
How CNNs Work
A convolutional neural network can have tens or hundreds of layers that each learn to detect different features of an image. Filters are applied to each training image at different resolutions, and the output of each convolved image is used as the input to the next layer. The filters can start by detecting very simple features, such as brightness and edges, and increase in complexity as the layers progress, until they detect features that uniquely define the object.
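The layer-to-layer flow described above can be illustrated with a minimal sketch in plain Python. The filter values here are hand-picked assumptions for illustration only; in a real CNN they are learned during training. The function names (`conv2d`, `relu`) and the tiny 5x5 image are hypothetical and not part of any particular toolbox.

```python
def conv2d(image, kernel):
    """Slide a filter over an image with no padding and stride 1.

    This is the cross-correlation that CNN layers conventionally
    call "convolution".
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


# Illustrative 5x5 image: dark on the left (0), bright on the right (1).
image = [[0, 0, 0, 1, 1] for _ in range(5)]

# Layer 1: a simple filter that responds to dark-to-bright vertical edges.
vertical_edge = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]
layer1 = relu(conv2d(image, vertical_edge))

# Layer 2: takes layer 1's output as its input and combines
# neighboring edge responses into a coarser feature.
combine = [[1, 1],
           [1, 1]]
layer2 = relu(conv2d(layer1, combine))

print(layer1)
print(layer2)
```

The first filter responds only where the image changes from dark to bright, and the second layer operates on those responses rather than on the raw pixels, which is the sense in which features build on one another from layer to layer.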
Hardware Acceleration with GPUs
A convolutional neural network is trained on hundreds, thousands, or even millions of images. When working with large amounts of data and complex network architectures, GPUs can significantly reduce the time it takes to train a model. Once a CNN is trained, it can be used in real-time applications, such as pedestrian detection in advanced driver assistance systems (ADAS).
CNNs for Object Recognition
One method for creating a convolutional neural network to perform object recognition is to train a network from scratch. The network architect must define the number of layers and the number of filters, along with other tunable parameters that control how the weights are learned. Training an accurate model from scratch also requires massive amounts of data, on the order of millions of samples, and training on that much data can take an immense amount of time.
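To give a rough sense of how these design choices translate into the number of weights that must be learned, the sketch below counts the learnable parameters of a small, hypothetical network. The architecture (a 28x28 grayscale input, two 3x3 convolutional layers with 8 and 16 filters, and a 10-class output layer) is an assumption chosen for illustration, and the helper names are not from any library.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size of a convolution output along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_params(kernel_h, kernel_w, in_channels, num_filters):
    """Weights plus one bias per filter in a convolutional layer."""
    return (kernel_h * kernel_w * in_channels + 1) * num_filters


def dense_params(in_features, out_features):
    """Weights plus one bias per output in a fully connected layer."""
    return (in_features + 1) * out_features


# Hypothetical architecture: 28x28x1 input, no padding, stride 1.
size = 28
conv1 = conv_params(3, 3, in_channels=1, num_filters=8)
size = conv_output_size(size, 3)           # 28 -> 26
conv2 = conv_params(3, 3, in_channels=8, num_filters=16)
size = conv_output_size(size, 3)           # 26 -> 24

# Flatten the 24x24x16 feature maps and classify into 10 classes.
flattened = size * size * 16
dense = dense_params(flattened, 10)

total = conv1 + conv2 + dense
print(conv1, conv2, dense, total)
```

Even this toy network has tens of thousands of parameters to fit. Networks with tens or hundreds of layers have far more, which is one reason training from scratch calls for so much data.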
A common alternative to training a CNN from scratch is to use a pretrained model to automatically extract features from a new data set. This method, called transfer learning, is a convenient way to apply deep learning without a huge data set and long computation and training time.
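The idea can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `pretrained_features` is a hypothetical stand-in for the frozen layers of a real pretrained CNN, the 2x2 "images" and labels are made up, and the new classifier is a simple logistic regression trained with gradient descent. Only the classifier's weights are updated. The feature extractor is never changed.

```python
import math


def pretrained_features(image):
    """Hypothetical frozen feature extractor (stand-in for a pretrained CNN).

    Returns two edge responses for a 2x2 image [[a, b], [c, d]]:
    left-to-right brightness change and top-to-bottom brightness change.
    """
    (a, b), (c, d) = image
    return [b + d - a - c, c + d - a - b]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_classifier(features, labels, lr=0.5, epochs=200):
    """Train only the new classifier on top of the fixed features."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for f, y in zip(features, labels):
            p = sigmoid(sum(w * x for w, x in zip(weights, f)) + bias)
            error = y - p
            weights = [w + lr * error * x for w, x in zip(weights, f)]
            bias += lr * error
    return weights, bias


def predict(image, weights, bias):
    f = pretrained_features(image)
    score = sum(w * x for w, x in zip(weights, f)) + bias
    return 1 if sigmoid(score) > 0.5 else 0


# A small new data set: label 1 = bright on the right, 0 = bright at the bottom.
images = [
    [[0, 1], [0, 1]],
    [[0, 0], [1, 1]],
    [[0.1, 0.9], [0, 1]],
    [[0, 0.2], [1, 0.8]],
]
labels = [1, 0, 1, 0]

# Extract features once with the frozen extractor, then fit the classifier.
features = [pretrained_features(img) for img in images]
weights, bias = train_classifier(features, labels)

print(predict([[0, 1], [0.1, 0.9]], weights, bias))
print(predict([[0.1, 0], [0.9, 1]], weights, bias))
```

Because the features are reused rather than learned, the only parameters fit to the new data are the classifier's two weights and one bias, which is why a handful of examples is enough in this toy setting.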
With MATLAB and Neural Network Toolbox, you can train your own convolutional neural network from scratch or use a pretrained model to perform transfer learning.