Pretrained Convolutional Neural Networks

Fine-tuning a pretrained network with transfer learning is typically much faster and easier than training from scratch. You can use previously trained networks for the following purposes.

Classification: Apply pretrained networks directly to classification problems. For an example showing how to use a pretrained network for classification, see Classify an Image Using AlexNet.

Transfer Learning: Take layers from a network trained on a large data set and fine-tune them on a new data set. For an example showing how to use a pretrained network for transfer learning, see Transfer Learning Using AlexNet.

Feature Extraction: Use a pretrained network as a feature extractor, taking the layer activations as features. You can use these activations to train another classifier, such as a support vector machine (SVM). For an example showing how to use a pretrained network for feature extraction, see Feature Extraction Using AlexNet.

Download Pretrained Networks

You can download and install pretrained networks to use for your problems. Use functions such as alexnet to get links to download pretrained networks from the Add-On Explorer. To learn more about finding and installing add-ons, see Get Add-Ons (MATLAB). To see a list of available downloads, see MathWorks Neural Network Toolbox Team.

The pretrained networks have learned rich feature representations for a wide range of natural images. You can apply these learned features to a wide range of image classification problems using transfer learning and feature extraction. The pretrained models are trained on more than a million images and can classify images into 1000 object categories, such as keyboard, coffee mug, pencil, and many animals. The training images are a subset of the ImageNet database [1], which is used in ImageNet Large-Scale Visual Recognition Challenge (ILSVRC) [2].

AlexNet

Use the alexnet function to get a link to download a pretrained AlexNet model.

AlexNet won ILSVRC 2012, achieving the highest classification performance. AlexNet has 8 layers with learnable weights: 5 convolutional layers and 3 fully connected layers. AlexNet is fast to retrain and fast at classifying new images, but it is also large and not as accurate on the original ILSVRC data set as other, newer pretrained models. For more information, see alexnet.
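
Classifying a single image with AlexNet takes only a few lines of code. A minimal sketch, assuming the AlexNet support package and Image Processing Toolbox are installed, and using the peppers.png image that ships with MATLAB:

    % Load the pretrained network (if the support package is not
    % installed, calling alexnet provides a download link).
    net = alexnet;

    % Read an image and resize it to the network input size (227-by-227-by-3).
    I = imread('peppers.png');
    inputSize = net.Layers(1).InputSize;
    I = imresize(I, inputSize(1:2));

    % Classify the image and display the predicted label.
    label = classify(net, I)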

VGG-16 and VGG-19

Use the vgg16 and vgg19 functions to get links to download pretrained VGG models.

VGG networks were among the winners of ILSVRC 2014. VGG-16 has 16 layers with learnable weights: 13 convolutional layers and 3 fully connected layers. VGG-19 has 19 layers with learnable weights: 16 convolutional layers and 3 fully connected layers. In both networks, all convolutional layers have filters of size 3-by-3. VGG networks are larger and typically slower than other pretrained networks in Neural Network Toolbox™. For more information, see vgg16 and vgg19.
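
As a quick check after installation, you can load a VGG network and inspect its input layer. A minimal sketch, assuming the VGG-16 support package is installed:

    % Load the pretrained network.
    net = vgg16;

    % The input layer shows the required image size (224-by-224-by-3).
    net.Layers(1)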

GoogLeNet

Use the googlenet function to get a link to download a pretrained GoogLeNet model.

GoogLeNet was among the winners of ILSVRC 2014. GoogLeNet is smaller and typically faster than VGG networks, and smaller and more accurate than AlexNet on the original ILSVRC data set. GoogLeNet is 22 layers deep. Like Inception-v3 and ResNets, GoogLeNet has a directed acyclic graph structure. To extract the layers and architecture of the network for further processing, use layerGraph. For a transfer learning example using GoogLeNet, see Transfer Learning Using GoogLeNet. For more information, see googlenet.
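
Because GoogLeNet has a directed acyclic graph structure, you work with it as a layer graph rather than a plain layer array. A minimal sketch, assuming the GoogLeNet support package is installed:

    % Load the pretrained network and extract its layer graph.
    net = googlenet;
    lgraph = layerGraph(net);

    % Visualize the network architecture.
    figure
    plot(lgraph)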

Inception-v3

To download and install a pretrained Inception-v3 network, use the Add-On Explorer. To learn more about finding and installing add-ons, see Get Add-Ons (MATLAB). You can also download the model from MathWorks Neural Network Toolbox Team. After you have installed the add-on, use the inceptionv3 function to load the network.

Inception-v3 is a development of the GoogLeNet architecture. Compared to GoogLeNet, Inception-v3 is larger, deeper, typically slower, but more accurate on the original ILSVRC data set. Inception-v3 is 48 layers deep. To extract the layers and architecture of the network for further processing, use layerGraph. To retrain the network on a new classification task, follow the steps of Transfer Learning Using GoogLeNet. Load the Inception-v3 model instead of GoogLeNet and change the names of the layers that you remove and connect to match the names of the Inception-v3 layers: remove the 'predictions', 'predictions_softmax', and 'ClassificationLayer_predictions' layers, and connect to the 'avg_pool' layer. For more information, see inceptionv3.
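
In code, the layer replacement described above looks like the following. A minimal sketch, assuming the Inception-v3 support package is installed; the number of classes is a placeholder for your own data:

    % Load the pretrained network and extract its layer graph.
    net = inceptionv3;
    lgraph = layerGraph(net);

    % Remove the three ImageNet-specific output layers.
    lgraph = removeLayers(lgraph, {'predictions', ...
        'predictions_softmax','ClassificationLayer_predictions'});

    % Add new output layers sized for the new task.
    numClasses = 5;  % placeholder: the number of classes in your data
    newLayers = [
        fullyConnectedLayer(numClasses,'Name','new_fc')
        softmaxLayer('Name','new_softmax')
        classificationLayer('Name','new_classoutput')];
    lgraph = addLayers(lgraph, newLayers);

    % Reconnect the graph at the 'avg_pool' layer.
    lgraph = connectLayers(lgraph,'avg_pool','new_fc');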

ResNet-50 and ResNet-101

To download and install pretrained ResNet-50 and ResNet-101 networks, use the Add-On Explorer. To learn more about finding and installing add-ons, see Get Add-Ons (MATLAB). You can also download the models from MathWorks Neural Network Toolbox Team. After you have installed the add-ons, use the resnet50 and resnet101 functions to load the networks.

The residual connections of ResNets enable training of very deep networks. As the names suggest, ResNet-50 is 50 layers deep and ResNet-101 is 101 layers deep. To retrain a network on a new classification task, follow the steps of Transfer Learning Using GoogLeNet. Load a ResNet network instead of GoogLeNet and change the names of the layers at the end of the network that you remove and connect to match the names of the ResNet layers. To extract the layers and architecture of the network for further processing, use layerGraph. For more information, see resnet50 and resnet101.
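
To find the layer names to remove and reconnect, inspect the end of the layer graph. A minimal sketch, assuming the ResNet-50 support package is installed:

    % Load the pretrained network and extract its layer graph.
    net = resnet50;
    lgraph = layerGraph(net);

    % The last few layers are the ones to replace for a new task.
    lgraph.Layers(end-3:end)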

importCaffeNetwork

You can import pretrained networks from Caffe [3] using the importCaffeNetwork function.

There are many pretrained networks available in Caffe Model Zoo [4]. Locate and download the desired .prototxt and .caffemodel files and use importCaffeNetwork to import the pretrained network into MATLAB®. For more information, see importCaffeNetwork.
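
A minimal sketch of the import step; the file names are placeholders for the .prototxt and .caffemodel files that you download:

    % Placeholder file names for a model from Caffe Model Zoo.
    protofile = 'deploy.prototxt';
    datafile  = 'model.caffemodel';

    % Import the pretrained network into MATLAB.
    net = importCaffeNetwork(protofile, datafile);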

importCaffeLayers

You can import network architectures from Caffe using importCaffeLayers.

You can import the network architectures of Caffe networks, without importing the pretrained network weights. Locate and download the desired .prototxt file and use importCaffeLayers to import the network layers into MATLAB. For more information, see importCaffeLayers.
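
A minimal sketch; the file name is a placeholder for the .prototxt file that you download:

    % Import only the network architecture, without the pretrained weights.
    layers = importCaffeLayers('deploy.prototxt');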

importKerasNetwork

You can import pretrained networks from Keras [5] using importKerasNetwork.

You can import the network and weights either from the same HDF5 (.h5) file or separate HDF5 and JSON (.json) files. For more information, see importKerasNetwork.
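
A minimal sketch covering both cases; the file names are placeholders:

    % Network and weights stored together in one HDF5 file.
    net = importKerasNetwork('model.h5');

    % Architecture in a JSON file, weights in a separate HDF5 file.
    net = importKerasNetwork('model.json','WeightFile','weights.h5');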

importKerasLayers

You can import network architectures from Keras using importKerasLayers.

You can import the network architecture of Keras networks, either with or without weights. You can import the network architecture and weights either from the same HDF5 (.h5) file or separate HDF5 and JSON (.json) files. For more information, see importKerasLayers.
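
A minimal sketch covering both cases; the file names are placeholders:

    % Import the architecture only.
    layers = importKerasLayers('model.json');

    % Import the architecture together with the weights.
    layers = importKerasLayers('model.h5','ImportWeights',true);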

Transfer Learning

Transfer learning is commonly used in deep learning applications. You can take a pretrained network and use it as a starting point to learn a new task. Fine-tuning a network with transfer learning is much faster and easier than constructing and training a new network. You can quickly transfer learning to a new task using a smaller number of training images. The advantage of transfer learning is that the pretrained network has already learned a rich set of features that can be applied to a wide range of similar tasks. For example, you can take a network trained on millions of images and retrain it for new object classification using only hundreds of images. If you have a very large data set, then transfer learning might not be faster than training from scratch. For examples showing how to perform transfer learning, see Transfer Learning Using AlexNet and Transfer Learning Using GoogLeNet.
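
A minimal end-to-end sketch of this workflow with AlexNet, assuming the support package is installed; the image folder, number of classes, and training options are placeholders to adapt to your own data:

    % Load training images from a folder with one subfolder per class.
    % The images must match the network input size (227-by-227-by-3 for AlexNet).
    imds = imageDatastore('myTrainingImages', ...   % placeholder folder
        'IncludeSubfolders',true,'LabelSource','foldernames');
    numClasses = numel(categories(imds.Labels));

    % Keep every layer except the last three, which are ImageNet-specific.
    net = alexnet;
    layersTransfer = net.Layers(1:end-3);
    layers = [
        layersTransfer
        fullyConnectedLayer(numClasses, ...
            'WeightLearnRateFactor',20,'BiasLearnRateFactor',20)
        softmaxLayer
        classificationLayer];

    % Fine-tune with a low learning rate so the transferred weights change slowly.
    options = trainingOptions('sgdm', ...
        'MiniBatchSize',10,'MaxEpochs',4,'InitialLearnRate',1e-4);
    netTransfer = trainNetwork(imds, layers, options);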

Feature Extraction

Feature extraction is an easy way to use the power of pretrained networks without investing time and effort into training. Feature extraction can be the fastest way to use deep learning: you extract learned features from a pretrained network and use those features to train a classifier, such as a support vector machine using fitcsvm (Statistics and Machine Learning Toolbox™). For example, if an SVM achieves greater than 90% accuracy on your training and validation sets, then the additional accuracy from fine-tuning might not be worth the effort. If you fine-tune on a small data set, you also risk overfitting to the training data. If the SVM cannot achieve good enough accuracy for your application, then fine-tuning is worth the effort to seek higher accuracy. For an example showing how to use a pretrained network for feature extraction, see Feature Extraction Using AlexNet.
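
A minimal sketch of this workflow with AlexNet, assuming the AlexNet support package and Statistics and Machine Learning Toolbox are installed; the image folders are placeholders for your own data:

    % Placeholder datastores with one subfolder per class. The images must
    % match the network input size (227-by-227-by-3 for AlexNet).
    imdsTrain = imageDatastore('trainImages', ...
        'IncludeSubfolders',true,'LabelSource','foldernames');
    imdsTest = imageDatastore('testImages', ...
        'IncludeSubfolders',true,'LabelSource','foldernames');

    % Extract features from a deep layer; 'fc7' is a common choice for AlexNet.
    net = alexnet;
    layer = 'fc7';
    featuresTrain = activations(net, imdsTrain, layer, 'OutputAs','rows');
    featuresTest  = activations(net, imdsTest,  layer, 'OutputAs','rows');

    % Train an SVM on the extracted features. fitcsvm handles two classes;
    % for a multiclass problem, use fitcecoc instead.
    classifier = fitcsvm(featuresTrain, imdsTrain.Labels);

    % Evaluate on the held-out set.
    predictedLabels = predict(classifier, featuresTest);
    accuracy = mean(predictedLabels == imdsTest.Labels)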

References

[1] ImageNet. http://www.image-net.org

[2] Russakovsky, O., Deng, J., Su, H., et al. “ImageNet Large Scale Visual Recognition Challenge.” International Journal of Computer Vision (IJCV). Vol. 115, Issue 3, 2015, pp. 211–252.

[3] Caffe. http://caffe.berkeleyvision.org/

[4] Caffe Model Zoo. http://caffe.berkeleyvision.org/model_zoo.html

[5] Keras. https://keras.io
