Transfer Learning

Deep learning approach to train new models faster by using pretrained models

Transfer learning is a deep learning approach in which a model that has been trained for one task is used as a starting point to train a model for a similar task. Fine-tuning a network with transfer learning is usually much faster and easier than training a network from scratch. The approach is commonly used for object detection, image recognition, speech recognition, and other applications.

Transfer learning is a popular technique because:

  • It enables you to train models using relatively little labeled data by leveraging popular models that have already been trained on large datasets.
  • It can dramatically reduce training time and compute resources. With transfer learning, the model does not need to be trained for as many epochs as a new model would require (an epoch is one full training cycle on the entire dataset).

Training from Scratch vs. Transfer Learning

The two commonly used approaches for deep learning are training a model from scratch and transfer learning. Both approaches have benefits and can be used for different deep learning tasks.

Developing and training a model from scratch works better for very specific tasks for which preexisting models cannot be used. The downside is that this approach typically requires a large amount of data to produce accurate results. It is a reasonable choice, for example, when text needs to be verified and a large number of data samples are available. If you don't have access to a pretrained model for a text analysis task, developing a model from scratch is recommended.

Transfer learning is useful for tasks such as object recognition, for which a variety of popular pretrained models, such as AlexNet and GoogLeNet, can be used as a starting point. For example, if you have a botany project where flowers need to be classified and limited data is available, you can transfer weights and layers from an AlexNet model, which classifies images into 1000 different categories, and replace the final classification layer.
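The idea of transferring layers and replacing the final classification layer can be illustrated with a minimal sketch. It uses plain Python and a toy list-of-dicts network, not AlexNet's real architecture or any particular framework's API. The helper names (make_dense, replace_classifier), the layer sizes, and the five flower classes are assumptions made for illustration only.

```python
import copy
import random

def make_dense(n_in, n_out, rng):
    """Create a toy dense layer: a weight matrix, a bias vector, and a trainable flag."""
    return {
        "weights": [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)],
        "bias": [0.0] * n_out,
        "trainable": True,
    }

def replace_classifier(pretrained_layers, num_new_classes, rng):
    """Copy every layer except the last, freeze the copies, and append a new head."""
    transferred = copy.deepcopy(pretrained_layers[:-1])
    for layer in transferred:
        layer["trainable"] = False  # transferred weights stay fixed
    # The new head must accept the same number of features the old head did.
    n_features = len(pretrained_layers[-1]["weights"][0])
    new_head = make_dense(n_features, num_new_classes, rng)
    return transferred + [new_head]

rng = random.Random(0)

# Hypothetical stand-in for a pretrained network: 8 inputs -> 16 -> 16 -> 1000 classes.
pretrained = [make_dense(8, 16, rng), make_dense(16, 16, rng), make_dense(16, 1000, rng)]

# Assumed new task: 5 flower categories instead of 1000 generic ones.
flower_net = replace_classifier(pretrained, num_new_classes=5, rng=rng)

print(len(flower_net[-1]["weights"]))  # number of output classes in the new head
```

Only the new head is marked trainable here, which is the most conservative choice when little data is available. With more data, some of the transferred layers could be unfrozen as well.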

Comparison of training a model from scratch and transfer learning.

The graph below shows the network performance for models with transfer learning and models trained from scratch. With transfer learning, it is possible to achieve a higher model accuracy in a shorter time.

Network performance of training from scratch and transfer learning.

How is transfer learning implemented?

The approach generally follows these steps:

  1. Load a pretrained network. Select a relevant network that has been trained for a task similar to the new task.
  2. Replace the classification layers for the new task. You may also choose to fine-tune the weights depending on the new task and data available. Generally, the more data you have, the more layers you can choose to fine-tune. With less data, fine-tuning may lead to an overfitted model.
  3. Train the network on the data for the new task.
  4. Test the accuracy of the new network.

Transfer learning workflow.
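The four steps above can be sketched end to end in a few lines of plain Python. This is a toy illustration under stated assumptions, not a real pretrained network: load_pretrained_features is a hypothetical stand-in that returns a fixed feature extractor, the task (deciding whether a 2-D point lies outside a circle) is invented for the example, and only the new classification head is trained while the transferred features stay frozen.

```python
import math
import random

def load_pretrained_features():
    """Step 1: hypothetical stand-in for loading a pretrained network.
    Returns a frozen feature extractor; nothing below updates it."""
    def features(x, y):
        return [x * x, y * y]
    return features

def predict(head, feats):
    """New classification head: logistic regression on the frozen features."""
    z = head["bias"] + sum(w * f for w, f in zip(head["weights"], feats))
    return 1.0 / (1.0 + math.exp(-z))

def make_data(n, rng):
    """Assumed toy task: label 1 if the point lies outside a circle of radius 0.7."""
    points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    return [((x, y), 1 if x * x + y * y > 0.49 else 0) for x, y in points]

rng = random.Random(0)
features = load_pretrained_features()
train, test = make_data(200, rng), make_data(200, rng)

# Step 2: replace the classifier with a freshly initialized head.
head = {"weights": [0.0, 0.0], "bias": 0.0}

# Step 3: train only the head with stochastic gradient descent on the log loss.
learning_rate = 0.5
for epoch in range(50):
    for (x, y), label in train:
        feats = features(x, y)
        error = predict(head, feats) - label  # gradient of the log loss w.r.t. z
        head["weights"] = [w - learning_rate * error * f
                           for w, f in zip(head["weights"], feats)]
        head["bias"] -= learning_rate * error

# Step 4: test the accuracy on held-out data.
correct = sum((predict(head, features(x, y)) > 0.5) == (label == 1)
              for (x, y), label in test)
accuracy = correct / len(test)
print(f"test accuracy: {accuracy:.2f}")
```

Because the frozen features already make the two classes linearly separable, the small head can be trained on only 200 samples in a few epochs. That mirrors, in miniature, why transfer learning tends to need less data and training time than starting from scratch.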

See also: deep learning, convolutional neural networks, GPU Coder