Can we train only the classification layer when doing transfer learning with a pre-trained network?

Hi,
The question is: can we train only the classification layer when doing transfer learning with a pre-trained network? I want to speed up my training by keeping the feature extraction layers (the base model) as they are and only replacing and retraining the classification layers.
The equivalent in Keras (Python) is: base_model.trainable = False
If this is possible in MATLAB, please let me know how. Your help is appreciated.
Cheers
Sud

2 Comments

See if it is now possible to assign different learning rates to the different layers. I wasn't able to some time ago.
Greg
What we can do is set the InitialLearnRate to a very small number (e.g., 1e-4) and set the WeightLearnRateFactor and BiasLearnRateFactor of the FCNN before the last Classification Layer to a large number (e.g., 20), as demonstrated here: https://uk.mathworks.com/help/deeplearning/gs/get-started-with-transfer-learning.html . This seems to approximate the approach you suggested, Greg.
However, I was thinking of something more along the lines of what is described in https://uk.mathworks.com/help/deeplearning/ug/extract-image-features-using-pretrained-network.html. In this approach, I can record the activations of the last layer before the FCNN (the bottleneck features) and use them to train a classifier. But for some reason, I cannot use these features to train a network containing {SequenceInputLayer+FCNN+ClassificationLayer}. Odd.
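A minimal sketch of that learning-rate setup, using the values mentioned above (1e-4 and 20). The class count, the layer name 'new_fc', and the choice of the 'sgdm' solver are assumptions made for illustration only.

```matlab
% Assumption: numClasses is a placeholder for the number of classes in your data.
numClasses = 5;

% New fully connected layer whose weights and biases learn 20x faster
% than the global rate (effective rate: 20 * 1e-4 = 2e-3).
newFc = fullyConnectedLayer(numClasses, ...
    'Name', 'new_fc', ...                 % hypothetical layer name
    'WeightLearnRateFactor', 20, ...
    'BiasLearnRateFactor', 20);

% Small global learning rate, so the transferred layers change only slowly.
options = trainingOptions('sgdm', ...
    'InitialLearnRate', 1e-4);
```

Note that the transferred layers are still updated here, just slowly; this slows their learning rather than freezing them.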
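The feature-extraction route can be sketched roughly as follows, along the lines of the linked documentation page. The choice of resnet18 (which needs its support package), the MerchData sample images, the 'pool5' layer, and the SVM classifier from Statistics and Machine Learning Toolbox are all assumptions for illustration; substitute your own network, data, and layer name.

```matlab
% Assumption: MerchData.zip is the sample image set shipped with Deep Learning Toolbox.
unzip('MerchData.zip');
imds = imageDatastore('MerchData', ...
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');
[imdsTrain, imdsTest] = splitEachLabel(imds, 0.7, 'randomized');

% Assumption: resnet18 is installed; any pretrained network would do.
net = resnet18;
inputSize = net.Layers(1).InputSize;

% Resize images on the fly to the network input size.
augTrain = augmentedImageDatastore(inputSize(1:2), imdsTrain);
augTest  = augmentedImageDatastore(inputSize(1:2), imdsTest);

% Forward pass only: one row of bottleneck features per image.
% 'pool5' is the global pooling layer in resnet18; the name differs in other networks.
featTrain = activations(net, augTrain, 'pool5', 'OutputAs', 'rows');
featTest  = activations(net, augTest,  'pool5', 'OutputAs', 'rows');

% Train a multiclass SVM on the extracted features and check test accuracy.
classifier = fitcecoc(featTrain, imdsTrain.Labels);
accuracy = mean(predict(classifier, featTest) == imdsTest.Labels);
```

Because the base network is only run forward once per image, no gradients are computed for it at all, which is where the speed-up of this approach comes from.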

 Accepted Answer

To freeze the weights of a particular layer of your network, set its WeightLearnRateFactor and BiasLearnRateFactor properties to zero. Refer to the "Learn Rate and Regularization" sections of the fullyConnectedLayer, convolution2dLayer, and lstmLayer documentation.
layer.WeightLearnRateFactor = 0;  % stop updating the weights
layer.BiasLearnRateFactor = 0;    % stop updating the biases
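To apply this to every layer except the new classification head, one option is to loop over the layer array and zero whichever learn-rate factors each layer has. The sketch below uses a small made-up layer array and a hypothetical layer name 'fc_new' so that it is self-contained; with a pretrained series network you would start from its Layers property instead, and a DAG network would need its layer graph handled separately.

```matlab
% Assumption: a toy layer array standing in for a pretrained network's layers.
layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(3, 8, 'Padding', 'same', 'Name', 'conv_1')
    reluLayer
    fullyConnectedLayer(10, 'Name', 'fc_new')   % hypothetical new head
    softmaxLayer
    classificationLayer];

% Learn-rate factor properties to zero wherever a layer has them.
props = {'WeightLearnRateFactor', 'BiasLearnRateFactor'};

for i = 1:numel(layers)
    if strcmp(layers(i).Name, 'fc_new')
        continue;   % leave the new head trainable
    end
    for k = 1:numel(props)
        if isprop(layers(i), props{k})
            layers(i).(props{k}) = 0;   % freeze this parameter
        end
    end
end
```

Layers with other learnable parameters (for example the scale and offset of batch normalization layers) expose their own learn-rate factor properties, which can be added to props in the same way.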

2 Comments

Thanks for the answer and providing useful links. They help a lot.
Hi,
OK, it works, but the training still seems relatively slow; I expected it to be quicker. With your method, is the gradient still calculated for all layers?

Release

R2020a
