How do I substitute all the activation functions of a neural network?

Gianluca Maguolo on 11 Jul 2020
Commented: Joss Knight on 9 Aug 2020
Hi everyone!
I have an out-of-memory issue when substituting the activation functions in a neural network with other activation functions. This is the code that I use:
while index < totLayers - removedLayers
    if contains(lower(lgraph.Layers(index).Name),'relu')
        name = lgraph.Layers(index).Name;
        conn = lgraph.Connections;
        % find the layers connected before and after this activation
        for i = 1:size(conn,1)
            if strcmp(conn.Source{i},name)
                out = conn.Destination{i};
            elseif strcmp(conn.Destination{i},name)
                in = conn.Source{i};
            end
        end
        channels = findChannels(lgraph,in);
        % create the new activation layers and splice them in
        newActivationLayers = createActivationLayers(newActivations,channels,index+removedLayers,relativeLearnRate,maxInput);
        lgraph = removeLayers(lgraph,name);
        lgraph = addLayers(lgraph,newActivationLayers);
        lgraph = connectLayers(lgraph,in,newActivationLayers(1).Name);
        lgraph = connectLayers(lgraph,newActivationLayers(end).Name,out);
        removedLayers = removedLayers + length(newActivationLayers);
    end
    index = index + 1;
end
findChannels and createActivationLayers are my own helper functions; they only build the new layers to be inserted at that specific point in the network.
The code seems to work: when I plot lgraph, the output is correct. However, the GPU runs out of memory at training time. I tried to debug my code by substituting every activation in the network with itself (i.e. leaving lgraph unchanged), and a network that I was previously able to train on my GPU now gives an out-of-memory error after being processed by my code.
The only difference that I can see is that the order of the layers in lgraph.Layers is different from the original: all the activation layers are at the end. However, the graph is correct, and I would be surprised if this were the problem.
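For reference, Deep Learning Toolbox (R2018b and later) provides replaceLayer, which swaps a named layer for one or more new layers and reconnects the graph automatically, avoiding the manual Connections bookkeeping above. A minimal sketch, using leakyReluLayer purely as an example substitute:

```matlab
% Sketch: replace every ReLU in a layerGraph with a different activation.
% leakyReluLayer is only an illustrative stand-in for the real replacement.
layers = lgraph.Layers;
for k = 1:numel(layers)
    if isa(layers(k), 'nnet.cnn.layer.ReLULayer')
        name = layers(k).Name;
        % replaceLayer rewires incoming/outgoing connections for us
        lgraph = replaceLayer(lgraph, name, leakyReluLayer('Name', name));
    end
end
```

Because replaceLayer preserves the position of the layer in the graph, it also avoids the reordering of lgraph.Layers described above.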
Does anyone know why I have this issue?

Answers (2)

Srivardhan Gadila on 18 Jul 2020
I would suggest you try training the network on the CPU; it may be that GPU memory is simply not sufficient for training the new network.
Refer to the 'ExecutionEnvironment' name-value pair argument in the Hardware Options of trainingOptions and set it to 'cpu'.
If you are able to train the network on the CPU successfully, then try reducing the mini-batch size when training on the GPU.
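As a concrete illustration of both suggestions (XTrain/YTrain are placeholder training data, and 'sgdm' is just one possible solver):

```matlab
% Train on the CPU to rule out GPU memory limits
options = trainingOptions('sgdm', ...
    'ExecutionEnvironment', 'cpu');
net = trainNetwork(XTrain, YTrain, lgraph, options);

% If CPU training works, retry on the GPU with a smaller mini-batch
options = trainingOptions('sgdm', ...
    'ExecutionEnvironment', 'gpu', ...
    'MiniBatchSize', 32);   % default is 128
net = trainNetwork(XTrain, YTrain, lgraph, options);
```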


Joss Knight on 4 Aug 2020
It looks as though you've replaced every relu layer with multiple other layers. This makes your network deeper, and the deeper the network, the more memory you need for training: backpropagation has to hold onto the activations from every layer. In addition, we can only guess at the memory requirements of your extra layers, since you don't say what they are.
I wonder what your extra layers are and why you need more than one new layer to replace something as simple as a relu activation.
Joss Knight on 9 Aug 2020
Did you delete the first network before training the second network? Try calling reset(gpuDevice) before training the modified network.
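Something along these lines, run between the two training sessions (XTrain/YTrain/options are placeholders for your own data and training options):

```matlab
% Clear variables holding the first network, then reset the GPU so that
% MATLAB releases its memory. Note reset invalidates any existing
% gpuArray data, so do this before creating new GPU arrays.
clear net
g = gpuDevice;   % current GPU
reset(g);
net2 = trainNetwork(XTrain, YTrain, lgraph, options);
```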
