How do I substitute all the activation functions of a neural network?

Gianluca Maguolo on 11 Jul 2020
Commented: Joss Knight on 9 Aug 2020
Hi everyone!
I have an out-of-memory issue when substituting the activation functions in a neural network with other activation functions. This is the code that I use:
while index < totLayers - removedLayers
    if contains(lower(lgraph.Layers(index).Name),'relu')
        name = lgraph.Layers(index).Name;
        conn = lgraph.Connections;
        % find the layers connected before and after this activation
        for i = 1:size(conn,1)
            if strcmp(conn.Source{i},name)
                out = conn.Destination{i};
            elseif strcmp(conn.Destination{i},name)
                in = conn.Source{i};
            end
        end
        channels = findChannels(lgraph,in);
        % create the new activation layers and splice them in
        newActivationLayers = createActivationLayers(newActivations,channels,index+removedLayers,relativeLearnRate,maxInput);
        lgraph = removeLayers(lgraph,name);
        lgraph = addLayers(lgraph,newActivationLayers);
        lgraph = connectLayers(lgraph,in,newActivationLayers(1).Name);
        lgraph = connectLayers(lgraph,newActivationLayers(end).Name,out);
        removedLayers = removedLayers + length(newActivationLayers);
    end
    index = index + 1;
end
findChannels and createActivationLayers are my own helper functions; they only build the new layers to be inserted at that specific point in the network.
The code seems to work: when I plot lgraph, the output is correct. However, the GPU runs out of memory at training time. I tried to debug my code by substituting every activation in the network with itself (i.e. leaving lgraph unchanged), and a network that I was previously able to train on my GPU now gives an out-of-memory error after being processed by my code.
The only difference that I can see is that the order of the layers in lgraph.Layers is different from the original: all the activation layers are at the end. However, the graph is correct, and I would be surprised if this were the problem.
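For reference, Deep Learning Toolbox (R2018b and later) provides replaceLayer, which swaps a named layer for one or more new layers and reconnects the graph automatically, avoiding the manual Connections bookkeeping above. A minimal sketch, using leakyReluLayer purely as an example substitute:

```matlab
% Sketch: replace every ReLU in a layerGraph with a different activation.
% leakyReluLayer is only an illustrative stand-in for the real replacement.
layers = lgraph.Layers;
for k = 1:numel(layers)
    if isa(layers(k), 'nnet.cnn.layer.ReLULayer')
        name = layers(k).Name;
        % replaceLayer rewires incoming/outgoing connections for us
        lgraph = replaceLayer(lgraph, name, leakyReluLayer('Name', name));
    end
end
```

Because replaceLayer preserves the position of the layer in the graph, it also avoids the reordering of lgraph.Layers described above.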
Does anyone know why I have this issue?

Answers (2)

Srivardhan Gadila on 18 Jul 2020
I would suggest you try training the network on the CPU; it may be that GPU memory is simply not sufficient for training the new network.
Refer to the 'ExecutionEnvironment' name-value pair argument in the Hardware Options of trainingOptions and set it to 'cpu'.
If you are able to train the network on the CPU successfully, then try reducing the mini-batch size when training on the GPU.
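As a concrete illustration of both suggestions (XTrain/YTrain are placeholder training data, and 'sgdm' is just one possible solver):

```matlab
% Train on the CPU to rule out GPU memory limits
options = trainingOptions('sgdm', ...
    'ExecutionEnvironment', 'cpu');
net = trainNetwork(XTrain, YTrain, lgraph, options);

% If CPU training works, retry on the GPU with a smaller mini-batch
options = trainingOptions('sgdm', ...
    'ExecutionEnvironment', 'gpu', ...
    'MiniBatchSize', 32);   % default is 128
net = trainNetwork(XTrain, YTrain, lgraph, options);
```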


Joss Knight on 4 Aug 2020
It looks as though you've replaced every relu layer with multiple other layers. This makes your network deeper, and the deeper the network, the more memory you need for training: backpropagation has to hold onto the activations from every layer. In addition, we can only guess at the memory requirements of your extra layers, since you don't say what they are.
I wonder what your extra layers are and why you need more than one new layer to replace something as simple as a relu activation.
Joss Knight on 9 Aug 2020
Did you delete the first network before training the second network? Try calling reset(gpuDevice) before training the modified network.
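Something along these lines, run between the two training sessions (XTrain/YTrain/options are placeholders for your own data and training options):

```matlab
% Clear variables holding the first network, then reset the GPU so that
% MATLAB releases its memory. Note reset invalidates any existing
% gpuArray data, so do this before creating new GPU arrays.
clear net
g = gpuDevice;   % current GPU
reset(g);
net2 = trainNetwork(XTrain, YTrain, lgraph, options);
```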
