How can I use a sigmoid output layer in an LSTM or CNN network?

I can use a sigmoid transfer function in a deep neural network by setting net(i).transferfunc = logsig, but I cannot find a sigmoid layer in the CNN or LSTM documentation. I can only find fullyConnectedLayer and regressionLayer, but those give a linear output, not a nonlinear one like tanh. My targets lie between 0 and 1, and I want to use a CNN to estimate them. What should I do? Thanks!

Answers (1)

Abdallah Derbalah on 28 Jun 2019
Edited: Abdallah Derbalah on 28 Jun 2019
A sigmoid layer is not a standard deep learning layer (up to R2019a). However, you can write your own custom layer:
classdef sigmoidLayer < nnet.layer.Layer
    methods
        function layer = sigmoidLayer(name)
            % Set layer name
            if nargin == 2
                layer.Name = name;
            end
            % Set layer description
            layer.Description = 'sigmoidLayer';
        end
        function Z = predict(layer,X)
            % Forward input data through the layer and output the result
            Z = exp(X)./(exp(X)+1);
        end
        function dLdX = backward(layer, X, Z, dLdZ, memory)
            % Backward propagate the derivative of the loss function
            % through the layer
            dLdX = Z.*(1-Z) .* dLdZ;
        end
    end
end
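As a minimal usage sketch (the input size and surrounding layers below are illustrative placeholders, not from the question), the custom layer can sit between fullyConnectedLayer and regressionLayer so that the network output is squashed into (0,1):
% Sketch only: assumed 28x28x1 image input and arbitrary filter choices.
layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(3,16,'Padding','same')
    reluLayer
    fullyConnectedLayer(1)
    sigmoidLayer('sig')      % custom layer defined above
    regressionLayer];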
  2 Comments
Theron FARRELL on 21 Nov 2019
Edited: Theron FARRELL on 22 Nov 2019
Can such a custom sigmoid layer, or any custom layer in general, be put into a custom dlnet and used with dlgradient?
Furthermore, as I study MATLAB's official GAN example at https://www.mathworks.com/help/deeplearning/examples/train-generative-adversarial-network.html, the ganLoss(...) function in fact applies a sigmoid to the discriminator output, and the loss is calculated AFTER it:
function [lossGenerator, lossDiscriminator] = ganLoss(dlYPred, dlYPredGenerated)
    % Calculate losses for the discriminator network.
    lossGenerated = -mean(log(1-sigmoid(dlYPredGenerated)));
    lossReal = -mean(log(sigmoid(dlYPred)));
    % Combine the losses for the discriminator network.
    lossDiscriminator = lossReal + lossGenerated;
    % Calculate the loss for the generator network.
    lossGenerator = -mean(log(sigmoid(dlYPredGenerated)));
end
And yet when dlgradient(...) is called later, it appears to start from the last layer of the discriminator and of the generator, respectively, as shown in the example code:
% There is NO Sigmoid layer in either of the dlnet
gradientsGenerator = dlgradient(lossGenerator, dlnetGenerator.Learnables,'RetainData',true);
gradientsDiscriminator = dlgradient(lossDiscriminator, dlnetDiscriminator.Learnables);
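As a minimal sketch (toy data; the function name toyLoss is mine, not from the example), dlgradient does appear to trace through a sigmoid applied inside the loss, so the dlnet itself needs no sigmoid layer:
% Sketch only: toy data, hypothetical helper toyLoss.
function [loss, dLdX] = toyLoss(x)
    loss = -mean(log(sigmoid(x)));   % same form as lossReal in ganLoss
    dLdX = dlgradient(loss, x);      % AD traces through sigmoid automatically
end
% Usage:
%   x = dlarray(randn(5,1));
%   [loss, grad] = dlfeval(@toyLoss, x);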
I am therefore wondering whether, as per the chain rule, the loss should first be passed through the derivative of the sigmoid before it is sent back to the discriminator and the generator, respectively. Specifically,
% Pseudo code
Final_Loss = -mean(log(sigmoid(dlYPred)));
% For one input
Del(Final_Loss)/Del(dlYPred)
  = Del(Final_Loss)/Del(log(sigmoid(dlYPred))) * Del(log(sigmoid(dlYPred)))/Del(dlYPred)
  = -(1/sigmoid(dlYPred)) * sigmoid(dlYPred) * (1-sigmoid(dlYPred))
  = sigmoid(dlYPred)
% So I reckon that the following should be calculated and the last two backpropagated
Loss_G2D = -mean(-sigmoid(dlYPredGen));
Loss_D2D = --mean(1-sigmoid(dlYPredReal));
Loss_D = Loss_D2D + Loss_G2D;
Loss_G = -mean(1-sigmoid(dlYPredGen));
Please do correct me if I am wrong, thanks.
Theron FARRELL on 23 Nov 2019
Edited: Theron FARRELL on 23 Nov 2019
Anyway, in AD's answer above, I think it should be
if nargin == 1 % Not 2
    layer.Name = name;
end
Otherwise, when the layer is combined with other named layers, you will be told that this sigmoid layer is unnamed.
Finally, I am afraid that your implementation of backward() does not work; it fails with the error message
Custom layers with backward functions are not supported.
Instead, if I comment out backward() and change predict() to use the built-in MATLAB sigmoid, it works, and MATLAB performs automatic differentiation for the sigmoid.
classdef sigmoidLayer < nnet.layer.Layer
    methods
        function layer = sigmoidLayer(name)
            % Set layer name
            if nargin == 1
                layer.Name = name;
            end
            % Set layer description
            layer.Description = 'sigmoidLayer';
        end
        function Z = predict(layer,X)
            % Forward input data through the layer and output the result
            Z = sigmoid(X);
        end
        % No need to define a backward function, as MATLAB supports
        % automatic differentiation of sigmoid
        % function dLdX = backward(layer, X, Z, dLdZ, memory)
        %     % Backward propagate the derivative of the loss function
        %     % through the layer
        %     dLdX = Z.*(1-Z) .* dLdZ;
        % end
    end
end
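A quick way to sanity-check the layer (a sketch; the input size below is a made-up placeholder) is checkLayer from the Deep Learning Toolbox:
% Sketch only: adjust validInputSize to match the input this layer will see.
layer = sigmoidLayer('sig');
validInputSize = [28 28 1];
checkLayer(layer, validInputSize, 'ObservationDimension', 4);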
