Define a Custom Classification Output Layer

To construct a classification output layer with cross entropy loss for K mutually exclusive classes, use classificationLayer. If you want to use a different loss function for your classification problem, then you can define a custom classification output layer using this example as a guide. This example shows how to define a custom classification output layer with the sum of squares error (SSE) loss and use it in a convolutional neural network.

To define a custom classification output layer, you can use the template provided in this example, which takes you through the following steps:

  1. Name the layer – Give the layer a name so it can be used in MATLAB®.

  2. Declare the layer properties – Specify the properties of the layer.

  3. Create a constructor function (optional) – Specify how to construct the layer and initialize its properties. If you do not specify a constructor function, then the software initializes the properties with [] at creation.

  4. Create a forward loss function – Specify the loss between the predictions and the training targets.

  5. Create a backward loss function – Specify the derivative of the loss with respect to the predictions.

SSE is an error measure between two continuous random variables. For predictions Y and training targets T, the SSE loss between Y and T is given by

L = \sum_{i=1}^{K} (Y_i - T_i)^2

where K is the number of observations.
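As a quick numeric check of this formula (the vectors here are illustrative only, not part of the layer):

Y = [0.7; 0.2; 0.1];      % predicted scores for K = 3 classes
T = [1; 0; 0];            % one-hot training target
L = sum((Y - T).^2)       % (-0.3)^2 + (0.2)^2 + (0.1)^2 = 0.14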

Classification Output Layer Template

Copy the classification output layer template into a new file in MATLAB. This template outlines the structure of a classification output layer and includes the functions that define the layer behavior.

classdef myClassificationLayer < nnet.layer.ClassificationLayer
        
    properties
        % (Optional) Layer properties

        % Layer properties go here
    end
 
    methods
        function layer = myClassificationLayer()           
            % (Optional) Create a myClassificationLayer

            % Layer constructor function goes here
        end

        function loss = forwardLoss(layer, Y, T)
            % Return the loss between the predictions Y and the 
            % training targets T
            %
            % Inputs:
            %         layer - Output layer
            %         Y     - Predictions made by network
            %         T     - Training targets
            %
            % Output:
            %         loss  - Loss between Y and T

            % Layer forward loss function goes here
        end
        
        function dLdY = backwardLoss(layer, Y, T)
            % Backward propagate the derivative of the loss function
            %
            % Inputs:
            %         layer - Output layer
            %         Y     - Predictions made by network
            %         T     - Training targets
            %
            % Output:
            %         dLdY  - Derivative of the loss with respect to the predictions Y

            % Layer backward loss function goes here
        end
    end
end

Name the Layer

First, give the layer a name. In the first line of the class file, replace the existing name myClassificationLayer with exampleClassificationSSELayer.

classdef exampleClassificationSSELayer < nnet.layer.ClassificationLayer
    ...
end

Next, rename the myClassificationLayer constructor function (the first function in the methods section) to have the same name, and update the header comment.

    methods
        function layer = exampleClassificationSSELayer()           
            % Create an exampleClassificationSSELayer

            % Layer constructor function goes here
        end

        ...
     end

Save the Layer

Save the layer class file in a new file named exampleClassificationSSELayer.m. The file name must match the layer name. To use the layer, you must save the file in the current folder or in a folder on the MATLAB path.

Declare Layer Properties

Declare the layer properties in the properties section.

By default, user-defined layers have three properties:

  • Name – Name of the layer, specified as a character vector. Use the Name property to identify and index layers in a network. If you do not set the layer name, then the software automatically assigns one at training time.

  • Description – One-line description of the layer, specified as a character vector. This description appears when the layer is displayed in a Layer array. The default value is the layer class name.

  • Type – Type of the layer, specified as a character vector. The value of Type appears when the layer is displayed in a Layer array. The default value is the layer class name.

If the layer has no other properties, then you can omit the properties section.

In this example, the layer does not require any additional properties, so you can remove the properties section.

Create Constructor Function

Create the function that constructs the layer and initializes the layer properties. Specify any variables required to create the layer as inputs to the constructor function.

Specify an optional input argument name to assign to the Name property at creation.

function layer = exampleClassificationSSELayer(name)
    % Create an exampleClassificationSSELayer

    % Layer constructor function goes here
end

Initialize Layer Properties

Replace the comment % Layer constructor function goes here with code that initializes the layer properties.

Give the layer a one-line description by setting the Description property of the layer. Set the Name property to the optional input argument name.

        function layer = exampleClassificationSSELayer(name)
            % Create an exampleClassificationSSELayer
    
            % Set layer name
            if nargin == 1
                layer.Name = name;
            end

            % Set layer description
            layer.Description = 'Example classification layer with SSE loss';
        end
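For example, once the class file is saved, you can construct a named layer at the command line (the name 'sse' here is just an example):

layer = exampleClassificationSSELayer('sse');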

Create Forward Loss Function

Create a function named forwardLoss that returns the SSE loss between the predictions made by the network and the training targets. The syntax for forwardLoss is loss = forwardLoss(layer, Y, T), where Y is the output of the previous layer and T represents the training targets.

For classification problems, the dimensions of T depend on the type of problem:

  • Image classification – 4-D array of size 1-by-1-by-K-by-N, where K is the number of classes and N is the mini-batch size.

  • Sequence-to-label classification – Matrix of size K-by-N, where K is the number of classes and N is the mini-batch size.

  • Sequence-to-sequence classification – 3-D array of size K-by-N-by-S, where K is the number of classes, N is the mini-batch size, and S is the sequence length.

The size of Y depends on the output of the previous layer. To ensure that Y is the same size as T, you must include a layer that outputs the correct size before the output layer. For example, to ensure that Y is a 4-D array of prediction scores for K classes, you can include a fully connected layer of size K followed by a softmax layer before the output layer.

For prediction scores Y and training targets T, the SSE loss between Y and T is given by

L = \sum_{i=1}^{K} (Y_i - T_i)^2

where K is the number of classes.

The inputs Y and T correspond to Y and T in the equation, respectively. The output loss corresponds to L. To ensure that loss is scalar, output the mean loss over the mini-batch.

        function loss = forwardLoss(layer, Y, T)
            % Returns the SSE loss between the predictions Y and the
            % training targets T

            % Calculate sum of squares
            sumSquares = sum((Y-T).^2);
    
            % Take mean over mini-batch
            N = size(Y,4);
            loss = sum(sumSquares)/N;
        end
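To sanity-check forwardLoss, you can call it directly on random scores and one-hot targets of the image classification size (a sketch; it assumes the completed class file, shown later, is saved on the path):

layer = exampleClassificationSSELayer('sse');
K = 10;                          % number of classes
N = 8;                           % mini-batch size
T = zeros(1,1,K,N);
idx = randi(K,1,N);              % random true class per observation
for n = 1:N
    T(1,1,idx(n),n) = 1;         % one-hot targets
end
Y = rand(1,1,K,N);
Y = Y ./ sum(Y,3);               % normalize scores, as a softmax layer would
loss = forwardLoss(layer,Y,T)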

Create Backward Loss Function

Create a function named backwardLoss that returns the derivatives of the SSE loss with respect to the predictions Y. The syntax for backwardLoss is dLdY = backwardLoss(layer, Y, T), where Y is the output of the previous layer and T represents the training targets.

The dimensions of Y and T are the same as the inputs in forwardLoss.

The derivative of the SSE loss with respect to the predictions Y is given by

\frac{\partial L}{\partial Y_i} = \frac{2}{N}(Y_i - T_i)

where N is the size of the mini-batch.

        function dLdY = backwardLoss(layer, Y, T)
            % Returns the derivatives of the SSE loss with respect to the predictions Y

            N = size(Y,4);
            dLdY = 2*(Y-T)/N;
        end
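A quick way to verify backwardLoss is to compare it against a central finite-difference approximation of forwardLoss (a sketch, reusing layer, Y, and T from the previous check):

epsilon = 1e-6;
dNumeric = zeros(size(Y));
for i = 1:numel(Y)
    Yp = Y; Yp(i) = Yp(i) + epsilon;
    Ym = Y; Ym(i) = Ym(i) - epsilon;
    dNumeric(i) = (forwardLoss(layer,Yp,T) - forwardLoss(layer,Ym,T))/(2*epsilon);
end
dAnalytic = backwardLoss(layer,Y,T);
max(abs(dNumeric(:) - dAnalytic(:)))    % should be near zero (about 1e-8 or less)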

Completed Layer

View the completed classification output layer class file.

classdef exampleClassificationSSELayer < nnet.layer.ClassificationLayer
               
    methods
        function layer = exampleClassificationSSELayer(name)
            % Create an exampleClassificationSSELayer
    
            % Set layer name
            if nargin == 1
                layer.Name = name;
            end

            % Set layer description
            layer.Description = 'Example classification layer with SSE loss';
        end
        
        function loss = forwardLoss(layer, Y, T)
            % Returns the SSE loss between the predictions Y and the
            % training targets T

            % Calculate sum of squares
            sumSquares = sum((Y-T).^2);
    
            % Take mean over mini-batch
            N = size(Y,4);
            loss = sum(sumSquares)/N;
        end
        
        function dLdY = backwardLoss(layer, Y, T)
            % Returns the derivatives of the SSE loss with respect to the predictions Y

            N = size(Y,4);
            dLdY = 2*(Y-T)/N;
        end
    end
end

GPU Compatibility

For GPU compatibility, the layer functions must support inputs and return outputs of type gpuArray. Any other functions used by the layer must do the same. Many MATLAB built-in functions support gpuArray input arguments. If you call any of these functions with at least one gpuArray as an input, then the function executes on the GPU and returns a gpuArray. For a list of functions that execute on a GPU, see Run Built-In Functions on a GPU (Parallel Computing Toolbox). To use a GPU for deep learning, you must also have a CUDA®-enabled NVIDIA® GPU with compute capability 3.0 or higher. For more information on working with GPUs in MATLAB, see GPU Computing in MATLAB (Parallel Computing Toolbox).
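The forwardLoss and backwardLoss functions in this example use only subtraction, .^, sum, and division, all of which support gpuArray inputs. If you have Parallel Computing Toolbox and a supported GPU, you can confirm this directly (a sketch, reusing layer, Y, and T from the earlier checks):

lossGpu = forwardLoss(layer,gpuArray(Y),gpuArray(T));
class(lossGpu)    % returns 'gpuArray'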

Include Custom Classification Output Layer in Network

You can use a custom output layer in the same way as any other output layer in Neural Network Toolbox. This section shows how to create and train a network for classification using the custom classification output layer you created earlier.

Load the example training data.

[XTrain, YTrain] = digitTrain4DArrayData;

Create a layer array including the custom classification output layer exampleClassificationSSELayer.

layers = [ ...
    imageInputLayer([28 28 1])
    convolution2dLayer(5,20)
    batchNormalizationLayer
    reluLayer
    fullyConnectedLayer(10)
    softmaxLayer
    exampleClassificationSSELayer]
layers = 
  7x1 Layer array with layers:

     1   ''   Image Input            28x28x1 images with 'zerocenter' normalization
     2   ''   Convolution            20 5x5 convolutions with stride [1  1] and padding [0  0  0  0]
     3   ''   Batch Normalization    Batch normalization
     4   ''   ReLU                   ReLU
     5   ''   Fully Connected        10 fully connected layer
     6   ''   Softmax                softmax
     7   ''   Classification layer   Example classification layer with SSE loss

Set the training options and train the network.

options = trainingOptions('sgdm');
net = trainNetwork(XTrain,YTrain,layers,options);
Training on single CPU.
Initializing image normalization.
|========================================================================================|
|  Epoch  |  Iteration  |  Time Elapsed  |  Mini-batch  |  Mini-batch  |  Base Learning  |
|         |             |   (hh:mm:ss)   |   Accuracy   |     Loss     |      Rate       |
|========================================================================================|
|       1 |           1 |       00:00:00 |       14.84% |       0.8972 |          0.0100 |
|       2 |          50 |       00:00:05 |       75.00% |       0.3219 |          0.0100 |
|       3 |         100 |       00:00:10 |       92.97% |       0.1306 |          0.0100 |
|       4 |         150 |       00:00:16 |       94.53% |       0.0919 |          0.0100 |
|       6 |         200 |       00:00:23 |       97.66% |       0.0606 |          0.0100 |
|       7 |         250 |       00:00:30 |       97.66% |       0.0493 |          0.0100 |
|       8 |         300 |       00:00:37 |      100.00% |       0.0083 |          0.0100 |
|       9 |         350 |       00:00:44 |      100.00% |       0.0136 |          0.0100 |
|      11 |         400 |       00:00:51 |       99.22% |       0.0187 |          0.0100 |
|      12 |         450 |       00:00:58 |      100.00% |       0.0060 |          0.0100 |
|      13 |         500 |       00:01:04 |       99.22% |       0.0130 |          0.0100 |
|      15 |         550 |       00:01:10 |      100.00% |       0.0046 |          0.0100 |
|      16 |         600 |       00:01:16 |       99.22% |       0.0132 |          0.0100 |
|      17 |         650 |       00:01:23 |      100.00% |       0.0032 |          0.0100 |
|      18 |         700 |       00:01:30 |       99.22% |       0.0136 |          0.0100 |
|      20 |         750 |       00:01:37 |       99.22% |       0.0131 |          0.0100 |
|      21 |         800 |       00:01:45 |       99.22% |       0.0104 |          0.0100 |
|      22 |         850 |       00:01:52 |      100.00% |       0.0018 |          0.0100 |
|      24 |         900 |       00:02:00 |      100.00% |       0.0017 |          0.0100 |
|      25 |         950 |       00:02:08 |      100.00% |       0.0016 |          0.0100 |
|      26 |        1000 |       00:02:17 |      100.00% |       0.0008 |          0.0100 |
|      27 |        1050 |       00:02:25 |      100.00% |       0.0010 |          0.0100 |
|      29 |        1100 |       00:02:33 |      100.00% |       0.0012 |          0.0100 |
|      30 |        1150 |       00:02:41 |      100.00% |       0.0010 |          0.0100 |
|      30 |        1170 |       00:02:44 |      100.00% |       0.0009 |          0.0100 |
|========================================================================================|

Evaluate the network performance by making predictions on new data and calculating the accuracy.

[XTest, YTest] = digitTest4DArrayData;
YPred = classify(net, XTest);
accuracy = sum(YTest == YPred)/numel(YTest)
accuracy = 0.9856
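To see which digits the network confuses, rather than just the overall accuracy, you can also compute a confusion matrix (an optional step; confusionmat requires Statistics and Machine Learning Toolbox):

C = confusionmat(YTest,YPred)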
