embeddingConcatenationLayer

Embedding concatenation layer

Since R2023b

    Description

    An embedding concatenation layer combines its input and an embedding vector by concatenation.
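
    Conceptually, the layer stores a learnable column vector (its Weights property) and concatenates it to the sequence input as one extra time step, so a sequence of T time steps becomes a sequence of T+1 time steps. The following is a minimal sketch using plain arrays, not the layer's internal source; it prepends the vector for illustration:

    numChannels = 64;                               % channels per time step
    batchSize = 8;
    numTimeSteps = 100;
    X = rand(numChannels,batchSize,numTimeSteps);   % C-by-B-by-T sequence data
    e = rand(numChannels,1);                        % stands in for the learnable Weights vector
    E = repmat(e,1,batchSize);                      % replicate across the batch
    Y = cat(3,E,X);                                 % C-by-B-by-(T+1) output
    size(Y)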

    Creation

    Description

    layer = embeddingConcatenationLayer creates an embedding concatenation layer.

    layer = embeddingConcatenationLayer(Name=Value) creates an embedding concatenation layer and sets the Parameters and Initialization properties and the Name property using one or more name-value arguments.

    Properties

    Parameters and Initialization

    WeightsInitializer — Function to initialize weights

    Function to initialize the weights, specified as one of these values:

    • "narrow-normal" — Initialize the weights by independently sampling from a normal distribution with zero mean and a standard deviation of 0.01.

    • "glorot" — Initialize the weights with the Glorot initializer [1] (also known as the Xavier initializer). The Glorot initializer independently samples from a uniform distribution with zero mean and a variance of 2/(numIn + numOut), where numIn and numOut are the numbers of input and output channels of the layer, respectively.

    • "he" — Initialize the weights with the He initializer [2]. The He initializer samples from a normal distribution with zero mean and a variance of 2/numIn, where numIn is the number of channels in the layer input.

    • "zeros" — Initialize the weights with zeros.

    • "ones" — Initialize the weights with ones.

    • Function handle — Initialize the weights with a custom function. If you specify a function handle, then the function must have the form weights = func(sz), where sz is the size of the weights. For an example, see the sketch below.

    The layer initializes the weights only when the Weights property is empty.
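
    For example, a custom initializer can be passed as an anonymous function handle. A minimal sketch, assuming a scaled narrow-normal scheme (the 0.02 scale is an arbitrary illustrative choice, not a documented default):

    initFcn = @(sz) 0.02*randn(sz);   % sz is the size of the weights
    layer = embeddingConcatenationLayer(WeightsInitializer=initFcn)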

    Weights — Layer weights

    Learnable weights, specified as a numeric column vector of length numChannels or [].

    The layer weights are learnable parameters. You can specify the initial value of the weights directly using the Weights property of the layer. When you train a network, if the Weights property of the layer is nonempty, then the trainnet and trainNetwork functions use the Weights property as the initial value. If the Weights property is empty, then the software uses the initializer specified by the WeightsInitializer property of the layer.

    Data Types: single | double
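
    A minimal sketch of setting the initial value directly, assuming the channel dimension of the layer input is 64 (an illustrative choice):

    numChannels = 64;   % must match the channel dimension of the layer input
    layer = embeddingConcatenationLayer(Weights=zeros(numChannels,1));

    Because Weights is nonempty here, training starts from this vector and the WeightsInitializer property is ignored.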

    Layer

    Name — Layer name

    Layer name, specified as a character vector or a string scalar. For Layer array input, the trainnet and dlnetwork functions automatically assign names to layers with the name "".

    The EmbeddingConcatenationLayer object stores this property as a character vector.

    Data Types: char | string

    NumInputs — Number of inputs

    This property is read-only.

    Number of inputs to the layer, returned as 1. This layer accepts a single input only.

    Data Types: double

    InputNames — Input names

    This property is read-only.

    Input names, returned as {'in'}. This layer accepts a single input only.

    Data Types: cell

    NumOutputs — Number of outputs

    This property is read-only.

    Number of outputs from the layer, returned as 1. This layer has a single output only.

    Data Types: double

    OutputNames — Output names

    This property is read-only.

    Output names, returned as {'out'}. This layer has a single output only.

    Data Types: cell

    Examples

    Create an embedding concatenation layer.

    layer = embeddingConcatenationLayer
    layer = 
      EmbeddingConcatenationLayer with properties:
    
                         Name: ''
                    InputSize: 'auto'
           WeightsInitializer: 'narrow-normal'
        WeightLearnRateFactor: 1
               WeightL2Factor: 1
    
       Learnable Parameters
                      Weights: []
    
       State Parameters
        No properties.
    
    Use properties method to see a list of all properties.
    
    

    Include an embedding concatenation layer in a neural network.

    net = dlnetwork;
    
    numChannels = 1;
    
    embeddingOutputSize = 64;
    numWords = 128;
    
    maxSequenceLength = 100;
    maxPosition = maxSequenceLength+1;
    
    numHeads = 4;
    numKeyChannels = 4*embeddingOutputSize;
    
    layers = [ 
        sequenceInputLayer(numChannels)
        wordEmbeddingLayer(embeddingOutputSize,numWords,Name="word-emb")
        embeddingConcatenationLayer(Name="emb-cat")
        positionEmbeddingLayer(embeddingOutputSize,maxPosition,Name="pos-emb")
        additionLayer(2,Name="add")
        selfAttentionLayer(numHeads,numKeyChannels,AttentionMask="causal")
        fullyConnectedLayer(numWords)
        softmaxLayer];
    net = addLayers(net,layers);
    
    net = connectLayers(net,"emb-cat","add/in2");

    View the neural network architecture.

    plot(net)
    axis off
    box off
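
    To verify that the embedding concatenation adds one time step, initialize the network and run a forward pass. A minimal sketch, assuming integer word indices as input (as wordEmbeddingLayer expects):

    net = initialize(net);
    X = dlarray(randi(numWords,numChannels,1,maxSequenceLength),"CBT");
    Y = predict(net,X);
    size(Y)   % numWords-by-1-by-(maxSequenceLength+1): one extra time step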

    References

    [1] Glorot, Xavier, and Yoshua Bengio. "Understanding the Difficulty of Training Deep Feedforward Neural Networks." In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 249–256. Sardinia, Italy: AISTATS, 2010. https://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf

    [2] He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. "Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification." In 2015 IEEE International Conference on Computer Vision (ICCV), 1026–34. Santiago, Chile: IEEE, 2015. https://doi.org/10.1109/ICCV.2015.123

    Version History

    Introduced in R2023b