gru

Gated recurrent unit

Description

The gated recurrent unit (GRU) operation allows a network to learn dependencies between time steps in time series and sequence data.

Note

This function applies the deep learning GRU operation to dlarray data. If you want to apply a GRU operation within a layerGraph object or Layer array, use the gruLayer layer instead.

Y = gru(X,H0,weights,recurrentWeights,bias) applies a gated recurrent unit (GRU) calculation to input X using the initial hidden state H0, and parameters weights, recurrentWeights, and bias. The input X must be a formatted dlarray. The output Y is a formatted dlarray with the same dimension format as X, except for any 'S' dimensions.

The gru function updates the hidden state using the hyperbolic tangent function (tanh) as the state activation function. The gru function uses the sigmoid function given by $\sigma(x) = (1 + e^{-x})^{-1}$ as the gate activation function.

[Y,hiddenState] = gru(X,H0,weights,recurrentWeights,bias) also returns the hidden state after the GRU operation.

[___] = gru(___,'DataFormat',FMT) also specifies the dimension format FMT when X is not a formatted dlarray. The output Y is an unformatted dlarray with the same dimension order as X, except for any 'S' dimensions.
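As a minimal sketch of this syntax, the following passes an unformatted dlarray and labels its dimensions through 'DataFormat'. The sizes (10 channels, 32 observations, 64 time steps, 100 hidden units) are illustrative assumptions, and the random parameters stand in for learned values.

```matlab
% Unformatted input: 10 channels, 32 observations, 64 time steps.
X = dlarray(randn(10,32,64));

% Initial hidden state and parameters for 100 hidden units (illustrative values).
numHiddenUnits = 100;
H0 = zeros(numHiddenUnits,1);
weights = dlarray(randn(3*numHiddenUnits,10));
recurrentWeights = dlarray(randn(3*numHiddenUnits,numHiddenUnits));
bias = dlarray(randn(3*numHiddenUnits,1));

% X carries no labels, so describe its dimensions as channel, batch, time.
Y = gru(X,H0,weights,recurrentWeights,bias,'DataFormat','CBT');

% Y is unformatted; its first dimension is the number of hidden units.
size(Y)
```

Because X has no 'S' dimensions here, the dimension order of Y matches that of X.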

Examples

Perform a GRU operation using 100 hidden units.

Create the input sequence data as 32 observations with 10 channels and a sequence length of 64.

numFeatures = 10;
numObservations = 32;
sequenceLength = 64;

X = randn(numFeatures,numObservations,sequenceLength);
dlX = dlarray(X,'CBT');

Create the initial hidden state with 100 hidden units. Use the same initial hidden state for all observations.

numHiddenUnits = 100;
H0 = zeros(numHiddenUnits,1);

Create the learnable parameters for the GRU operation.

weights = dlarray(randn(3*numHiddenUnits,numFeatures));
recurrentWeights = dlarray(randn(3*numHiddenUnits,numHiddenUnits));
bias = dlarray(randn(3*numHiddenUnits,1));

Perform the GRU calculation.

[dlY,hiddenState] = gru(dlX,H0,weights,recurrentWeights,bias);

View the size and dimension format of dlY.

size(dlY)
ans = 1×3

   100    32    64

dlY.dims
ans = 
'CBT'

View the size of hiddenState.

size(hiddenState)
ans = 1×2

   100    32

You can use the hidden state to keep track of the state of the GRU operation and input further sequential data.
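One way to picture this is to process a sequence in two chunks and pass the returned hidden state into the second call. The sketch below is self-contained and uses the same sizes as the example above. The chunk boundary at time step 32 and the variable names state, dlY1, dlY2, and dlYfull are illustrative assumptions.

```matlab
numFeatures = 10;
numObservations = 32;
sequenceLength = 64;
numHiddenUnits = 100;

dlX = dlarray(randn(numFeatures,numObservations,sequenceLength),'CBT');
H0 = zeros(numHiddenUnits,1);
weights = dlarray(randn(3*numHiddenUnits,numFeatures));
recurrentWeights = dlarray(randn(3*numHiddenUnits,numHiddenUnits));
bias = dlarray(randn(3*numHiddenUnits,1));

% Process the first 32 time steps, then continue from the returned state.
[dlY1,state] = gru(dlX(:,:,1:32),H0,weights,recurrentWeights,bias);
[dlY2,state] = gru(dlX(:,:,33:64),state,weights,recurrentWeights,bias);

% Reference: a single pass over all 64 time steps.
dlYfull = gru(dlX,H0,weights,recurrentWeights,bias);

% Concatenate the chunk outputs along time and compare with the single pass.
Ychunks = cat(3,extractdata(dlY1),extractdata(dlY2));
maxDiff = max(abs(Ychunks - extractdata(dlYfull)),[],'all')
```

The two results should agree up to floating-point rounding, because the second call starts from the state that the first call ended in.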

Input Arguments

Input data X, specified as a formatted dlarray, an unformatted dlarray, or a numeric array. When X is not a formatted dlarray, you must specify the dimension label format using 'DataFormat',FMT. If X is a numeric array, at least one of H0, weights, recurrentWeights, or bias must be a dlarray.

X must contain a sequence dimension labeled 'T'. If X has any spatial dimensions labeled 'S', they are flattened into the 'C' channel dimension. If X does not have a channel dimension, then one is added. If X has any unspecified dimensions labeled 'U', they must be singleton.

Data Types: single | double

Initial hidden state vector H0, specified as a formatted dlarray, an unformatted dlarray, or a numeric array.

If H0 is a formatted dlarray, it must contain a channel dimension labeled 'C' and optionally a batch dimension labeled 'B' with the same size as the 'B' dimension of X. If H0 does not have a 'B' dimension, the function uses the same hidden state vector for each observation in X.

If H0 is a formatted dlarray, then the size of the 'C' dimension determines the number of hidden units. Otherwise, the size of the first dimension determines the number of hidden units.

Data Types: single | double

Input weights (the weights argument), specified as a formatted dlarray, an unformatted dlarray, or a numeric array.

Specify weights as a matrix of size 3*NumHiddenUnits-by-InputSize, where NumHiddenUnits is the size of the 'C' dimension of H0, and InputSize is the size of the 'C' dimension of X multiplied by the size of each 'S' dimension of X, where present.

If weights is a formatted dlarray, it must contain a 'C' dimension of size 3*NumHiddenUnits and a 'U' dimension of size InputSize.

Data Types: single | double
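To make the InputSize rule concrete, here is a small sketch with an input that has two spatial dimensions. The sizes (4-by-4 spatial, 3 channels, 8 observations, 20 time steps, 16 hidden units) are assumptions chosen for illustration.

```matlab
% Input with two 'S' dimensions, then channel, batch, and time.
X = dlarray(randn(4,4,3,8,20),'SSCBT');

numHiddenUnits = 16;

% InputSize is the 'C' size multiplied by the size of each 'S' dimension.
inputSize = 4*4*3;

H0 = zeros(numHiddenUnits,1);
weights = dlarray(randn(3*numHiddenUnits,inputSize));
recurrentWeights = dlarray(randn(3*numHiddenUnits,numHiddenUnits));
bias = dlarray(randn(3*numHiddenUnits,1));

Y = gru(X,H0,weights,recurrentWeights,bias);

% The 'S' dimensions are flattened into 'C', so they do not appear in Y.
dims(Y)
size(Y)
```

Following the descriptions of X and Y on this page, Y should carry the format 'CBT' with numHiddenUnits channels.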

Recurrent weights (the recurrentWeights argument), specified as a formatted dlarray, an unformatted dlarray, or a numeric array.

Specify recurrentWeights as a matrix of size 3*NumHiddenUnits-by-NumHiddenUnits, where NumHiddenUnits is the size of the 'C' dimension of H0.

If recurrentWeights is a formatted dlarray, it must contain a 'C' dimension of size 3*NumHiddenUnits and a 'U' dimension of size NumHiddenUnits.

Data Types: single | double

Bias (the bias argument), specified as a formatted dlarray, an unformatted dlarray, or a numeric array.

Specify bias as a vector of length 3*NumHiddenUnits, where NumHiddenUnits is the size of the 'C' dimension of H0.

If bias is a formatted dlarray, the nonsingleton dimension must be labeled with 'C'.

Data Types: single | double

Dimension order of unformatted input data, specified as the comma-separated pair consisting of 'DataFormat' and a character array or string FMT that provides a label for each dimension of the data. Each character in FMT must be one of the following:

  • 'S' — Spatial

  • 'C' — Channel

  • 'B' — Batch (for example, samples and observations)

  • 'T' — Time (for example, sequences)

  • 'U' — Unspecified

You can specify multiple dimensions labeled 'S' or 'U'. You can use the labels 'C', 'B', and 'T' at most once.

You must specify 'DataFormat',FMT when the input data is not a formatted dlarray.

Example: 'DataFormat','SSCB'

Data Types: char | string

Output Arguments

GRU output Y, returned as a dlarray. The output Y has the same underlying data type as the input X.

If the input data X is a formatted dlarray, Y has the same dimension format as X, except for any 'S' dimensions. If the input data is not a formatted dlarray, Y is an unformatted dlarray with the same dimension order as the input data.

The size of the 'C' dimension of Y is the same as the number of hidden units, specified by the size of the 'C' dimension of H0.

Hidden state vector for each observation, hiddenState, returned as a dlarray or a numeric array with the same data type as H0.

If the input H0 is a formatted dlarray, then the output hiddenState is a formatted dlarray with the format 'CB'.
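As a brief sketch of the formatted case, the following supplies one initial state per observation as a 'CB' dlarray and inspects the format of the returned state. The sizes and random parameters are illustrative assumptions.

```matlab
numFeatures = 10;
numObservations = 32;
numHiddenUnits = 100;

dlX = dlarray(randn(numFeatures,numObservations,64),'CBT');

% One initial hidden state per observation, labeled channel-by-batch.
H0 = dlarray(zeros(numHiddenUnits,numObservations),'CB');

weights = dlarray(randn(3*numHiddenUnits,numFeatures));
recurrentWeights = dlarray(randn(3*numHiddenUnits,numHiddenUnits));
bias = dlarray(randn(3*numHiddenUnits,1));

[dlY,hiddenState] = gru(dlX,H0,weights,recurrentWeights,bias);

% Inspect the labels and size of the returned state.
dims(hiddenState)
size(hiddenState)
```

Here the 'B' dimension of H0 matches the 'B' dimension of dlX, so each observation starts from its own state vector.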

Limitations

  • functionToLayerGraph does not support the gru function. If you use functionToLayerGraph with a function that contains the gru operation, the resulting LayerGraph contains placeholder layers.

More About

Gated Recurrent Unit

The GRU operation allows a network to learn dependencies between time steps in time series and sequence data. For more information, see the Gated Recurrent Unit Layer definition on the gruLayer reference page.
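For orientation, a GRU time step in the formulation of Cho et al. [1] can be sketched as follows. This is an assumption about the general form, not a statement of exactly how the gru function computes its result: the per-gate symbols $W$, $R$, and $b$ are illustrative names for the blocks of weights, recurrentWeights, and bias, the bias terms are added here to match the bias argument, and the point at which the reset gate is applied may differ in the gru function. The gruLayer reference page gives the authoritative definition.

```latex
\begin{aligned}
r_t &= \sigma\left(W_r x_t + R_r h_{t-1} + b_r\right) && \text{(reset gate)} \\
z_t &= \sigma\left(W_z x_t + R_z h_{t-1} + b_z\right) && \text{(update gate)} \\
\tilde{h}_t &= \tanh\left(W_h x_t + R_h \left(r_t \odot h_{t-1}\right) + b_h\right) && \text{(candidate state)} \\
h_t &= z_t \odot h_{t-1} + \left(1 - z_t\right) \odot \tilde{h}_t && \text{(hidden state)}
\end{aligned}
```

Here $x_t$ is the input at time step $t$, $h_{t-1}$ is the previous hidden state, $\sigma$ is the sigmoid gate activation, and $\odot$ denotes element-wise multiplication. Three blocks of parameters, one per gate or candidate state, are consistent with weights, recurrentWeights, and bias each having 3*NumHiddenUnits rows.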

References

[1] Cho, Kyunghyun, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. "Learning phrase representations using RNN encoder-decoder for statistical machine translation." arXiv preprint arXiv:1406.1078 (2014).

Version History

Introduced in R2020a