Gated recurrent unit
The gated recurrent unit (GRU) operation allows a network to learn dependencies between time steps in time series and sequence data.
Note
This function applies the deep learning GRU operation to dlarray data. If
you want to apply a GRU operation within a layerGraph object
or Layer array, use the gruLayer layer instead.
dlY = gru(dlX,H0,weights,recurrentWeights,bias) applies a gated recurrent
unit (GRU) calculation to input dlX using the initial hidden state H0, and
parameters weights, recurrentWeights, and bias. The input dlX is a formatted
dlarray with dimension labels. The output dlY is a formatted dlarray with the
same dimension labels as dlX, except for any 'S' dimensions.
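For example, the following is a minimal sketch of this syntax. The sizes and variable names (numFeatures, numHiddenUnits, and so on) are illustrative assumptions, not part of the documented interface; the sketch also assumes that the parameters of the three GRU gates are concatenated along the first dimension, so the first dimension of weights, recurrentWeights, and bias is 3*numHiddenUnits.

numFeatures     = 10;
numHiddenUnits  = 50;
numObservations = 32;
numTimeSteps    = 20;

% Formatted input with channel (C), batch (B), and time (T) dimensions.
X   = randn(numFeatures,numObservations,numTimeSteps);
dlX = dlarray(X,'CBT');

% Initial hidden state and GRU parameters (gate parameters stacked row-wise).
H0               = zeros(numHiddenUnits,numObservations);
weights          = dlarray(randn(3*numHiddenUnits,numFeatures));
recurrentWeights = dlarray(randn(3*numHiddenUnits,numHiddenUnits));
bias             = dlarray(zeros(3*numHiddenUnits,1));

dlY = gru(dlX,H0,weights,recurrentWeights,bias);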
The gru function updates the hidden state using the hyperbolic
tangent function (tanh) as the state activation function. The gru
function uses the sigmoid function given by σ(x) = (1 + e^(−x))^(−1) as the
gate activation function.
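For reference, a common formulation of the GRU calculation with these activation functions is sketched below in LaTeX. The per-gate blocks W, R, and b are assumed to come from partitioning the weights, recurrentWeights, and bias inputs; this partitioning, and the placement of z_t versus 1 − z_t in the final sum, are conventions that vary between implementations (compare Cho et al. [1]).

\begin{aligned}
r_t &= \sigma\left(W_r x_t + R_r h_{t-1} + b_r\right) && \text{(reset gate)} \\
z_t &= \sigma\left(W_z x_t + R_z h_{t-1} + b_z\right) && \text{(update gate)} \\
\tilde{h}_t &= \tanh\left(W_h x_t + r_t \odot (R_h h_{t-1}) + b_h\right) && \text{(candidate state)} \\
h_t &= (1 - z_t) \odot \tilde{h}_t + z_t \odot h_{t-1} && \text{(hidden state)}
\end{aligned}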
[dlY,hiddenState] = gru(dlX,H0,weights,recurrentWeights,bias)
also returns the hidden state after the GRU operation.
[___] = gru(___,'DataFormat',FMT)
also specifies the dimension format FMT when dlX is
not a formatted dlarray. The output dlY is an
unformatted dlarray with the same dimension order as
dlX, except for any 'S' dimensions.
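A minimal sketch of the 'DataFormat' syntax, reusing the hypothetical variables from the example above: the input is created without dimension labels, so the dimension order is supplied through 'DataFormat' instead.

% Unformatted input; dimension order supplied through 'DataFormat'.
dlX = dlarray(X);
dlY = gru(dlX,H0,weights,recurrentWeights,bias,'DataFormat','CBT');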
functionToLayerGraph does not support the gru function.
If you use functionToLayerGraph with a function that contains the
gru operation, the resulting LayerGraph contains
placeholder layers.
References
[1] Cho, Kyunghyun, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. "Learning Phrase Representations Using RNN Encoder-Decoder for Statistical Machine Translation." arXiv preprint arXiv:1406.1078 (2014).
See Also
dlarray | dlfeval | dlgradient | fullyconnect | lstm | softmax