Potential data dimension mismatch in lstm layer with output mode as 'sequence'?

Question

Liangwu Yan on 11 Jan 2023

Direct link to this question

https://www.mathworks.com/matlabcentral/answers/1892495-potential-data-dimension-mismatch-in-lstm-layer-with-output-mode-as-sequence

Answered: Ben on 16 Mar 2023

According to the lstmLayer doc page (https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.lstmlayer.html), when the output mode is set to 'sequence' (the default), the layer outputs the state of every LSTM cell, that is, the complete sequence.

While reading the MATLAB example Sequence-to-Sequence Regression Using Deep Learning (https://www.mathworks.com/help/deeplearning/ug/sequence-to-sequence-regression-using-deep-learning.html), I got confused about the data dimensions between lstmLayer() and fullyConnectedLayer(), as marked by the red rectangle below.

My question is this: since the sequence length varies (shown in the bar plots above), the number of identical LSTM cells will differ from sequence to sequence (by the definition of an RNN). Therefore, for different sequence lengths, the complete sequence output by lstmLayer() will have a different size. The lstmLayer is followed by a fullyConnectedLayer, which would mean that the size of its weights and bias has to change. How can this work? Moreover, suppose that at prediction time a very long sequence comes in. The complete sequence output by the LSTM would then be extremely long. Wouldn't that be incompatible with the weight and bias matrices?

Your answer would be greatly appreciated, thank you! :)

From a newbie in RNN


Answer 1

Ben on 16 Mar 2023

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/1892495-potential-data-dimension-mismatch-in-lstm-layer-with-output-mode-as-sequence#answer_1194275

The LSTM and fully connected layers use the same weights and biases for every element of the sequence. At each time step, the LSTM uses its weights and biases to do two things: update the internal states HiddenState and CellState from the previous time step, and compute the output at the current time step. In particular, it can compute these values using only the values at the current and previous time steps, so it does not need to maintain a history of states for every time step in the sequence. The fully connected layer is likewise applied to each time step independently, so the size of its weights depends on the number of hidden units of the LSTM, not on the sequence length.
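To make the point about shared weights concrete, here is a minimal sketch in plain Python (not MATLAB, and not the Deep Learning Toolbox implementation). The names `TinyLSTM`, `TinyFullyConnected`, and `run_network` are hypothetical, the weights are random, and the gate ordering and layer sizes are assumptions chosen only for illustration. It shows that the parameter sizes are fixed when the layers are created, while the number of outputs simply follows the number of time steps fed in.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector (list of floats).
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class TinyLSTM:
    """Hypothetical minimal LSTM. Parameter sizes depend only on
    num_features and num_hidden, never on the sequence length."""

    def __init__(self, num_features, num_hidden, rng):
        self.num_hidden = num_hidden
        # Input weights, recurrent weights, and bias for the four gates stacked.
        self.W = rand_matrix(4 * num_hidden, num_features, rng)
        self.R = rand_matrix(4 * num_hidden, num_hidden, rng)
        self.b = [0.0] * (4 * num_hidden)

    def step(self, x, h, c):
        # One time step: needs only the current input and the previous states.
        n = self.num_hidden
        z = [a + r + b for a, r, b in
             zip(matvec(self.W, x), matvec(self.R, h), self.b)]
        i = [sigmoid(v) for v in z[0:n]]            # input gate
        f = [sigmoid(v) for v in z[n:2 * n]]        # forget gate
        g = [math.tanh(v) for v in z[2 * n:3 * n]]  # cell candidate
        o = [sigmoid(v) for v in z[3 * n:4 * n]]    # output gate
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new

    def forward_sequence(self, xs):
        # 'sequence' output mode: collect the hidden state at every time step.
        h = [0.0] * self.num_hidden
        c = [0.0] * self.num_hidden
        outputs = []
        for x in xs:
            h, c = self.step(x, h, c)  # the same W, R, b are reused each step
            outputs.append(h)
        return outputs


class TinyFullyConnected:
    """Hypothetical fully connected layer applied to each time step separately."""

    def __init__(self, num_inputs, num_outputs, rng):
        self.W = rand_matrix(num_outputs, num_inputs, rng)
        self.b = [0.0] * num_outputs

    def forward_sequence(self, hs):
        return [[a + b for a, b in zip(matvec(self.W, h), self.b)] for h in hs]


def run_network(seq_len, num_features=3, num_hidden=5, num_responses=1, seed=0):
    rng = random.Random(seed)
    lstm = TinyLSTM(num_features, num_hidden, rng)
    fc = TinyFullyConnected(num_hidden, num_responses, rng)
    xs = [[rng.uniform(-1, 1) for _ in range(num_features)]
          for _ in range(seq_len)]
    ys = fc.forward_sequence(lstm.forward_sequence(xs))
    # Return the number of output time steps and the shape of the FC weights.
    return len(ys), (len(fc.W), len(fc.W[0]))


for T in (10, 500):
    print(T, run_network(T))
```

In this sketch the fully connected weights stay at `num_responses` by `num_hidden` for both the short and the long sequence. Only the number of output time steps changes, which is why a very long sequence at prediction time poses no size conflict.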
