Hi, I am trying to solve a time series forecasting problem using an LSTM in MATLAB. The questions below still remain after going through
(Q1) The problem I am facing is in the data preparation stage. Specifically, I have 5000 samples of time responses of the same response quantity, each with 1001 time steps. I want to train on 90% of the time steps (5000 x 901) and keep the remaining 10% for prediction (5000 x 100). At present, I am storing the complete data as a matrix:
data is [5000 x 1001]
dataTrain = data(:,1:901);
dataTest = data(:,902:end);
Then, after standardizing the data, I build the predictors and responses:
XTrain = dataTrainStandardized(:,1:end-1);
YTrain = dataTrainStandardized(:,2:end);
XTest = dataTestStandardized(:,1:end-1);
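For concreteness, here is a minimal sketch of the preparation steps above on synthetic stand-in data. The split indices are chosen to match the stated sizes (5000 x 901 and 5000 x 100). The standardization code is not shown in the post, so using one global mean and standard deviation computed from the training part only is an assumption, as is the synthetic `data` matrix itself.

```matlab
% Synthetic stand-in for the real data: 5000 responses, 1001 time steps each.
numSamples   = 5000;
numTimeSteps = 1001;
t    = linspace(0, 10, numTimeSteps);
data = sin(t) .* rand(numSamples, 1);           % implicit expansion -> 5000 x 1001

% Split along the time axis (columns).
numTimeStepsTrain = 901;
dataTrain = data(:, 1:numTimeStepsTrain);       % 5000 x 901
dataTest  = data(:, numTimeStepsTrain+1:end);   % 5000 x 100

% Assumption: one global mean/std taken from the training part only,
% reused unchanged for the test part.
mu  = mean(dataTrain(:));
sig = std(dataTrain(:));
dataTrainStandardized = (dataTrain - mu) / sig;
dataTestStandardized  = (dataTest  - mu) / sig;

% Predictors and responses: responses are the predictors shifted by one step.
XTrain = dataTrainStandardized(:, 1:end-1);     % 5000 x 900
YTrain = dataTrainStandardized(:, 2:end);       % 5000 x 900
XTest  = dataTestStandardized(:, 1:end-1);      % 5000 x 99
```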
Now, what should be the LSTM network architecture as per my data set and problem definition?
numFeatures = ? % I guess number of features should be 1 as it is univariate.
numResponses = ? % I guess this should be the number of training time steps (=901)
However, this gives the error “The training sequences are of feature dimension 5000 but the input layer expects sequences of feature dimension 1.” So, should I store the dataset in a cell array (each cell representing 1 feature), with a matrix of dimension (number of samples x number of time steps) inside each cell?
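The error message suggests that a plain numeric matrix is being read as a single sequence, with rows as features and columns as time steps. As far as I understand the documented layout for several sequences, the data should be an N-by-1 cell array with one cell per sample, each cell holding a (number of features x number of time steps) matrix. A minimal sketch of that conversion, using a random stand-in for the standardized matrices:

```matlab
% Random stand-ins with the same shapes as the standardized training matrices.
XTrain = rand(5000, 900);
YTrain = rand(5000, 900);

% num2cell(A, 2) puts each row of A into its own cell:
% the result is a 5000-by-1 cell array of 1-by-900 sequences.
XTrainCell = num2cell(XTrain, 2);
YTrainCell = num2cell(YTrain, 2);

disp(size(XTrainCell));       % number of sequences (samples)
disp(size(XTrainCell{1}));    % numFeatures x numTimeSteps for one sample
```

With this layout, each cell is one sample (not one feature), and the feature dimension seen by the input layer is 1.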
numHiddenUnits = 100;
layers = [ ...
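The layer array above is cut off, so the following is only a sketch of what a sequence-to-sequence regression network for this setting might look like. It assumes Deep Learning Toolbox and the usual four-layer layout from the MathWorks forecasting examples. The point it illustrates is that `numResponses` counts the values predicted per time step (1 here), not the number of time steps.

```matlab
numFeatures    = 1;     % one value per time step (univariate input)
numResponses   = 1;     % one predicted value per time step, not 901
numHiddenUnits = 100;

% Assumed layout: sequence input -> LSTM -> fully connected -> regression output.
layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits)
    fullyConnectedLayer(numResponses)
    regressionLayer];
```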
(Q2) What does 'MiniBatchSize' do? Does it divide the time steps (columns) into smaller batches, or the samples (rows)?
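As far as I understand the trainingOptions documentation, 'MiniBatchSize' sets how many observations (whole sequences, i.e. rows of the matrix above) are used per training iteration. The time steps inside a sequence are not split by it, since that is what the separate 'SequenceLength' option is for. A small arithmetic sketch, where the batch size of 128 is a hypothetical value chosen for illustration:

```matlab
numObservations = 5000;   % number of sequences (rows of the data matrix)
miniBatchSize   = 128;    % hypothetical value for illustration

% Number of complete mini-batches (training iterations) per epoch.
numIterationsPerEpoch = floor(numObservations / miniBatchSize);

% Observations that do not fill a complete mini-batch.
numLeftOver = mod(numObservations, miniBatchSize);

% Row indices of the first two mini-batches (ignoring shuffling).
firstBatchIdx  = 1:miniBatchSize;
secondBatchIdx = miniBatchSize + (1:miniBatchSize);

fprintf('%d iterations per epoch, %d observations left over\n', ...
    numIterationsPerEpoch, numLeftOver);
```

If I recall the documentation correctly, the left-over observations that do not fill a complete mini-batch are skipped in that epoch.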
(Q3) The last question is related to the ‘predictAndUpdateState’. Is the following formatting okay?
net = predictAndUpdateState(net,XTrain);
[net,YPred] = predictAndUpdateState(net,YTrain(:,end));
numTimeStepsTest = size(XTest,2); %numel(XTest);
for i = 2:numTimeStepsTest
[net,YPred(:,i)] = predictAndUpdateState(net,YPred(:,i-1),...
This question is somewhat related to Q1.
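To make Q3 concrete, here is a small end-to-end sketch of a closed-loop forecast for a single sample, assuming Deep Learning Toolbox (`trainNetwork`, `resetState`, `predictAndUpdateState`). The tiny synthetic data set, the layer sizes, and the training options are all hypothetical and only keep the example short. The state is reset and warmed up on one sample at a time, rather than passing the whole (samples x time steps) matrix at once.

```matlab
% Tiny synthetic stand-in: 20 sequences, 60 time steps each, one row per sample.
rng(0);
numSamples = 20;
t    = linspace(0, 4*pi, 60);
data = sin(t + 2*pi*rand(numSamples, 1));      % 20 x 60

XTrain = num2cell(data(:, 1:end-1), 2);        % 20 x 1 cell, each 1 x 59
YTrain = num2cell(data(:, 2:end),   2);        % responses shifted by one step

layers = [ ...
    sequenceInputLayer(1)
    lstmLayer(16)
    fullyConnectedLayer(1)
    regressionLayer];
options = trainingOptions('adam', ...
    'MaxEpochs', 5, ...
    'MiniBatchSize', 10, ...
    'Verbose', false);
net = trainNetwork(XTrain, YTrain, layers, options);

% Closed-loop forecast for ONE sample (row k), 10 steps ahead.
k = 1;
numStepsAhead = 10;
net = resetState(net);                                  % start from a clean state
net = predictAndUpdateState(net, XTrain{k});            % warm up on the known history
YPred = zeros(1, numStepsAhead);
[net, YPred(1)] = predictAndUpdateState(net, YTrain{k}(end));
for i = 2:numStepsAhead
    % Feed the previous prediction back in as the next input.
    [net, YPred(i)] = predictAndUpdateState(net, YPred(i-1));
end
```

To forecast every sample, the same steps would be repeated per row, calling `resetState` before each one so the state of one sample does not leak into the next.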