Why is LSTM training not done properly?

Hi everyone!
I'm trying to train an LSTM network with 17521 data points, but training does not work properly with this much data. When I reduce the data to 8785 points (one year of data), training runs to the end. When I increase it to 17521 points (two years of data), I get NaN values in the YPred variable. This is my code:
YTrain = cell2mat(A_orig(2:17521,end))';
XTrain = cell2mat(A_orig(2:17521,3:end-1))';
XTrain = num2cell(XTrain,1);
YTrain = num2cell(YTrain,1);
%% Define Network Architecture
numResponses = size(YTrain{1},1);
featureDimension = size(XTrain{1},1);
numHiddenUnits = 500;
layers = [ ...
sequenceInputLayer(featureDimension)
lstmLayer(numHiddenUnits,'OutputMode','sequence')
fullyConnectedLayer(500) %%50
dropoutLayer(0.1) %%0.5
fullyConnectedLayer(numResponses)
regressionLayer];
maxepochs = 500;
options = trainingOptions('adam', ... %%adam
'MaxEpochs',maxepochs, ...
'GradientThreshold',1, ...
'InitialLearnRate',0.005, ...
'LearnRateSchedule','piecewise', ...
'LearnRateDropPeriod',125, ...
'LearnRateDropFactor',0.2, ...
'Verbose',0, ...
'Plots','training-progress');
%% Train the Network
net = trainNetwork(XTrain,YTrain,layers,options);
%% Test the Network
YTest = cell2mat(A_orig(17546:26305,end))';
XTest = cell2mat(A_orig(17546:26305,3:end-1))';
XTest = num2cell(XTest,1);
YTest = num2cell(YTest,1);
net = resetState(net);
YPred = predict(net,XTest);
figure;
subplot(2,1,1);
y1 = cell2mat(YPred(1:end, 1:end)); % convert the cell array of predictions to a numeric vector
plot(y1);
title('Forecasted');
subplot(2,1,2);
y2 = (cell2mat(YTest(1:end, 1:end))');
plot(y2);
title('Observed');
When the number of data points is 8785, the training output is: (image not shown)
Please help me. Thanks a lot.

Answers (1)

I doubt that it is actually the size of the data that is the problem.
You could test that idea by running the second year (which is about the same size dataset as the one that runs) by itself. Perhaps the problem is something in the 2nd-year data, and not the size.
It's difficult to diagnose the problem without seeing the data. Can you upload the data in a MAT file?

7 Comments

Yes, you are right. The second-year data is problematic, but I do not know what the problem is. This is my data for the second year:
Please help me. Thanks.
I can't really investigate further because I don't have the Deep Learning Toolbox, but one approach to issues like this is to keep "bisecting" the data into halves.
In your case, the first year was OK, but the second year is not. So, next, focus on the second year. Try your code on just the first half (of year 2), and then the second half. Perhaps only one of those two datasets fails.
Keep narrowing it down to a smaller and smaller subset of the data, and it will help you find the problematic section.
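The bisection idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: `data` is a synthetic placeholder matrix (not the poster's data), and `fails` is a hypothetical stand-in for "the pipeline fails on these rows". Here it simply checks for non-finite values; in practice it could be a short training run followed by a check such as `any(isnan(...))` on the predictions.

```matlab
% Synthetic placeholder for the year-2 numeric matrix (rows = time steps).
data = rand(1000, 4);
data(637, 3) = NaN;   % planted bad value so the sketch has something to find

% Hypothetical failure test: true if the given rows would break the pipeline.
% Assumption: failure is modelled as "contains a non-finite value".
fails = @(rows) any(~isfinite(rows(:)));

lo = 1;               % first row of the suspect range
hi = size(data, 1);   % last row of the suspect range
while lo < hi
    mid = floor((lo + hi) / 2);
    if fails(data(lo:mid, :))
        hi = mid;      % the problem is in the first half
    else
        lo = mid + 1;  % first half is clean, so look in the second half
    end
end
fprintf('First problematic row: %d\n', lo);
```

The loop assumes the full range is already known to fail, and it narrows down to the first problematic row. If there are several bad rows, the check can be repeated on the rows after the one it finds.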
I found the problem. The second-year data contained NaN values, and converting all the NaN values to zero solved it.
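For anyone who runs into the same thing, a minimal sketch of locating the NaN values and replacing them is shown below. The matrix `X` is a small synthetic placeholder, not the actual data. The interpolation variant at the end is an alternative that was not used in this thread; it is included only because zero may not be a sensible value for every time series.

```matlab
% Synthetic placeholder for the numeric data (rows = time steps, columns = features).
X = [1 2 3; 4 NaN 6; 7 8 NaN; 10 11 12];

% Locate the missing values before training.
[badRow, badCol] = find(isnan(X));
fprintf('Found %d NaN values in %d rows\n', numel(badRow), numel(unique(badRow)));

% Fix used in this thread: replace every NaN with zero.
Xzero = X;
Xzero(isnan(Xzero)) = 0;

% Alternative (an assumption, not from the thread): fill each gap by
% linear interpolation down each column, i.e. along the time axis.
Xfilled = fillmissing(X, 'linear');
```

Either `Xzero` or `Xfilled` could then be passed on to the `num2cell` step in place of the raw matrix.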
I have another question. Training takes about an hour for 8785 data points and 2 hours for the whole dataset. This is too long for LSTM training. Is there a way to reduce this time?
This is the time spent training on 8785 data points with the LSTM: (image not shown)
Sorry, I have no experience in this.
Ok. Thank you for your time.

Asked: 5 Sep 2021

Commented: 6 Sep 2021