Neural Network Generalization Problems: 95% Good Predictions, 5% Bad Predictions

Hello, I am using a neural network to forecast daily 24-hour electricity load. When there is no occupancy, the load is zero during the morning and at night; when there is occupancy, the load is high and varies.
For each test day, I select a similar database, then train, validate and test. For example, to forecast test day 5, I search my historical database for similar days and apply cross-validation to them. I use mean squared error as the performance function with "trainlm", the default training function in MATLAB. To choose the number of hidden neurons and the weight initialization, I sum the performance over all cross-validation sets and pick the network with the lowest RMSE and the highest R2 (coefficient of determination).
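For reference, the two selection metrics mentioned above can be computed as in the following minimal sketch. The vectors `t` and `y` are made-up placeholders for the actual and predicted load of one validation fold, not data from this thread.

```matlab
% Hypothetical actual load (t) and network output (y) for one validation fold
t = [0 0 120 180 150 0];
y = [5 2 110 190 140 10];

e    = t - y;                      % prediction errors
sse  = sum(e.^2);                  % sum of squared errors
rmse = sqrt(mean(e.^2));           % root mean squared error

% R^2 = 1 - SSE/SST, where SST is the squared error of the
% naive constant model that always predicts mean(t)
sst = sum((t - mean(t)).^2);
R2  = 1 - sse/sst;

fprintf('RMSE = %.3f, R2 = %.4f\n', rmse, R2);
```

Within a single fold, SST is fixed, so ranking candidates by lowest RMSE or by highest R2 gives the same order. The two can disagree only when they are summed over folds with different target variances.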
However, when I apply the selected number of neurons and initialization to the test set, about 95% of the predictions over 365 days of data are good and 5% are bad. In the bad predictions, the actual load is zero in the morning and at night, but the predicted load has an offset of around 50. So it seems the neural network memorizes the training set but generalizes very poorly.
I use these MATLAB settings for each day's prediction: net.trainParam.mu = 1, net.trainParam.mu_dec = 0.8, net.trainParam.mu_inc = 1.5, net.trainParam.max_fail = 6, and the "trainlm" defaults for everything else.
To be clear, for each day's prediction I search for the best number of hidden neurons and the best random weight initialization. In my case, I run 10 random initialization trials.
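The search described above might look roughly like the sketch below. It is only an illustration: the data `x` and `t`, the candidate list `Hcandidates`, and the 70/15/15 split are assumptions, and selection here uses the validation MSE from the training record rather than the summed cross-validation score described in the question.

```matlab
% Placeholder data standing in for the selected "similar days"
% (assumed layout: inputs are I-by-N, targets are 1-by-N)
x = rand(4, 200);
t = sum(x, 1) + 0.05*randn(1, 200);

Hcandidates = 2:2:10;   % hypothetical candidate hidden-layer sizes
Ntrials     = 10;       % random initializations per size, as in the question
bestVal     = Inf;

for H = Hcandidates
    for trial = 1:Ntrials
        rng(trial);                         % reproducible weights and data split
        net = fitnet(H, 'trainlm');
        net.trainParam.mu       = 1;        % settings quoted in the question
        net.trainParam.mu_dec   = 0.8;
        net.trainParam.mu_inc   = 1.5;
        net.trainParam.max_fail = 6;
        net.trainParam.showWindow = false;  % suppress the training GUI
        net.divideParam.trainRatio = 0.70;  % assumed split
        net.divideParam.valRatio   = 0.15;
        net.divideParam.testRatio  = 0.15;

        [net, tr] = train(net, x, t);

        % Keep the design with the lowest validation MSE
        if tr.best_vperf < bestVal
            bestVal   = tr.best_vperf;
            bestNet   = net;
            bestH     = H;
            bestTrial = trial;
        end
    end
end
fprintf('Best H = %d (trial %d), validation MSE = %.4g\n', bestH, bestTrial, bestVal);
```

Selecting on validation performance and reporting `tr.best_tperf` separately keeps the test subset out of the selection step.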
I would appreciate your suggestion.
Thanks.

Accepted Answer

Greg Heath on 5 Aug 2014
Edited: Greg Heath on 6 Aug 2014
I don't understand the need for daily creations.
Just create a net for each database and store it.
Overfitting is easily handled by reducing the number of hidden nodes.
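As a rough guide to how far the hidden layer can be reduced, one common heuristic (an assumption here, not something stated in this answer) is to keep the number of unknown weights no larger than the number of training equations. The sizes below are hypothetical.

```matlab
% Hypothetical problem sizes
I    = 4;      % number of inputs
O    = 1;      % number of outputs
Ntrn = 140;    % number of training samples

Ntrneq = Ntrn * O;                    % number of training equations

% Weights (including biases) in a single-hidden-layer I-H-O network
Nw = @(H) (I + 1)*H + (H + 1)*O;

% Largest H for which Nw(H) <= Ntrneq
Hub = floor((Ntrneq - O) / (I + O + 1));

fprintf('Hub = %d, Nw(Hub) = %d, Ntrneq = %d\n', Hub, Nw(Hub), Ntrneq);
```

Hidden-layer sizes well below `Hub` leave more equations than unknowns, which is one way to read "reducing the number of hidden nodes".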
If that is unsatisfactory, use as many nodes as you need, but to prevent overtraining the overfit net, use either or both of the following:
a. a validation set
b. regularization
You had better check to see if MATLAB allows them to be used together. If it doesn't, it should and MATLAB should be encouraged to make that option available.
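A minimal sketch of requesting both options on one net is shown below. The data, the hidden-layer size, and the regularization value 0.1 are illustrative assumptions, and whether a given release actually applies both at once with "trainlm" is, as noted above, something to verify.

```matlab
% Placeholder data (assumed layout: inputs I-by-N, targets 1-by-N)
rng(0);
x = rand(4, 200);
t = sum(x, 1) + 0.05*randn(1, 200);

net = fitnet(10, 'trainlm');          % hypothetical hidden-layer size

% (a) validation set: stop after max_fail consecutive validation increases
net.divideFcn              = 'dividerand';
net.divideParam.trainRatio = 0.70;
net.divideParam.valRatio   = 0.15;
net.divideParam.testRatio  = 0.15;
net.trainParam.max_fail    = 6;

% (b) regularization: mix mean squared weights into the mse performance
net.performFcn                  = 'mse';
net.performParam.regularization = 0.1;   % illustrative; 0 means plain mse

net.trainParam.showWindow = false;
[net, tr] = train(net, x, t);

fprintf('Stopped because: %s\n', tr.stop);
fprintf('Validation perf: %.4g, test perf: %.4g\n', tr.best_vperf, tr.best_tperf);
```

An alternative for option (b) is the "trainbr" training function (Bayesian regularization), which to my knowledge disables validation stops by default.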
Hope this helps
Thank you for formally accepting my answer
Greg
