Very good in training, very bad in predictions (neural network)
Dear all,
I built a network with a 5×100 input and a 1×100 target. When I train the network, the results are very good, but when I test it, the results are bad. All my data is randomized and normalized between −1 and 1. I have tried many networks and different numbers of neurons, but nothing changed.
Please help.
My code is:
net = newff(p, t, 10, {'logsig', 'purelin'});
net = init(net);
net.divideParam.trainRatio = 75/100;
net.divideParam.testRatio = 15/100;
net.divideParam.valRatio = 10/100;
net.trainParam.epochs = 20;
net.trainParam.goal = 0.000001;
net.trainParam.max_fail = 6;
net.trainParam.lr = 0.06;
net.performParam.regularization = 0.008;
[net, tr] = train(net, p, t);
a = sim(net, test);
postreg(a, tt);
Accepted Answer
Greg Heath
on 30 Mar 2015
Edited: Greg Heath
on 9 Aug 2016
newff is obsolete. Use fitnet instead.
I think you have wasted too much time searching for parameters. Typically, you should use the defaults for everything except:
1. Set the initial state of the random generator with your favorite seed. For example
rng('default')
2. Search for the smallest acceptable number of hidden nodes using the outer loop search
h = Hmin:dH:Hmax
3. Search for a suitable combination of random initial weights and train/val/test data divisions using the inner loop search
i = 1:Ntrials
4. Typically, I start with ~ 10 values of h and 10 weight/datadivision trials for each value of h.
5. The documentation example for fitnet (the one for newff is similar)
help fitnet
doc fitnet
[x,t] = simplefit_dataset;
net = fitnet(10);
net = train(net,x,t);
view(net)
y = net(x);
perf = perform(net,t,y)
6. However, in general, my approach yields fewer failures. I have posted scores of examples in the NEWSGROUP and ANSWERS. Try searching on
greg fitnet Ntrials
(or greg newff Ntrials).
7. Two very important things to understand are that
a. Increasing the number of hidden nodes makes it easier to obtain a solution. However, the smaller the number of hidden nodes, the better the net resists noise, interference, measurement errors, and transcription errors. Just as (or more) importantly, the net performs better on nontraining (validation, test, and unseen) data.
b. Because the initial weights and the data division are random, a design may fail even when all of the other parameters are perfect. Hence the double-loop search.
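A minimal sketch of the double-loop search described in steps 2–4 (using fitnet and the simplefit example data from step 5; selecting the winner by validation performance via tr.best_vperf is my assumption about how to score each trial, not something specified above):

```matlab
% Double-loop search: outer loop over hidden-node counts, inner loop over
% random weight initializations and train/val/test data divisions.
[x, t] = simplefit_dataset;            % replace with your own p, t
rng('default')                         % reproducible random state (step 1)
Hmin = 1; dH = 1; Hmax = 10;           % outer-loop range (step 2)
Ntrials = 10;                          % inner-loop trials (steps 3-4)
bestPerf = Inf;
for h = Hmin:dH:Hmax                   % outer loop: number of hidden nodes
    for i = 1:Ntrials                  % inner loop: new weights & division
        net = fitnet(h);               % defaults for everything else
        net = configure(net, x, t);    % fresh random initial weights
        [net, tr] = train(net, x, t);  % random data division each trial
        perf = tr.best_vperf;          % validation performance of this trial
        if perf < bestPerf
            bestPerf = perf; bestNet = net; bestH = h;
        end
    end
end
```

The smallest h whose best trial meets your performance goal is the one to keep, per point 7a above.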
Hope this helps.
Thank you for formally accepting my answer
Greg
2 Comments
Greg Heath
on 28 Aug 2015
Why are you using non-default hidden-layer transfer functions and parameters?
By "testing data" do you mean the third subset from the random data division, or an external fourth subset? If the latter, what are the different sources of your design and testing data? How large are the two datasets?
If the sources are different, can you separate them with a classifier (patternnet or newpr)? What happens if you mix the testing data with the design data and then pick the four subsets randomly?
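For illustration, a hypothetical sketch of that separability check (the variable names xDesign and xTest are assumed placeholders for your two datasets; the accuracy threshold is illustrative):

```matlab
% Label design samples class 1 and external test samples class 2, then see
% whether a classifier can tell the two sources apart.
xAll = [xDesign, xTest];                              % assumed variables
tAll = [ones(1, size(xDesign, 2)), 2*ones(1, size(xTest, 2))];
net  = patternnet(10);
net  = train(net, xAll, ind2vec(tAll));
acc  = mean(vec2ind(net(xAll)) == tAll);              % accuracy near 1 =>
                                                      % sources are separable
```

If the classifier separates the sources easily, the test data is not drawn from the same distribution as the design data, which would explain good training and bad testing performance.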
Hope this helps.
Greg