Neural networks - How to use different datasets for training, validation and testing?

I have a question about neural networks in MATLAB.
First of all, I have a small NN: 2 inputs, 1 hidden layer with 10 neurons, and one output. This works fine. My question is: can I specify my training data, validation data and test data myself?
I know that if I use e.g. net = feedforwardnet(10); I can divide my overall dataset into e.g. 70/100, 15/100 and 15/100. But I don't want to do this, because in this case I want to train my NN with 1000 data points, validate it with another 1000 data points, and test it with another independent dataset of 1000 data points. In other words, I want to control these 3 independent datasets.
Thus, can someone help me?
Kind regards
Edit: I don't want to use a dataset with 3000 data points and set the divideParam ratios to 1/3, 1/3 and 1/3.

Accepted Answer

Greg Heath on 5 Apr 2015
Please clarify, because running multiple 1/3,1/3,1/3 designs for each trial value of H, the number of hidden nodes, is exactly the best way to approach the problem.
Clarification is needed because what you are asking for makes absolutely no sense to me.
If you want the data sets to be independent, you need 3000 examples.
The design set = training set + validation set, and these two must be used together. However, the validation set doesn't have to be nearly as large as the training set.
The test set can be used separately, after the validation set has been used to pick out the top contenders for the "best" nets.
However, there is no need to keep the test set separate because the training algorithm handles that automatically.
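The "multiple designs for each trial value of H" idea can be sketched roughly as follows. This is only an illustration, not necessarily the exact procedure meant above: the data x and t are synthetic placeholders, and the candidate sizes in Hvec and the trial count Ntrials are arbitrary assumptions. It requires the Neural Network (Deep Learning) Toolbox.

```matlab
% Sketch: several random 1/3,1/3,1/3 designs for each candidate H.
% x, t, Hvec and Ntrials are placeholders chosen for illustration only.
rng(0);                                % reproducible splits and initial weights
x = rand(2, 3000);                     % 2 inputs, 3000 examples (synthetic)
t = sin(2*pi*x(1,:)) + x(2,:).^2;      % 1 output (synthetic)

Hvec    = [2 5 10];                    % candidate numbers of hidden nodes
Ntrials = 3;                           % designs per candidate
bestVal = Inf;

for H = Hvec
    for k = 1:Ntrials
        net = feedforwardnet(H);
        net.divideFcn = 'dividerand';          % new random split on each call to train
        net.divideParam.trainRatio = 1/3;
        net.divideParam.valRatio   = 1/3;
        net.divideParam.testRatio  = 1/3;
        net.trainParam.showWindow  = false;    % suppress the training GUI
        [net, tr] = train(net, x, t);

        % Rank designs by validation error; train also records the
        % test-set error, but it is not used for the selection.
        if tr.best_vperf < bestVal
            bestVal  = tr.best_vperf;
            bestTest = tr.best_tperf;
            bestH    = H;
            bestNet  = net;
        end
    end
end

fprintf('Best H = %d, val MSE = %.4g, test MSE = %.4g\n', bestH, bestVal, bestTest);
```

The test error stored in tr.best_tperf is only reported here, never used to choose among designs, so it remains an unbiased estimate for the selected net.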
Anyway, the proposed 1/3,1/3,1/3 split with fixed sets of 1000 points each can be implemented. However, there is no good reason for it... or at least I cannot think of one.
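One way to implement a fixed split is the toolbox's index-based division function, divideind. The sketch below assumes the three sets are stored in separate variables (xTrain, xVal, xTest and matching targets, all hypothetical names filled with synthetic data here). The sets are concatenated only because train takes a single input matrix; which columns go where is set explicitly by index.

```matlab
% Sketch: explicit training/validation/test sets via divideind.
% All data below is synthetic and the variable names are placeholders.
rng(0);
f = @(x) sin(2*pi*x(1,:)) + x(2,:).^2;      % placeholder target function
xTrain = rand(2, 1000);  tTrain = f(xTrain);
xVal   = rand(2, 1000);  tVal   = f(xVal);
xTest  = rand(2, 1000);  tTest  = f(xTest);

% train expects one matrix, so stack the sets column-wise ...
x = [xTrain, xVal, xTest];
t = [tTrain, tVal, tTest];

net = feedforwardnet(10);
net.divideFcn = 'divideind';                % ... and divide by explicit indices
net.divideParam.trainInd = 1:1000;
net.divideParam.valInd   = 1001:2000;
net.divideParam.testInd  = 2001:3000;
net.trainParam.showWindow = false;          % suppress the training GUI

[net, tr] = train(net, x, t);

% The training record confirms which columns were used for what.
fprintf('train: %d, val: %d, test: %d points\n', ...
    numel(tr.trainInd), numel(tr.valInd), numel(tr.testInd));
fprintf('test MSE = %.4g\n', tr.best_tperf);
```

With this setup the validation set still drives early stopping, and the test columns are never used to update weights.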
Thank you for formally accepting my answer
Greg

More Answers (1)

SP on 8 Dec 2016
Edited: SP on 8 Dec 2016
I understand your frustration, and it's a problem I'm currently facing as well. The following is my current solution:
First, when training, train fully on the dataset you want:
net.divideParam.trainRatio = 100/100;
net.divideParam.valRatio = 0/100;
net.divideParam.testRatio = 0/100;
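Your training run should then stop automatically, either when it reaches the maximum number of epochs (default is 1000) or when the gradient falls below a minimum level (e.g. 1.00e-06; the default depends on the training function). Note that with no validation data there is no early stopping on validation error.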
Second, use
genFunction(net,pathname)
or something similar to generate a standalone function for your net.
Lastly, feed your test dataset into the generated function and calculate the accuracy afterwards.
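Put together, the three steps might look like the sketch below. The data is synthetic, and the function name myNetFcn is a hypothetical choice; genFunction writes myNetFcn.m into the current folder. The 'MatrixOnly' option is used so that the generated function accepts a plain matrix.

```matlab
% Sketch of the workflow above: train on one set only, generate a
% function, then evaluate a separate test set. Names and data are placeholders.
rng(0);
f = @(x) sin(2*pi*x(1,:)) + x(2,:).^2;      % placeholder target function
xTrain = rand(2, 1000);  tTrain = f(xTrain);
xTest  = rand(2, 1000);  tTest  = f(xTest);

net = feedforwardnet(10);
net.divideParam.trainRatio = 100/100;       % use every column for training
net.divideParam.valRatio   = 0/100;
net.divideParam.testRatio  = 0/100;
net.trainParam.showWindow  = false;         % suppress the training GUI
net = train(net, xTrain, tTrain);

% Write myNetFcn.m (hypothetical name) to the current folder.
genFunction(net, 'myNetFcn', 'MatrixOnly', 'yes');

% Evaluate the independent test set with the generated function.
yTest   = myNetFcn(xTest);
testMSE = mean((tTest - yTest).^2);
fprintf('test MSE = %.4g\n', testMSE);
```

If the generated function is not needed elsewhere, calling net(xTest) directly should give the same outputs.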
Hope this helps.
Cheers!
