Why does training perfomance change when a validation set is considered?

2 views (last 30 days)
Hello!
For example, I considered the input and output:
input=1:1:10 output=[1:2:15 24 24]
and then I try 3 different options:
OPTION 1 rand('twister',1) net = feedforwardnet(4); net.trainParam.epochs =3; net.divideFcn='divideind'; [net.divideParam.trainInd,net.divideParam.valInd,net.divideParam.testInd] = divideind(10,1:10); [net,tr,Y1,E1] = train(net,input,output);
OPTION 2 rand('twister',1) net = feedforwardnet(4); net.trainParam.epochs =3; net.divideFcn='divideind'; [net.divideParam.trainInd,net.divideParam.valInd,net.divideParam.testInd] = divideind(10,1:8,9:10); %net.divideParam.trainRatio=1;net.divideParam.valRatio=0;net.divideParam.testRatio=0; [net,tr,Y1,E1] = train(net,input,output);
OPTION 3 rand('twister',1) net = feedforwardnet(4); net.trainParam.epochs =3; net.divideFcn='divideind'; [net.divideParam.trainInd,net.divideParam.valInd,net.divideParam.testInd] = divideind(8,1:8); [net,tr,Y1,E1] = train(net,input(:,1:8),output(:,1:8));
The initialisations are similar, the all 3 options stopped because they reached the maximum epoch. I checked epoch=0 and the weights and bias are similar but the (training) performance isn't. And from epoch=0, everything is different when comparing the 3 options. If I don't change divideFcn and I consider the same experiments as before, using the same indices for training, I have the same problem. So it isn't because of divideind! I'd like to understand why this is happening. I checked the functions step by step. Could anyone help me? Thank you very much. Ana
  1 Comment
Greg Heath
Greg Heath on 5 Oct 2012
I took a prelimiary look. Something subtle is going on.
1. Option 1 is irrelevant.
2. I chose Nepochs = 1 and and rng(0) initialization.
3. The final weights for Options 2 & 3 are different (They shouldn't be).
I'll be baahk.
Aahnold.

Sign in to comment.

Accepted Answer

Greg Heath
Greg Heath on 28 Nov 2012
The difference in the last two results was completely caused by using
1) ... = train(net,input(:,1:8),output(:,1:8));
instead of
2) ... = train(net,input,output);
Verification: For each of these 2 syntaxes I ran 3 trials for one epoch with
a. divideind(10,1:8,9:10);
b. divideind(10,1:8);
c. divideind(8,1:8);
For each syntax the 3 trials yielded identical results.
The reason why probably lies in the code of train:
type train
Hope this helps.
Thank you for officially accepting my answer.
Greg

More Answers (1)

Zeeshan
Zeeshan on 27 Nov 2012
Hi,
I think because the data is divided randomly to check for validation of model, therefore some network may get trained better than the other because it was trained on a different set of data (randomly chosen training data).
I am also working on a comparison of architectures and I am going to fix the time points for each dataset for training and validation to compare them.
Regards,
Shan

Categories

Find more on Sequence and Numeric Feature Data Workflows in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!