
Asked by Morten on 30 Sep 2011

Hey

I am trying to implement a neural network with leave-one-out cross-validation. The problem is that each time I train the network I get a different result.

My code is:

-------

hiddenLayerSize = 10;

net = patternnet(hiddenLayerSize);

net.divideFcn = '';

[net] = train(net,inputs,targets);

testOut = net(validation);

[c,cm] = confusion(validationTarget,testOut); %cm

TP = cm(1,1); FN = cm(1,2); TN = cm(2,2); FP = cm(2,1);

fprintf('Sensitivity : %f%%\n', TP/(TP+FN)*100);

fprintf('Specificity : %f%%\n\n', TN/(TN+FP)*100);

-----------

Is it because train() uses different proportions of the input data each time? To rule that out, I have tried to avoid dividing the data into training, validation and test sets by setting net.divideFcn = ''. I have also tried setting net.divideParam.trainRatio = 100/100.

I have tried to set EW = 1, but it does not change anything.

Any suggestions?

Morten


Answer by Pawel Blaszczyk on 30 Sep 2011

Accepted answer

Try adding this command at the beginning of the script:

RandStream.setDefaultStream(RandStream('mt19937ar','seed',1));

Answer by Pawel Blaszczyk on 30 Sep 2011

This works because your net is initialized with random weights, so each training run starts from a different point. If you always start from the same weights, you will always get the same answer. The command above sets the same seed every time, so the rand() sequence is always identical.
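
The effect of a fixed seed can be checked without any toolbox. The following is a minimal sketch, assuming only base MATLAB; the seed values and variable names are arbitrary and chosen for illustration. Two streams built with the same generator and seed yield identical draws, while a different seed does not.

```matlab
% Two independent streams with the same generator type and seed,
% plus a third stream with a different seed for comparison.
s1 = RandStream('mt19937ar', 'Seed', 1);
s2 = RandStream('mt19937ar', 'Seed', 1);
s3 = RandStream('mt19937ar', 'Seed', 2);

a = rand(s1, 1, 5);
b = rand(s2, 1, 5);
c = rand(s3, 1, 5);

sameSeedMatches = isequal(a, b);   % same seed: identical sequence
diffSeedMatches = isequal(a, c);   % different seed: different sequence

fprintf('same seed identical: %d, different seed identical: %d\n', ...
        sameSeedMatches, diffSeedMatches);
```

In more recent MATLAB releases the global stream is set with RandStream.setGlobalStream or rng rather than RandStream.setDefaultStream.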

Greg Heath on 3 Oct 2011

The only purpose for resetting the RNG with a previous seed is to reproduce previous results. It should not be reset during a XVAL experiment. Resetting with a previous seed (even if the data partition is different) violates the implicit assumption of randomness.

Hope this helps.

Greg
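
One way to follow this advice and still be able to reproduce a single fold is to seed once before the experiment and record the generator state on entry to each fold. This is a sketch under stated assumptions, not code from the thread: it assumes a release that provides rng, numFolds is a hypothetical value, and the rand(3,1) call is only a stand-in for the random weight initialization that train would perform.

```matlab
numFolds  = 3;                 % hypothetical number of folds
rng(0);                        % seed ONCE, before the whole experiment
foldState = cell(1, numFolds);
initW     = cell(1, numFolds);

for k = 1:numFolds
    foldState{k} = rng;        % generator state on entry to fold k
    initW{k} = rand(3, 1);     % stand-in for random weight initialization
end

% Re-run fold 2 on its own by restoring its saved state, not by reseeding.
rng(foldState{2});
rerunW2 = rand(3, 1);

fold2Reproduced = isequal(rerunW2, initW{2});
foldsDiffer     = ~isequal(initW{1}, initW{2});
fprintf('fold 2 reproduced: %d, folds 1 and 2 differ: %d\n', ...
        fold2Reproduced, foldsDiffer);
```

Each fold still starts from a different random point, and any individual fold can be repeated exactly from its saved state.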

Answer by Morten on 30 Sep 2011

Now a new curious thing has occurred: when I run the cross-validation with, for example, two portions of data, the result seems to depend on the previous training.

This is output from MATLAB:

----

>> nn_test

subjectID =

1

Sensitivity : 85.185185%

Specificity : 93.684211%

subjectID =

2

Sensitivity : 41.176471%

Specificity : 97.549020%

>> nn_test

subjectID =

2

Sensitivity : 23.529412%

Specificity : 97.549020%

-------

In the first execution I validate on subjectID = 1 and train on subjectID = 2, and in the next loop I validate on subjectID = 2 and train on subjectID = 1.

In the second execution I start by validating on subjectID = 2 and training on subjectID = 1, which gives a different result from the second loop of the first execution, even though the training and validation data are the same. I ensure that all variables are cleared before each loop in the cross-validation. It is also curious that the specificities are the same while the sensitivities differ.

Answer by Greg Heath on 3 Oct 2011

I suspect that similar results are obtained because the same RNG seed is used.

See my previous comments about not resetting the seed.

How large is your data set? I assume your trn/tst split is 50/50, and that you are using 2-fold XVAL without a validation set.

See my previous comments on the difference between validation and testing.

Hope this helps.

Greg

Morten on 3 Oct 2011

Since I have set net.divideFcn = '', it is not possible to set net.divideParam.*. But I would like to use the input data only for training, not for testing. I do not understand this test set; I have not seen it used in any other classifier I have worked with.

Morten on 3 Oct 2011

I have 10 subjects, so I would like to do a 10-fold xval, and in each fold I use the following:

---------

hiddenLayerSize = 10;

net = patternnet(hiddenLayerSize);

net.divideFcn = '';

[net] = train(net,inputs,targets);

testOut = net(validation);

---------

"inputs" is 9/10 of my data and "validation" is 1/10 of the data. If I xval 10 times I get 10 different "testOut" results. The problem is that if I run a single xval with subject 2 as validation and the rest as training, I should get the same "testOut" as when I xval 10 times and look at the validation for subject 2, but I do not.

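The leave-one-subject-out partition described above can be sketched as follows, using only base MATLAB. The data sizes, the random feature matrix and the names subjectID, allInputs and allTargets are hypothetical; the training call from the thread is left as a comment so the sketch runs without the toolbox. A plausible reason a fold run on its own differs from the same fold inside the full loop is that earlier folds consume random numbers, so the weight initialization starts from a different generator state.

```matlab
numSubjects       = 10;    % hypothetical: 10 subjects
samplesPerSubject = 4;     % hypothetical: 4 samples per subject

% One subject label per column (sample): 1 1 1 1 2 2 2 2 ...
subjectID  = kron(1:numSubjects, ones(1, samplesPerSubject));
allInputs  = rand(3, numel(subjectID));        % 3 features per sample
allTargets = double(rand(1, numel(subjectID)) > 0.5);

valCount   = zeros(1, numSubjects);
trainCount = zeros(1, numSubjects);

for s = 1:numSubjects
    isVal = (subjectID == s);                  % hold out subject s

    inputs           = allInputs(:, ~isVal);   % 9/10 of the data
    targets          = allTargets(:, ~isVal);
    validation       = allInputs(:, isVal);    % 1/10 of the data
    validationTarget = allTargets(:, isVal);

    % Training would go here, for example:
    %   net = patternnet(10); net.divideFcn = '';
    %   net = train(net, inputs, targets); testOut = net(validation);

    valCount(s)   = size(validation, 2);
    trainCount(s) = size(inputs, 2);
end

fprintf('validation samples per fold: %d, training samples per fold: %d\n', ...
        valCount(1), trainCount(1));
```
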
Answer by faramarz sa on 22 Oct 2013

Edited by faramarz sa on 22 Oct 2013

The MATLAB Neural Network Toolbox gives different results on each run for two reasons: (1) random data division and (2) random weight initialization.

To fix the data division problem, use the function "divideblock" or "divideint" instead of "dividerand", like this:

net.divideFcn='divideblock'; net.divideParam.trainRatio=.7; net.divideParam.valRatio=.15; net.divideParam.testRatio=.15;

For the random weight initialization problem, it seems (I'm not sure) that most MATLAB initialization functions ("initlay", "initwb", "initnw") involve randomness; "initzero" is the exception, since it sets all weights and biases to zero. So you should force these functions to produce the same results on every call by fixing the global random stream:

RandStream.setGlobalStream(RandStream('mrg32k3a','Seed',1234));

And then use one of them:

net.initFcn='initlay'; net.layers{i}.initFcn='initnw';
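
Both fixes can be combined and checked by training twice and comparing the resulting weights. The following is a sketch, not a definitive recipe: it assumes the Neural Network Toolbox is installed and training runs on the CPU without parallel workers, and the synthetic data, layer size and seeds are illustrative choices that do not come from the thread.

```matlab
% Synthetic two-class data from a private stream, so the data do not
% depend on the global generator state (values are illustrative).
dataStream = RandStream('mt19937ar', 'Seed', 42);
x     = rand(dataStream, 4, 200);
isPos = sum(x, 1) > 2;
t     = double([isPos; ~isPos]);       % one row per class

wb = cell(1, 2);
for run = 1:2
    % Reset the global stream so weight initialization repeats exactly.
    RandStream.setGlobalStream(RandStream('mrg32k3a', 'Seed', 1234));

    net = patternnet(10);
    net.divideFcn = 'divideblock';     % deterministic contiguous blocks
    net.divideParam.trainRatio = 0.70;
    net.divideParam.valRatio   = 0.15;
    net.divideParam.testRatio  = 0.15;
    net.trainParam.showWindow  = false;   % no training GUI

    net = train(net, x, t);
    wb{run} = getwb(net);              % all weights and biases as one vector
end

sameWeights = isequal(wb{1}, wb{2});
fprintf('identical weights after both runs: %d\n', sameWeights);
```

Note that this resets the seed to reproduce a result; it is not something to do inside each fold of a cross-validation loop.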

## 1 Comment


Terminology:

Data = DesignSet + TestSet

DesignSet = TrainingSet + ValidationSet

DesignSet: Used iteratively to determine final design parameters (No. of hidden nodes, No. of epochs, Weight values, etc)

TrainingSet: Used to estimate weights

ValidationSet: Iterative performance estimates used to select final design parameters. Generally, final validation performance is biased because of iterative feedback between validation and testing.

TestSet: Used once and only once to estimate unbiased generalization performance (i.e., performance on unseen nondesign data).

If TestSet performance is unsatisfactory and additional designing is desired, Data should be repartitioned to mitigate feedback biasing.

There are several different ways to use cross validation (XVAL). The most important principle is that final performance estimate biasing can be mitigated by using a test set that was in no way used to determine design parameters.

Hope this helps.

Greg
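
The terminology above can be sketched as an index partition in base MATLAB. The sample count and the 70/15/15 proportions are hypothetical, and the fixed seed is there only to make the sketch repeatable. The test indices are set aside first and never touched while the design set is split into training and validation parts.

```matlab
rng(0);                          % fixed seed only so the sketch is repeatable
N    = 100;                      % hypothetical number of samples
perm = randperm(N);

numTest = round(0.15 * N);       % held out once, never used for design
numVal  = round(0.15 * N);

testIdx   = perm(1:numTest);         % TestSet
designIdx = perm(numTest+1:end);     % DesignSet = TrainingSet + ValidationSet
valIdx    = designIdx(1:numVal);     % ValidationSet
trainIdx  = designIdx(numVal+1:end); % TrainingSet

fprintf('train/val/test sizes: %d/%d/%d\n', ...
        numel(trainIdx), numel(valIdx), numel(testIdx));
```

If the test performance is unsatisfactory and the design is revisited, drawing a new permutation repartitions the data, as recommended above.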