Custom performance vectors for neural network training

I'm working on pattern recognition using MATLAB's built-in Neural Network Toolbox. I've used the toolbox to generate code and have successfully implemented it in a working GUI. The problem I'm trying to solve now is letting the user select vectors for validation and testing from a file. For example, I'm training the network to recognize the 4 letters "ABCD". I've been reading in the documentation that validation samples are used to measure network generalization, i.e., to find out how my network would perform on data it has never seen before. There are also testing samples, which are used to give an independent measure of network performance during and after training and to determine when to stop training.
I would still like to use these. A workaround is to combine my training, validation, and testing vectors into one matrix, use that as my training matrix, and then use the code below to separate the vectors back out. Train, Val, and Test can be determined by calling size() on each original matrix (training, validation, and testing). The matrix data contains the original training, validation, and testing vectors column-wise.
% Setup Division of Data for Training, Validation, Testing
% For a list of all data division functions type: help nndivide
net.divideFcn = 'dividerand'; % Divide data randomly
net.divideMode = 'sample'; % Divide up every sample
net.divideParam.trainRatio = Train/size(data,2);
net.divideParam.valRatio = Val/size(data,2);
net.divideParam.testRatio = Test/size(data,2);
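For reference, here is a minimal sketch of how I build data and the subset sizes above (trainData, valData, and testData are just placeholder names for my original matrices):
% Combine the original subsets column-wise and record their sizes
data = [trainData, valData, testData]; % samples stored as columns
Train = size(trainData,2); % number of training samples
Val = size(valData,2); % number of validation samples
Test = size(testData,2); % number of testing samples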
The one problem I see with this is decimal ratios. For example, let the original training, validation, and testing vectors each be 10x1 (the rows contain the numbers fed to the network; the columns are the sample sets). That would make trainRatio, valRatio, and testRatio all equal to 3.333333333333333e-01. I'm not sure whether MATLAB will split the data into three parts without throwing an error because of the decimal.
Any thoughts on this or workarounds?

Accepted Answer

Greg Heath on 10 May 2013
Edited: Greg Heath on 10 May 2013
% I've been reading in the documentation that validation samples are used to measure network generalization, i.e., to find out how my network would perform on data it has never seen before.
NO. See Below.
% There are also testing samples, which are used to give an independent measure of network performance during and after training and to determine when to stop training.
1. total = design + test
2. design = training + validation
3. training:
a. used to obtain weight values given the training parameters.
b. training error estimates tend to be extremely biased as the number of unknown weights, Nw, increases toward the number of training equations, Ntrneq.
c. Ndof = Ntrneq - Nw is the number of estimation degrees of freedom (see Wikipedia). As long as Ndof is sufficiently positive, the bias of estimating error with training data can be mitigated, somewhat, by using the degree-of-freedom adjustment of dividing SSEtrn by Ndof instead of Ntrneq (a short numeric sketch of this appears after this list).
4. validation:
a. used repeatedly with the training set to determine a good set of training parameters (especially the stopping epoch) via choosing the best of multiple random initial weight designs.
b. Validation set error tends to be much less biased than training set error, especially if training doesn't stop because of validation error convergence.
5. test:
a. used once, and only once, to obtain an unbiased error estimate on nontraining data.
b. if performance is unsatisfactory and more designs are necessary, the data should be repartitioned into new tr/val/tst subsets.
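A minimal numeric sketch of the degree-of-freedom adjustment mentioned in 3c, assuming a single-hidden-layer net with H hidden nodes and column-wise training data xtrn/ttrn (the names and values here are only illustrative):
% Degree-of-freedom adjusted training error (illustrative, single hidden layer)
[I, Ntrn] = size(xtrn); % inputs x training samples
[O, ~] = size(ttrn); % outputs x training samples
H = 10; % hidden nodes (example value)
Ntrneq = Ntrn*O; % number of training equations
Nw = (I+1)*H + (H+1)*O; % number of unknown weights
Ndof = Ntrneq - Nw; % estimation degrees of freedom
SSEtrn = sum(sum((ttrn - net(xtrn)).^2)); % training sum-squared error
MSEtrn = SSEtrn/Ntrneq; % biased estimate
MSEtrna = SSEtrn/Ndof; % degree-of-freedom adjusted estimate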
% I would still like to use these. A workaround is to combine my training, validation, and testing vectors into one matrix.
No. This is not a workaround; 'dividerand' is the default.
MATLAB uses
Ntst = round(tstratio*N)
Nval = round(valratio*N)
Ntrn = N - Nval - Ntst.
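For example, a quick check of that rounding with three equal subsets of 10 samples each (illustrative numbers only):
N = 30; % total number of samples
tstratio = 1/3; valratio = 1/3; % = 3.3333...e-01 each
Ntst = round(tstratio*N) % = 10
Nval = round(valratio*N) % = 10
Ntrn = N - Nval - Ntst % = 10, so the decimal ratios cause no error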
The training record, tr, contains the indices for each subset.
[net, tr, Y, E] = train(net, x, t);
Y is the output and E is the error, E = t - Y.
If you want, you can use dividerand anytime before training to obtain the indices, then assign those indices to the net by using divideind.
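For example, a minimal sketch of that approach (x, t, and the 1/3 ratios are assumptions for illustration):
% Pick the subset indices up front, then fix them with divideind
N = size(x,2); % total number of samples (columns)
[trainInd, valInd, testInd] = dividerand(N, 1/3, 1/3, 1/3);
net.divideFcn = 'divideind'; % divide by explicit index lists
net.divideParam.trainInd = trainInd;
net.divideParam.valInd = valInd;
net.divideParam.testInd = testInd;
[net, tr] = train(net, x, t); % tr.trainInd, tr.valInd, tr.testInd record this split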
Hope this helps.
*Thank you for formally accepting my answer
Greg
  1 Comment
Harold on 10 May 2013
Thank you Greg, I will have to try dividerand before training like you suggested. As of right now, I do not do any kind of validation or testing.

