Thread Subject:
CROSSVALIDATION TRAINING FOR NEURAL NETWORKS

Subject: CROSSVALIDATION TRAINING FOR NEURAL NETWORKS

From: Greg Heath

Date: 15 Feb, 2013 21:08:06

Message: 1 of 2

Although f-fold XVAL is a very good and popular way of designing and
testing NNs, there is no MATLAB NNTBX function for doing it. I have
done it for classification and regression by brute force using for loops.
Unfortunately, that code is no longer available since my old computer crashed.
Nevertheless, the coding was straightforward because random weights
are assigned automatically when a net is created with the obsolete functions
NEWPR, NEWFIT or NEWFF. With the current functions FITNET, PATTERNNET
and FEEDFORWARDNET, you either have to use a separate CONFIGURE step
or let the TRAIN function initialize the weights.
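
For concreteness, here is a minimal sketch (not from the original post) of the
brute-force loop with the current functions: fold indices are assigned at random,
DIVIDEIND feeds them to the net, and TRAIN initializes the weights. The data set,
fold count and hidden layer size are illustrative assumptions.

[x, t] = simplefit_dataset;            % example data shipped with the toolbox (assumption)
N  = size(x, 2);
f  = 5;                                % number of folds (assumption)
Nh = 10;                               % hidden layer size (assumption)
foldId = mod(randperm(N), f) + 1;      % random assignment of samples to folds

testPerf = zeros(1, f);
for i = 1:f                            % fold i held out for testing
    j = mod(i, f) + 1;                 % the next fold held out for validation
    net = fitnet(Nh);                  % TRAIN will initialize the weights
    net.divideFcn            = 'divideind';
    net.divideParam.trainInd = find(foldId ~= i & foldId ~= j);
    net.divideParam.valInd   = find(foldId == j);
    net.divideParam.testInd  = find(foldId == i);
    [net, tr] = train(net, x, t);
    testPerf(i) = tr.best_tperf;       % test set error at the validation stopping point
end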

I did add a few modifications. With f >= 3 partitions there are F =
f*(f-1)/2 (= 10 for f = 5) ways to choose a holdout nontraining pair for
validation and testing. So, for each of the F (= 10) nets I trained until
BOTH holdout set errors were minimized. I then obtained two nontraining
error estimates: the error of holdout2 at the minimum error of
holdout1, and vice versa. Therefore, for f = 5 I get 2*F = 20 holdout
error estimates, which is a reasonable number for obtaining
min/median/mean/std/max (or even histogram) summary statistics.
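
As a hedged sketch of that pairing scheme (again with an assumed data set and
sizes): enumerate the F = f*(f-1)/2 validation/test pairs, read both holdout
error curves from the training record, and summarize the 2*F estimates.

[x, t] = simplefit_dataset;                % example data (assumption)
N  = size(x, 2);
f  = 5;  Nh = 10;                          % assumed fold count and hidden layer size
foldId = mod(randperm(N), f) + 1;
pairs  = nchoosek(1:f, 2);                 % F = f*(f-1)/2 = 10 holdout pairs
F = size(pairs, 1);
E = zeros(F, 2);                           % two nontraining estimates per net

for k = 1:F
    i = pairs(k, 1);  j = pairs(k, 2);
    net = fitnet(Nh);
    net.trainParam.max_fail  = 20;         % train longer so both holdout minima can appear (assumption)
    net.divideFcn            = 'divideind';
    net.divideParam.trainInd = find(foldId ~= i & foldId ~= j);
    net.divideParam.valInd   = find(foldId == i);       % holdout 1
    net.divideParam.testInd  = find(foldId == j);       % holdout 2
    [net, tr] = train(net, x, t);
    [~, bv] = min(tr.vperf);  E(k, 1) = tr.tperf(bv);   % holdout2 error at the holdout1 minimum
    [~, bt] = min(tr.tperf);  E(k, 2) = tr.vperf(bt);   % holdout1 error at the holdout2 minimum
end
summary = [min(E(:)) median(E(:)) mean(E(:)) std(E(:)) max(E(:))]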

However, I did get a query from one statistician who felt that the
multiplication factor of 2*F/f = f-1 (= 4 for f = 5) somehow introduces bias.
The factor of f-1 comes from using f-1 different validation sets with each
of the f test sets. However, the f-1 different validation sets correspond to
f-1 different training sets, so I don't worry about it.

In addition, for each of the F nets you can run Ntrials different weight
initializations. So, for f = 5 and Ntrials = 5, you can get 2*F*Ntrials = 100
error estimates.
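
The reinitialization step itself is simple; a small sketch (data set and sizes
assumed): either create a new net each trial and let TRAIN configure it, or
call CONFIGURE and INIT explicitly.

[x, t] = simplefit_dataset;       % example data (assumption)
Nh = 10;  Ntrials = 5;            % assumed sizes
net = fitnet(Nh);
net = configure(net, x, t);       % explicit step: sets sizes and assigns random weights
perf = zeros(1, Ntrials);
for n = 1:Ntrials
    net = init(net);              % fresh random initial weights for each trial
    [net, tr] = train(net, x, t); % default DIVIDERAND split, just for illustration
    perf(n) = tr.best_tperf;
end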

With time series, preserving order and uniform spacing is essential. I
have only used f-fold XVAL with f = 3 and either the DIVIDEBLOCK or
DIVIDEINT type of data division. In the latter case the spacing within
each subset is tripled, so success will depend on the significant
lags of the auto- and/or cross-correlation functions. The trick of two
holdout error estimates per net still works, so you can get
2*Ntrials error estimates. I can't see any other way to preserve
order and uniform spacing.
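
A minimal sketch of the order-preserving division for a time series net
(the open-loop NAR example series and sizes are assumptions, not from the post):
DIVIDEBLOCK keeps the three subsets contiguous, and the two-holdout reading
of the training record still applies.

T = simplenar_dataset;                 % example series shipped with the toolbox (assumption)
net = narnet(1:2, 10);                 % feedback delays and hidden size (assumption)
[Xs, Xi, Ai, Ts] = preparets(net, {}, {}, T);
net.divideFcn = 'divideblock';         % contiguous blocks preserve order and spacing
net.divideParam.trainRatio = 1/3;      % f = 3: three equal blocks
net.divideParam.valRatio   = 1/3;
net.divideParam.testRatio  = 1/3;
[net, tr] = train(net, Xs, Ts, Xi, Ai);
[~, bv] = min(tr.vperf);  e1 = tr.tperf(bv);   % holdout2 error at the holdout1 minimum
[~, bt] = min(tr.tperf);  e2 = tr.vperf(bt);   % holdout1 error at the holdout2 minimum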

The MATLAB commands

lookfor crossvalidation
lookfor 'cross validation'
lookfor validation

may turn up functions from other toolboxes that could be of use.
However, I think the use of two holdout nontraining subsets is
unique to neural network training.
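
For example, if the Statistics Toolbox is available, CVPARTITION can generate
the fold indices that feed DIVIDEIND (an illustration of that idea, not
something from the original post):

[x, t] = simplefit_dataset;            % example data (assumption)
N = size(x, 2);
cvp = cvpartition(N, 'KFold', 5);      % fold generator from the Statistics Toolbox
i = 1;  j = 2;                         % one fold for validation, another for testing
net = fitnet(10);
net.divideFcn            = 'divideind';
net.divideParam.valInd   = find(test(cvp, i));
net.divideParam.testInd  = find(test(cvp, j));
net.divideParam.trainInd = find(~test(cvp, i) & ~test(cvp, j));
[net, tr] = train(net, x, t);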

Hope this helps.

Greg
