Newsgroups: comp.soft-sys.matlab,comp.ai.neural-nets
Date: Fri, 15 Feb 2013 13:27:19 -0800 (PST)
References: <kfm83m$p3n$1@newscl01ah.mathworks.com>
Message-ID: <2acac421-fcb9-445a-aa38-567df1d6dab5@hl5g2000vbb.googlegroups.com>
Subject: Re: CROSSVALIDATION TRAINING FOR NEURAL NETWORKS
From: Greg Heath <g.heath@verizon.net>

On Feb 15, 4:08pm, "Greg Heath" <he...@alumni.brown.edu> wrote:
> Although f-fold XVAL is a very good and popular way of designing and
> testing NNs, there is no MATLAB NNTBX function for doing so. I have
> done it for classification and regression by brute force using for loops.
> Unfortunately, that code is not available since my old computer crashed.
> Nevertheless, the coding was straightforward because random weights
> are automatically assigned when a net is created with the obsolete
> functions NEWPR, NEWFIT or NEWFF. With the current functions FITNET,
> PATTERNNET and FEEDFORWARDNET, you either have to use a separate step
> with CONFIGURE or let the TRAIN function initialize the weights.
>
> I did add a few modifications. With f >= 3 partitions there are F =
> f*(f-1)/2 ( = 10 for f = 5) ways to choose a holdout nontraining pair for
> validation and testing. So, for each of the F ( = 10) nets I trained until
> BOTH holdout set errors were minimized. I then obtained 2 nontraining
> estimates for error: The error of holdout2 at the minimum error of
> holdout1 and vice versa. Therefore for f = 5, I get 2*F = 20 holdout
> error estimates which is a reasonable value for obtaining
> min/median/mean/std/max (or even histogram) summary statistics.
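
The bookkeeping in the paragraph above can be sketched as follows. This is only an illustration in plain Python, not NNTBX code: the function names (holdout_plan, cross_estimates) and the per-epoch error lists are assumptions made for the example. It enumerates the F = f*(f-1)/2 holdout pairs and reads each holdout error at the epoch where the other holdout error is smallest.

```python
from itertools import combinations

def holdout_plan(f):
    """List every unordered pair of folds (0-based) used as holdouts,
    together with the f-2 folds left for training."""
    plan = []
    for a, b in combinations(range(f), 2):
        train = [k for k in range(f) if k not in (a, b)]
        plan.append({"holdouts": (a, b), "train": train})
    return plan

def cross_estimates(err1, err2):
    """Given per-epoch errors of holdout1 and holdout2 for ONE net, return
    (error of holdout2 at the minimum of holdout1,
     error of holdout1 at the minimum of holdout2)."""
    i1 = min(range(len(err1)), key=err1.__getitem__)  # epoch minimizing holdout1
    i2 = min(range(len(err2)), key=err2.__getitem__)  # epoch minimizing holdout2
    return err2[i1], err1[i2]

plan = holdout_plan(5)
n_nets = len(plan)          # F = 10 nets for f = 5
n_estimates = 2 * n_nets    # two nontraining estimates per net
print(n_nets, n_estimates)  # prints: 10 20
```

Each net therefore contributes two estimates, one with each holdout subset playing the validation role while the other plays the test role.
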
>
> However, I did get a query from one statistician who felt that somehow
> the multiplication factor of 2*F/f = (f-1) ( = 4 for f = 5) is biased. The
> factor of f-1 is from the use of f-1 different validation sets for each
> of f test sets. However, the f-1 different validation sets correspond to
> f-1 different training sets, so I don't worry about it.
>
> In addition, for each of the F nets you can run Ntrials different weight
> initializations. So, for f = 5, Ntrials = 5 you can get 2*F*Ntrials = 100
> error estimates.
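
A rough sketch of the double loop over holdout pairs and weight initializations, again in plain Python. The callable train_and_score is hypothetical: it stands in for whatever code trains one net and returns its two holdout error estimates. The placeholder scorer below returns zeros only so that the counting can be checked; it is not real data.

```python
from itertools import combinations
import statistics

def collect_estimates(f, ntrials, train_and_score):
    """Loop over all holdout pairs and all weight initializations,
    gathering two holdout error estimates per trained net."""
    estimates = []
    for a, b in combinations(range(f), 2):
        for trial in range(ntrials):
            # Hypothetical callable: trains one net with folds a and b held out.
            e1, e2 = train_and_score(holdouts=(a, b), seed=trial)
            estimates.extend([e1, e2])
    return estimates

def summarize(x):
    """min/median/mean/std/max summary of a list of error estimates."""
    return {
        "min": min(x),
        "median": statistics.median(x),
        "mean": statistics.mean(x),
        "std": statistics.stdev(x),
        "max": max(x),
    }

def placeholder_scorer(holdouts, seed):
    """Placeholder only: a real version would train a net and return its errors."""
    return 0.0, 0.0

estimates = collect_estimates(5, 5, placeholder_scorer)
print(len(estimates))  # prints: 100
```

For f = 5 and Ntrials = 5 this trains 50 nets and collects two estimates from each.
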
>
> With timeseries, preserving order and uniform spacing is essential. I
> have only used f-fold XVAL with f = 3 and either DIVIDEBLOCK or
> DIVIDEINT types of data division. In the latter case the spacing is
> tripled and success will depend on the difference of the significant
> lags of the auto and/or cross correlation functions. The trick of two
> holdout error estimates per net still works. Therefore, you can get
> 2*Ntrials error estimates. I can't see any other way to preserve
> order and uniform spacing.
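
To make the two index patterns concrete, here is a small sketch in plain Python. It only mimics the idea of contiguous-block versus interleaved division for f = 3; it is not the toolbox implementation, and the function names are made up for the example.

```python
def block_split(n, f=3):
    """Contiguous blocks: order and the original spacing are kept in each subset."""
    size = n // f
    return [list(range(k * size, (k + 1) * size if k < f - 1 else n))
            for k in range(f)]

def interleaved_split(n, f=3):
    """Interleaved indices: order is kept, but the spacing inside each
    subset becomes f times the original spacing."""
    return [list(range(k, n, f)) for k in range(f)]

blocks = block_split(9)
inter = interleaved_split(9)
print(blocks)  # prints: [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
print(inter)   # prints: [[0, 3, 6], [1, 4, 7], [2, 5, 8]]
```

With the interleaved pattern every subset is still uniformly spaced, which is why it remains usable for time series as long as the tripled spacing does not skip over the significant lags.
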
>
> The MATLAB commands
>
> lookfor crossvalidation
> lookfor 'cross validation'
> lookfor validation
>
> may yield functions from other toolboxes which may be of use.
> However, I think the use of 2 holdout nontraining subsets is
> unique to neural network training.
>
> Hope this helps.
>
> Greg