From: Greg Heath <g.heath@verizon.net>
Newsgroups: comp.soft-sys.matlab,comp.ai.neural-nets
Subject: Re: CROSSVALIDATION TRAINING FOR NEURAL NETWORKS
Date: Fri, 15 Feb 2013 13:27:19 -0800 (PST)

On Feb 15, 4:08pm, "Greg Heath" <he...@alumni.brown.edu> wrote:
> Although f-fold XVAL is a very good and popular way for designing and
> testing NNs, there is no MATLAB NNTBX function for doing so. I have
> done it for classification and regression by brute force using for loops.
> Unfortunately, that code is not available since my old computer crashed.
> Nevertheless, the coding was straightforward because random weights
> are automatically assigned when a net is created with the obsolete
> functions NEWPR, NEWFIT or NEWFF.
> With the current functions FITNET, PATTERNNET
> and FEEDFORWARDNET, you either have to use a separate step with
> CONFIGURE or let the TRAIN function initialize the weights.
>
> I did add a few modifications. With f >= 3 partitions there are F =
> f*(f-1)/2 ( = 10 for f = 5) ways to choose a holdout nontraining pair for
> validation and testing. So, for each of F ( = 10) nets I trained until
> BOTH holdout set errors were minimized. I then obtained 2 nontraining
> estimates for error: the error of holdout2 at the minimum error of
> holdout1 and vice versa. Therefore for f = 5, I get 2*F = 20 holdout
> error estimates which is a reasonable value for obtaining
> min/median/mean/std/max (or even histogram) summary statistics.
>
> However, I did get a query from one statistician who felt that somehow
> the multiplication factor of 2*F/f = (f-1) ( = 4 for f = 5) is biased. The
> factor of f-1 is from the use of f-1 different validation sets for each
> of f test sets. However, the f-1 different validation sets correspond to
> f-1 different training sets, so I don't worry about it.
>
> In addition, for each of the F nets you can run Ntrials different weight
> initializations. So, for f = 5, Ntrials = 5 you can get 2*F*Ntrials = 100
> error estimates.
>
> With timeseries, preserving order and uniform spacing is essential. I
> have only used f-fold XVAL with f = 3 and either DIVIDEBLOCK or
> DIVIDEINT types of data division. In the latter case the spacing is
> tripled and success will depend on the difference of the significant
> lags of the auto and/or cross correlation functions. The trick of two
> holdout error estimates per net still works. Therefore, you can get
> 2*Ntrials error estimates. I can't see any other way to preserve
> order and uniform spacing.
>
> The MATLAB commands
>
>     lookfor crossvalidation
>     lookfor 'cross validation'
>     lookfor validation
>
> may yield functions from other toolboxes which may be of use.
> However I think the use of 2 holdout nontraining subsets is
> unique to neural network training.
>
> Hope this helps.
>
> Greg
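
For anyone who wants to check the counting in the quoted post, here is a minimal sketch of the holdout-pair bookkeeping in base MATLAB. It is not the original code (which the post says was lost) and it trains nothing. The values N, f and Ntrials, the random partition, and all variable and field names are assumptions made for illustration.

```matlab
% Sketch only: enumerate the holdout pairs for f-fold XVAL with two
% nontraining subsets per net. All names and values are illustrative.
N       = 100;   % hypothetical number of samples
f       = 5;     % number of partitions (folds)
Ntrials = 5;     % weight initializations per net

% Random, equal-sized assignment of samples to folds.
% (Not suitable for timeseries, where order and spacing must be preserved.)
foldId = mod(randperm(N), f) + 1;

pairs = nchoosek(1:f, 2);     % each row: one unordered holdout pair
F     = size(pairs, 1);       % F = f*(f-1)/2

splits = cell(F, 1);
for k = 1:F
    h1 = pairs(k, 1);
    h2 = pairs(k, 2);
    s.holdout1Ind = find(foldId == h1);
    s.holdout2Ind = find(foldId == h2);
    s.trainInd    = find(foldId ~= h1 & foldId ~= h2);
    splits{k} = s;
end

numEstimates       = 2 * F;            % two holdout estimates per net
numEstimatesTrials = 2 * F * Ntrials;  % with Ntrials initializations each

fprintf('F = %d nets, %d estimates, %d with %d trials per net\n', ...
        F, numEstimates, numEstimatesTrials, Ntrials);
```

With the toolbox available, each entry of splits could presumably be handed to a net by setting net.divideFcn = 'divideind' and filling net.divideParam.trainInd, valInd and testInd from the three index vectors. The rule for reading off the two holdout errors (each one at the minimum of the other) would still have to be coded by hand from the training record.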