From: Greg Heath <heath@alumni.brown.edu>
Newsgroups: comp.soft-sys.matlab
Subject: Re: crossvalind -- size of training/testing set?
Date: Tue, 3 Feb 2009 10:25:00 -0800 (PST)


On Feb 2, 3:43 pm, "Sophia " <scyud...@mit.edu> wrote:
> Thanks for your responses. I can give you the dataset size, but a bigger
> question that I have in this context is -- why does it work fine with the
> same dataset size using "[train, test] = crossvalind('holdOut', groups);",
> while explicitly specifying the training set size seems to require a whole
> lot more memory?

Do both 'holdOut' and 'HoldOut' work?


> I will double check the documentation, but I couldn't seem to find any
> info regarding which data subdivision P corresponds to -- is P the
> proportion of data going to training, or to testing?

When Method = 'HoldOut', P = the proportion held out, i.e. the
fraction of the data assigned to the test set, not the training set.
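
To make the role of P concrete, here is a minimal sketch of a hold-out
split written in base MATLAB, so it runs without any toolbox. N is the
dataset size quoted in this thread; the value of P and the rounding rule
are assumptions made for illustration and are not taken from the
crossvalind source.

```matlab
% Sketch of a hold-out split in base MATLAB (no toolbox needed).
% P is the fraction of observations held out for testing.
N    = 5000;                  % dataset size quoted in this thread
P    = 0.3;                   % assumed hold-out proportion (illustrative)
Ntst = round(P * N);          % rounding rule is an assumption of this sketch

idx  = randperm(N);           % random permutation of 1..N
test = false(N, 1);
test(idx(1:Ntst)) = true;     % held-out (test) observations
train = ~test;                % everything else is used for training

fprintf('Ntrn = %d, Ntst = %d\n', nnz(train), nnz(test));

% With the Bioinformatics Toolbox installed, the corresponding call is
%   [train, test] = crossvalind('HoldOut', N, P);
```

So a larger P means a larger test set and a smaller training set.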

> The dataset size is 5000.

How many classes and how many input variables? As you
can see from my previous posts

greg-heath pretraining advice
greg-heath Neq Nw

unless you are using overtraining mitigation, the minimum
size of Ntrn is determined by the number of inputs, hidden
nodes and classes.
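
As a rough illustration of that dependence, the sketch below counts
unknown weights (Nw) against training equations (Neq). It assumes a
single-hidden-layer network with bias terms and one output node per
class; that architecture and the values of I, H and c are hypothetical,
chosen only to show the arithmetic.

```matlab
% Counting sketch: training equations (Neq) versus unknown weights (Nw).
% Assumes an I-H-c network with one hidden layer, biases on both layers,
% and one output per class. I, H and c are hypothetical values.
I = 10;                            % number of input variables (assumed)
H = 20;                            % number of hidden nodes (assumed)
c = 3;                             % number of classes (assumed)

Nw      = (I + 1) * H + (H + 1) * c;   % weights incl. biases
NtrnMin = ceil(Nw / c);                % smallest Ntrn with Neq >= Nw

Ntrn = 3500;                       % e.g. 70% of the 5000 cases
Neq  = Ntrn * c;                   % one equation per case per output

fprintf('Nw = %d, minimum Ntrn = %d, Neq/Nw = %.1f\n', ...
        Nw, NtrnMin, Neq / Nw);
```

Without overtraining mitigation, Neq should be at least Nw and
preferably much larger, which is what sets the lower bound on Ntrn.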

Hope this helps.

Greg