Path: news.mathworks.com!not-for-mail
From: "giannis " <fanzio12@yahoo.co.uk>
Newsgroups: comp.soft-sys.matlab
Subject: Re: small data set
Date: Mon, 5 May 2008 21:59:04 +0000 (UTC)
Organization: University of Sussex
Lines: 75
Message-ID: <fvnvv8$jgg$1@fred.mathworks.com>
References: <fvc63i$qfc$1@fred.mathworks.com> <afd7b073-23ba-496a-865f-b5655a22c64e@f36g2000hsa.googlegroups.com> <9b4c2a53-7f64-42a4-a546-5a8e0f9e2cb9@k13g2000hse.googlegroups.com>
Reply-To: "giannis " <fanzio12@yahoo.co.uk>
NNTP-Posting-Host: webapp-02-blr.mathworks.com
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 8bit
X-Trace: fred.mathworks.com 1210024744 19984 172.30.248.37 (5 May 2008 21:59:04 GMT)
X-Complaints-To: news@mathworks.com
NNTP-Posting-Date: Mon, 5 May 2008 21:59:04 +0000 (UTC)
X-Newsreader: MATLAB Central Newsreader 376695
Xref: news.mathworks.com comp.soft-sys.matlab:466788


Hello Greg,

Thank you for all your help.

I have data from 25 people. 20 of them have lung cancer and
5 don't. I have 6 different characteristics for each person
(so the array is 25x6).

The tasks are to produce two classifiers:
1st: to classify between a constant value (2 outputs)
2nd: to classify the stage of cancer, 0, 1, 2, 3 or 4 (5
outputs)

I tried SVM, linear regression, backpropagation and
RBF neural nets, and KNN.

I resampled my data using leave-one-out cross-validation
(LOOCV), each time keeping one sample for testing and
24 for training.
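
A minimal sketch of that LOOCV loop in Python, for illustration only. The
data here is synthetic (random numbers standing in for the real 25x6 array),
and the 1-nearest-neighbour classifier and the function names are
assumptions made for the example, not the classifiers I actually ran.

```python
import random

def nearest_neighbour_predict(train_X, train_y, x):
    """Return the label of the training row closest to x (squared Euclidean distance)."""
    best_label, best_dist = None, float("inf")
    for row, label in zip(train_X, train_y):
        dist = sum((a - b) ** 2 for a, b in zip(row, x))
        if dist < best_dist:
            best_dist, best_label = dist, label
    return best_label

def loocv_accuracy(X, y):
    """Hold out each sample once, train on the remaining ones, return the fraction correct."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]   # the other len(X) - 1 rows
        train_y = y[:i] + y[i + 1:]
        if nearest_neighbour_predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)

# Synthetic stand-in for the 25x6 array: 20 "cancer" rows and 5 "no cancer" rows.
rng = random.Random(0)
X = [[rng.gauss(1.0, 1.0) for _ in range(6)] for _ in range(20)] + \
    [[rng.gauss(-1.0, 1.0) for _ in range(6)] for _ in range(5)]
y = [1] * 20 + [0] * 5

accuracy = loocv_accuracy(X, y)
print(f"LOOCV accuracy over {len(X)} folds: {accuracy:.2f}")
```

With 25 folds the accuracy can only move in steps of 1/25 = 0.04, and with a
20/5 split a classifier that always answers "cancer" already scores 0.80, so
the raw accuracy should be read against that baseline.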

I hope that gives you the picture?
 


Greg Heath <heath@alumni.brown.edu> wrote in message
<9b4c2a53-7f64-42a4-a546-5a8e0f9e2cb9@k13g2000hse.googlegroups.com>...
> On May 1, 7:22 am, Greg Heath <he...@alumni.brown.edu> wrote:
> > On May 1, 6:30 am, "giannis " <fanzi...@yahoo.co.uk> wrote:
> >
> > > Hello.
> >
> > > I am doing statistical research using KNN, neural nets and
> > > SVM. The problem is the very small data set (25 specimens).
> >
> > > I am using cross validation to resample the data but I am
> > > not sure if my results can be accurate with such a small
> > > data set.
> >
> > > Can you please suggest a method to make the best possible
> > > use of such a small data set?
> > > Thank you in advance.
> >
> > Bootstrapping
> >
> > Search the mathworks website.
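
For reference, a minimal sketch of the basic bootstrap idea in Python:
resample the 25 observations with replacement many times and look at how a
statistic varies across the resamples. The data is synthetic, the statistic
(the mean) and the percentile interval are choices made for this
illustration, and `bootstrap_means` is a hypothetical helper name.

```python
import random
import statistics

def bootstrap_means(sample, n_resamples, rng):
    """Draw n_resamples resamples (with replacement, same size as sample) and return their means."""
    n = len(sample)
    return [statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)]

rng = random.Random(0)
# Synthetic one-dimensional stand-in for 25 observations.
sample = [rng.gauss(5.0, 2.0) for _ in range(25)]

means = sorted(bootstrap_means(sample, 2000, rng))
# Percentile interval: cut off 2.5% of the resampled means at each end.
lower = means[int(0.025 * len(means))]
upper = means[int(0.975 * len(means)) - 1]

print(f"sample mean: {statistics.fmean(sample):.3f}")
print(f"95% percentile bootstrap interval: [{lower:.3f}, {upper:.3f}]")
```

The same loop could wrap a classifier's error rate instead of the mean; the
resampling step itself does not change.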
> 
> If you have prior information on the form of the probability
> distribution function, you can use the 25 observations to
> estimate the parameters and then generate more "data".
> The danger is that, even in one dimension, 25 observations
> will not give you precise parameter estimates.
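
A rough sketch of that idea in Python, assuming (purely for illustration)
that the data is one-dimensional and normally distributed. It fits the two
parameters from 25 observations, generates extra "data" from the fit, and
then repeats the fit on fresh samples of 25 to show how much the estimates
move. `TRUE_MEAN`, `TRUE_SD` and `fit_normal` are hypothetical names.

```python
import random
import statistics

rng = random.Random(1)
TRUE_MEAN, TRUE_SD = 0.0, 1.0   # hypothetical "true" distribution, unknown in practice

def fit_normal(observations):
    """Estimate the mean and standard deviation of a normal distribution from data."""
    return statistics.fmean(observations), statistics.stdev(observations)

# Step 1: estimate the parameters from 25 observations.
observations = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(25)]
mu_hat, sd_hat = fit_normal(observations)

# Step 2: generate more "data" from the fitted distribution.
synthetic = [rng.gauss(mu_hat, sd_hat) for _ in range(1000)]

# Step 3: repeat the fit on 200 fresh samples of 25 to see the spread of the estimates.
estimates = [fit_normal([rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(25)])
             for _ in range(200)]
mean_spread = max(m for m, _ in estimates) - min(m for m, _ in estimates)

print(f"fitted mean {mu_hat:.3f}, fitted sd {sd_hat:.3f}")
print(f"range of fitted means over 200 repeats: {mean_spread:.3f}")
```

For a normal distribution the standard error of the mean with n = 25 is
sd/5, so any synthetic data inherits that uncertainty from the fit.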
> 
> If you don't have such prior information you can test
> hypotheses as to which distribution the data might be
> from. However, with only 25 observations the testing will
> be far from definitive. You may test several distributions,
> find that you can reject all except one. However, that does
> not guarantee that it will be the correct distribution.
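
To make that concrete, here is a hedged sketch in Python using the
Kolmogorov-Smirnov statistic as the test. The choice of test, the three
candidate distributions, and the large-sample 5% critical value
1.358/sqrt(n) are all assumptions made for this example; exact
small-sample tables give slightly different thresholds.

```python
import math
import random
import statistics

def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF of sample and a candidate CDF."""
    xs = sorted(sample)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Compare the candidate CDF with the empirical CDF just after and just before x.
        gap = max(gap, i / n - f, f - (i - 1) / n)
    return gap

def logistic_cdf(x, scale=math.sqrt(3) / math.pi):
    """Logistic CDF; this scale gives unit variance."""
    return 1.0 / (1.0 + math.exp(-x / scale))

def uniform_cdf(x, low=-3.0, high=3.0):
    """CDF of a uniform distribution on [low, high]."""
    return min(1.0, max(0.0, (x - low) / (high - low)))

rng = random.Random(2)
sample = [rng.gauss(0.0, 1.0) for _ in range(25)]   # synthetic observations

candidates = {
    "normal(0, 1)": statistics.NormalDist(0.0, 1.0).cdf,
    "logistic, unit variance": logistic_cdf,
    "uniform(-3, 3)": uniform_cdf,
}
critical = 1.358 / math.sqrt(len(sample))   # large-sample approximation, 5% level
results = {name: ks_statistic(sample, cdf) for name, cdf in candidates.items()}

for name, d in results.items():
    verdict = "rejected" if d > critical else "not rejected"
    print(f"{name}: D = {d:.3f} ({verdict} at threshold {critical:.3f})")
```

A candidate that is "not rejected" here has only survived one weak test on
25 points; it has not been confirmed as the source of the data.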
> 
> ...suddenly I have the feeling that the data is not
> 1-dimensional!
> 
> What are the dimensions of your input and output?
> Exactly what type of problem do you have and what
> exactly do you want the neural net to do?
> 
> Hope this helps.
> 
> Greg
>