From: "Tom Lane" <tlane@mathworks.com>
Newsgroups: comp.soft-sys.matlab
Subject: Re: chi-square two sample test
Date: Wed, 31 Dec 2008 10:19:39 -0500
Organization: The MathWorks, Inc
Lines: 31
Message-ID: <gjg2ic$foi$1@fred.mathworks.com>
References: <g2ls65$p3u$1@fred.mathworks.com> <g2lvan$1a3$1@fred.mathworks.com> <g2ntgq$r5e$1@fred.mathworks.com> <g2onqt$rmv$2@fred.mathworks.com> <g2qavh$399$1@fred.mathworks.com> <g2r8jt$den$2@fred.mathworks.com> <gje1t2$lul$1@fred.mathworks.com>


> I'm solving a problem regarding finding the chi-square goodness of fit, 
> and I was wondering if I was making the same mistake.
> I have a data sample which is autoscaled (normalized) up to the 2nd order. 
> First, I find a fit for the data using normfit and normpdf functions. 
> Next, I use the first 4 moments of the sample to find a Pearson 
> estimation. Now, if I want to compare the goodness of fit for each 
> method, should I use the two-sample chi-square test to compare the pdf 
> functions? I mean, once to compare the pdf of the autoscaled sample with 
> that of the normpdf, and then pdf of the autoscaled sample with that of 
> the Pearson-generated sample?

Mastaneh, when you compare the sample to the normal distribution, that is a 
one-sample test.  You could use a chi-square test or any of several other 
tests.
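
As a rough illustration of the one-sample case, here is a minimal sketch in plain Python (standard library only, not Statistics Toolbox code). It fits a normal distribution to the sample, bins the data into equal-probability bins under that fit, and computes the chi-square statistic. The function name chi_square_normal_gof, the choice of 10 bins, and the simulated data are all assumptions made for the example. Subtracting two degrees of freedom for the estimated mean and standard deviation is the customary adjustment and is only approximate when the parameters are estimated from the raw rather than the binned data.

```python
import random
from statistics import NormalDist, fmean, stdev


def chi_square_normal_gof(sample, n_bins=10):
    """Chi-square goodness-of-fit statistic against a normal fitted to the sample."""
    n = len(sample)
    fitted = NormalDist(fmean(sample), stdev(sample))

    # Equal-probability bins: interior edges are quantiles of the fitted normal.
    edges = [fitted.inv_cdf(k / n_bins) for k in range(1, n_bins)]

    observed = [0] * n_bins
    for x in sample:
        # Bin index = number of interior edges strictly below x.
        observed[sum(x > e for e in edges)] += 1

    # Every bin has the same expected count under the fitted normal.
    expected = n / n_bins
    statistic = sum((o - expected) ** 2 / expected for o in observed)

    # Two parameters (mean, standard deviation) were estimated from the data.
    dof = n_bins - 1 - 2
    return statistic, dof, observed


# Hypothetical data standing in for the autoscaled sample.
random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(500)]
stat, dof, counts = chi_square_normal_gof(data)
print(f"chi-square = {stat:.3f} on {dof} degrees of freedom")
```

The statistic would then be referred to a chi-square distribution with the stated degrees of freedom to get a p-value.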

For the Pearson comparison, it sounds like you really want to do a 
one-sample test again, comparing the observed sample with expected values 
under the Pearson distribution.  The Statistics Toolbox, though, has a 
function for generating random Pearson values but not for computing the cdf 
of this distribution.  Is that the issue?

You may be able to poke around at the code for pearsrnd and figure out how 
to compute the cdf for some cases.  Alternatively, I suppose you could 
generate an enormous number of random values to estimate the expected bin 
proportions, then regard those proportions as fixed.  The toolbox has no 
function for comparing two finite samples via a chi-square test to see 
whether they have the same distribution.
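
The simulation idea might be sketched as follows, again in plain Python rather than toolbox code. A very large number of draws from the candidate distribution estimates the bin proportions, which are then treated as fixed expected proportions in a one-sample chi-square statistic. The gamma sampler is only a stand-in for a Pearson random number generator, and the bin edges, sample sizes, and function names are assumptions chosen for illustration.

```python
import random
from bisect import bisect_right


def bin_counts(values, edges):
    """Count values in the len(edges) + 1 bins defined by sorted interior edges."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_right(edges, v)] += 1
    return counts


def simulated_bin_proportions(sampler, edges, n_draws=100_000):
    """Estimate bin proportions from many random draws of the candidate distribution."""
    counts = bin_counts((sampler() for _ in range(n_draws)), edges)
    return [c / n_draws for c in counts]


def chi_square_statistic(observed, proportions):
    """One-sample chi-square statistic, treating the proportions as fixed.

    Assumes every bin has a nonzero simulated proportion.
    """
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, proportions))


random.seed(1)


def sampler():
    # Stand-in sampler (an assumption); a Pearson generator would go here.
    return random.gammavariate(4.0, 1.0)


edges = [2.0, 3.0, 4.0, 5.0, 7.0]  # assumed bin edges
proportions = simulated_bin_proportions(sampler, edges)

# Hypothetical observed sample, here drawn from the same stand-in distribution.
observed = bin_counts([sampler() for _ in range(300)], edges)
stat = chi_square_statistic(observed, proportions)
print(f"chi-square = {stat:.3f} with {len(edges) + 1} bins")
```

With no estimated parameters the reference distribution would have one fewer degree of freedom than the number of bins. If the Pearson parameters were estimated from the same sample, the degrees of freedom would usually be reduced accordingly.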

-- Tom