<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/170718</link>
    <title>MATLAB Central Newsreader - chi-square two sample test</title>
    <description>Feed for thread: chi-square two sample test</description>
    <language>en-us</language>
    <copyright>&#169; 1994-2012 by MathWorks, Inc.</copyright>
    <webmaster>webmaster@mathworks.com</webmaster>
    <generator>MATLAB Central Newsreader</generator>
    <docs>http://blogs.law.harvard.edu/tech/rss</docs>
    <ttl>60</ttl>
    <image>
      <title>MathWorks</title>
      <url>http://www.mathworks.com/images/membrane_icon.gif</url>
    </image>
    <item>
      <pubDate>Tue, 10 Jun 2008 12:31:01 -0400</pubDate>
      <title>chi-square two sample test</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/170718#436684</link>
      <author>wang wang</author>
      <description>I have a data set, and I estimate the parameters of a &lt;br&gt;
specified distribution, say the lognormal distribution, from &lt;br&gt;
that data set. Now I want to test the goodness of fit of &lt;br&gt;
that distribution to the data (note in particular that the &lt;br&gt;
parameters of the distribution are estimated from the data &lt;br&gt;
set). Can anyone tell me whether the chi-square two-sample &lt;br&gt;
test is adequate for that? And is there an &lt;br&gt;
existing function that does this? I know that there &lt;br&gt;
is a function named 'chi2gof' that can do the chi-square &lt;br&gt;
goodness-of-fit test, but this function only tests whether &lt;br&gt;
the data come from a normal distribution.</description>
    </item>
    <item>
      <pubDate>Tue, 10 Jun 2008 13:24:39 -0400</pubDate>
      <title>Re: chi-square two sample test</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/170718#436696</link>
      <author>Peter Perkins</author>
      <description>wang wang wrote:&lt;br&gt;
&amp;gt; I have a data set, and I estimate the parameters of a &lt;br&gt;
&amp;gt; specified distribution, say the lognormal distribution, from &lt;br&gt;
&amp;gt; that data set. Now I want to test the goodness of fit of &lt;br&gt;
&amp;gt; that distribution to the data (note in particular that the &lt;br&gt;
&amp;gt; parameters of the distribution are estimated from the data &lt;br&gt;
&amp;gt; set). Can anyone tell me whether the chi-square two-sample &lt;br&gt;
&amp;gt; test is adequate for that?&lt;br&gt;
&lt;br&gt;
Is there a reason why you want to use a two-sample test?  Your description &lt;br&gt;
sounds like you want the usual chi-squared test against a parametric &lt;br&gt;
distribution for which you have estimated the parameters.&lt;br&gt;
&lt;br&gt;
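As a rough illustration of that usual one-sample test, here is a minimal sketch in Python (not MATLAB, and not the chi2gof implementation). It assumes a lognormal model fitted by maximum likelihood, ten equal-probability bins, and synthetic data; the names chi2_sf and chi2_gof are made up for this example. The degrees of freedom are reduced by the two estimated parameters, following the usual convention.&lt;br&gt;
&lt;pre&gt;
```python
import bisect
import math
import random
import statistics

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    y = 0.5 * x
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # Recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    for _ in range((df - 1) // 2):
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

def chi2_gof(data, cdf, edges, n_params):
    """One-sample chi-square test; edges are interior bin boundaries."""
    n = len(data)
    observed = [0] * (len(edges) + 1)
    for x in data:
        observed[bisect.bisect_right(edges, x)] += 1
    cum = [0.0] + [cdf(e) for e in edges] + [1.0]
    expected = [n * (cum[i + 1] - cum[i]) for i in range(len(observed))]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1 - n_params   # one df lost per estimated parameter
    return stat, df, chi2_sf(stat, df)

# Synthetic stand-in for the real data set (an assumption of this sketch).
rng = random.Random(0)
data = [rng.lognormvariate(0.0, 0.5) for _ in range(500)]

# Fit the lognormal on the log scale: mean and (population) standard deviation.
logs = [math.log(x) for x in data]
fitted = statistics.NormalDist(statistics.fmean(logs), statistics.pstdev(logs))

# Equal-probability bins under the fitted model keep expected counts away from zero.
n_bins = 10
edges = [math.exp(fitted.inv_cdf(k / n_bins)) for k in range(1, n_bins)]

stat, df, p = chi2_gof(data, lambda x: fitted.cdf(math.log(x)), edges, n_params=2)
print(stat, df, p)
```
&lt;/pre&gt;
Only the fitted cumulative distribution function enters the test, so any other model can be substituted by passing a different cdf and n_params.&lt;br&gt;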
&lt;br&gt;
&lt;br&gt;
&amp;gt; And is there an &lt;br&gt;
&amp;gt; existing function that does this?  I know that there &lt;br&gt;
&amp;gt; is a function named 'chi2gof' that can do the chi-square &lt;br&gt;
&amp;gt; goodness-of-fit test, but this function only tests whether &lt;br&gt;
&amp;gt; the data come from a normal distribution. &lt;br&gt;
&lt;br&gt;
That's not correct.  CHI2GOF allows you to specify any distribution you want, in &lt;br&gt;
a couple of different ways.</description>
    </item>
    <item>
      <pubDate>Wed, 11 Jun 2008 07:06:02 -0400</pubDate>
      <title>Re: chi-square two sample test</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/170718#436863</link>
      <author>wang wang</author>
      <description>Peter Perkins &amp;lt;Peter.PerkinsRemoveThis@mathworks.com&amp;gt; &lt;br&gt;
wrote in message &amp;lt;g2lvan$1a3$1@fred.mathworks.com&amp;gt;...&lt;br&gt;
&amp;gt; wang wang wrote:&lt;br&gt;
&amp;gt; &amp;gt; I have a data set, and I estimate the parameters of a &lt;br&gt;
&amp;gt; &amp;gt; specified distribution, say the lognormal distribution, from &lt;br&gt;
&amp;gt; &amp;gt; that data set. Now I want to test the goodness of fit of &lt;br&gt;
&amp;gt; &amp;gt; that distribution to the data (note in particular that the &lt;br&gt;
&amp;gt; &amp;gt; parameters of the distribution are estimated from the data &lt;br&gt;
&amp;gt; &amp;gt; set). Can anyone tell me whether the chi-square two-sample &lt;br&gt;
&amp;gt; &amp;gt; test is adequate for that?&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; Is there a reason why you want to use a two-sample test?  Your description &lt;br&gt;
&amp;gt; sounds like you want the usual chi-squared test against a parametric &lt;br&gt;
&amp;gt; distribution for which you have estimated the parameters.&lt;br&gt;
&amp;gt; &lt;br&gt;
Peter, thanks for your reply. The parametric distribution &lt;br&gt;
that I used to fit the data set does not have an &lt;br&gt;
analytical expression. Only its characteristic function is &lt;br&gt;
known, so I thought I might be able to use the two-sample test. I &lt;br&gt;
do not know whether this is correct; please tell me if &lt;br&gt;
it is wrong.&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; &amp;gt; And is there an &lt;br&gt;
&amp;gt; &amp;gt; existing function that does this?  I know that there &lt;br&gt;
&amp;gt; &amp;gt; is a function named 'chi2gof' that can do the chi-square &lt;br&gt;
&amp;gt; &amp;gt; goodness-of-fit test, but this function only tests whether &lt;br&gt;
&amp;gt; &amp;gt; the data come from a normal distribution. &lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; That's not correct.  CHI2GOF allows you to specify any distribution you want, in &lt;br&gt;
&amp;gt; a couple of different ways.&lt;br&gt;
&lt;br&gt;
Yes, I made a mistake about this.</description>
    </item>
    <item>
      <pubDate>Wed, 11 Jun 2008 14:35:09 -0400</pubDate>
      <title>Re: chi-square two sample test</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/170718#436940</link>
      <author>Peter Perkins</author>
      <description>wang wang wrote:&lt;br&gt;
&lt;br&gt;
&amp;gt; Peter, thanks for your reply. The parametric distribution &lt;br&gt;
&amp;gt; that I used to fit the data set does not have an &lt;br&gt;
&amp;gt; analytical expression. Only its characteristic function is &lt;br&gt;
&amp;gt; known, so I thought I might be able to use the two-sample test. I &lt;br&gt;
&amp;gt; do not know whether this is correct; please tell me if &lt;br&gt;
&amp;gt; it is wrong.&lt;br&gt;
&lt;br&gt;
My question would be, &quot;what's your second sample?&quot;</description>
    </item>
    <item>
      <pubDate>Thu, 12 Jun 2008 05:08:01 -0400</pubDate>
      <title>Re: chi-square two sample test</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/170718#437067</link>
      <author>wang wang</author>
      <description>Peter Perkins &amp;lt;Peter.PerkinsRemoveThis@mathworks.com&amp;gt; &lt;br&gt;
wrote in message &amp;lt;g2onqt$rmv$2@fred.mathworks.com&amp;gt;...&lt;br&gt;
&amp;gt; wang wang wrote:&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; &amp;gt; Peter, thanks for your reply. The parametric distribution &lt;br&gt;
&amp;gt; &amp;gt; that I used to fit the data set does not have an &lt;br&gt;
&amp;gt; &amp;gt; analytical expression. Only its characteristic function is &lt;br&gt;
&amp;gt; &amp;gt; known, so I thought I might be able to use the two-sample test. I &lt;br&gt;
&amp;gt; &amp;gt; do not know whether this is correct; please tell me if &lt;br&gt;
&amp;gt; &amp;gt; it is wrong.&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; My question would be, &quot;what's your second sample?&quot;&lt;br&gt;
&lt;br&gt;
Thank you, Peter. The second sample can be generated from &lt;br&gt;
the specific distribution whose parameters are estimated &lt;br&gt;
from the data set. This sample can be regarded as coming from the &lt;br&gt;
specific distribution, so maybe the chi-square two-sample &lt;br&gt;
test can be used to test whether the generated sample and &lt;br&gt;
the original data set come from a common distribution.</description>
    </item>
    <item>
      <pubDate>Thu, 12 Jun 2008 13:33:49 -0400</pubDate>
      <title>Re: chi-square two sample test</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/170718#437134</link>
      <author>Peter Perkins</author>
      <description>wang wang wrote:&lt;br&gt;
&lt;br&gt;
&amp;gt; Thank you, Peter. The second sample can be generated from &lt;br&gt;
&amp;gt; the specific distribution whose parameters are estimated &lt;br&gt;
&amp;gt; from the data set. This sample can be regarded as coming from the &lt;br&gt;
&amp;gt; specific distribution, so maybe the chi-square two-sample &lt;br&gt;
&amp;gt; test can be used to test whether the generated sample and &lt;br&gt;
&amp;gt; the original data set come from a common distribution.&lt;br&gt;
&lt;br&gt;
I don't know what to tell you.  Your description of your context is exactly what &lt;br&gt;
a one-sample chi-squared test is for.  I don't know why you would want to &lt;br&gt;
artificially introduce another sample.&lt;br&gt;
&lt;br&gt;
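For a distribution known only through its characteristic function, cumulative probabilities can still be approximated numerically, and those are what a one-sample test consumes. The following is a minimal sketch in Python (not MATLAB) of the Gil-Pelaez inversion formula, using the standard normal as a stand-in because its answer is known. The function name cdf_from_cf, the truncation point t_max, and the step count are assumptions chosen for a rapidly decaying characteristic function; a slowly decaying one would need more care.&lt;br&gt;
&lt;pre&gt;
```python
import cmath
import math

def cdf_from_cf(cf, x, t_max=40.0, steps=4000):
    """Approximate F(x) from a characteristic function cf(t).

    Gil-Pelaez inversion:
        F(x) = 1/2 - (1/pi) * integral from 0 to infinity of
               Im[exp(-i t x) * cf(t)] / t dt,
    truncated at t_max and evaluated with the midpoint rule.
    """
    h = t_max / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h   # midpoints never evaluate the integrand at t = 0
        total += (cmath.exp(-1j * t * x) * cf(t)).imag / t
    return 0.5 - total * h / math.pi

# Stand-in model: the standard normal has characteristic function exp(-t^2 / 2).
normal_cf = lambda t: math.exp(-0.5 * t * t)

print(cdf_from_cf(normal_cf, 0.0))   # 0.5 by symmetry
print(cdf_from_cf(normal_cf, 1.0))   # close to 0.8413
```
&lt;/pre&gt;
Differences of such values at the bin edges give the expected bin probabilities for the test.&lt;br&gt;
&lt;br&gt;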
How your distribution is defined makes no difference at all.  You are fitting a &lt;br&gt;
distribution to data by estimating its parameters.  If you can compute &lt;br&gt;
cumulative probabilities from that fitted distribution, then that's all you need &lt;br&gt;
to use chi2gof.  If your problem is that you can't compute cumulative &lt;br&gt;
probabilities, then I wonder how useful your model will actually be from a &lt;br&gt;
predictive point of view.</description>
    </item>
    <item>
      <pubDate>Tue, 30 Dec 2008 20:56:02 -0500</pubDate>
      <title>Re: chi-square two sample test</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/170718#619321</link>
      <author>Mastaneh </author>
      <description>Peter Perkins &amp;lt;Peter.PerkinsRemoveThis@mathworks.com&amp;gt; wrote in message &lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; I don't know what to tell you.  Your description of your context is exactly what &lt;br&gt;
&amp;gt; a one-sample chi-squared test is for.  I don't know why you would want to &lt;br&gt;
&amp;gt; artificially introduce another sample.&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; How your distribution is defined makes no difference at all.  You are fitting a &lt;br&gt;
&amp;gt; distribution to data by estimating its parameters.  If you can compute &lt;br&gt;
&amp;gt; cumulative probabilities from that fitted distribution, then that's all you need &lt;br&gt;
&amp;gt; to use chi2gof.  If your problem is that you can't compute cumulative &lt;br&gt;
&amp;gt; probabilities, then I wonder how useful your model will actually be from a &lt;br&gt;
&amp;gt; predictive point of view.&lt;br&gt;
&lt;br&gt;
Hi,&lt;br&gt;
I'm working on a chi-square goodness-of-fit problem, and I was wondering whether I am making the same mistake. &lt;br&gt;
I have a data sample that is autoscaled (normalized) up to the 2nd order. First, I fit the data using the normfit and normpdf functions. Next, I use the first 4 moments of the sample to find a Pearson estimate. Now, if I want to compare the goodness of fit of each method, should I use the two-sample chi-square test to compare the pdfs? That is, once to compare the pdf of the autoscaled sample with the normpdf fit, and then the pdf of the autoscaled sample with that of the Pearson-generated sample? &lt;br&gt;
&lt;br&gt;
Thanks,&lt;br&gt;
Mastaneh</description>
    </item>
    <item>
      <pubDate>Wed, 31 Dec 2008 15:19:39 -0500</pubDate>
      <title>Re: chi-square two sample test</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/170718#619395</link>
      <author>Tom Lane</author>
      <description>&amp;gt; I'm solving a problem regarding finding the chi-square goodness of fit, &lt;br&gt;
&amp;gt; and I was wondering if I was making the same mistake.&lt;br&gt;
&amp;gt; I have a data sample which is autoscaled (normalized) up to the 2nd order. &lt;br&gt;
&amp;gt; First, I find a fit for the data using normfit and normpdf functions. &lt;br&gt;
&amp;gt; Next, I use the first 4 moments of the sample to find a Pearson &lt;br&gt;
&amp;gt; estimation. Now, if I want to compare the goodness of fit for each &lt;br&gt;
&amp;gt; method, should I use the two-sample Chi-square test to compare the pdf &lt;br&gt;
&amp;gt; functions? I mean, once to compare the pdf of the autoscaled sample with &lt;br&gt;
&amp;gt; that of the normpdf, and then pdf of the autoscaled sample with that of &lt;br&gt;
&amp;gt; the Pearson-generated sample?&lt;br&gt;
&lt;br&gt;
Mastaneh, when you compare the sample to the normal distribution, that is a &lt;br&gt;
one-sample test.  You could use a chi-square test or any of several other &lt;br&gt;
tests.&lt;br&gt;
&lt;br&gt;
For the Pearson comparison, it sounds like you really want to do a &lt;br&gt;
one-sample test again, comparing the observed sample with expected values &lt;br&gt;
under the Pearson distribution.  The Statistics Toolbox, though, has a &lt;br&gt;
function for generating random Pearson values but not for computing the cdf &lt;br&gt;
of this distribution.  Is that the issue?&lt;br&gt;
&lt;br&gt;
You may be able to poke around at the code for pearsrnd and figure out how &lt;br&gt;
to compute the cdf for some cases.  Alternatively I suppose you could &lt;br&gt;
generate an enormous number of random values to estimate the expected bin &lt;br&gt;
proportions, then regard them as fixed.  There is not a function in the &lt;br&gt;
toolbox for comparing two finite samples to see if they have the same &lt;br&gt;
distribution via a chi-square test.&lt;br&gt;
&lt;br&gt;
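The simulation idea above can be sketched as follows, in Python rather than MATLAB. The generator passed as draw is a hypothetical stand-in for a Pearson random-number generator such as pearsrnd (here simply a standard normal), and the function names, bin edges, and simulation size are assumptions made for illustration.&lt;br&gt;
&lt;pre&gt;
```python
import bisect
import random

def bin_counts(values, edges):
    """Count values per bin; edges are interior boundaries, outer bins are open."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect.bisect_right(edges, v)] += 1
    return counts

def simulated_gof_statistic(data, draw, edges, n_sim=200_000):
    """Chi-square statistic with expected counts estimated by simulation.

    Assumes every bin receives at least one simulated draw.
    """
    simulated = bin_counts([draw() for _ in range(n_sim)], edges)
    observed = bin_counts(data, edges)
    # Simulated proportions are treated as fixed expected proportions.
    expected = [len(data) * c / n_sim for c in simulated]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(300)]   # synthetic observed sample
edges = [-1.5, -0.5, 0.5, 1.5]
stat = simulated_gof_statistic(data, lambda: rng.gauss(0.0, 1.0), edges)
print(stat)
```
&lt;/pre&gt;
The statistic would then be referred to a chi-square distribution whose degrees of freedom are the number of bins minus one, minus the number of parameters estimated from the data.&lt;br&gt;
&lt;br&gt;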
-- Tom </description>
    </item>
    <item>
      <pubDate>Sat, 03 Jan 2009 02:29:01 -0500</pubDate>
      <title>Re: chi-square two sample test</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/170718#619646</link>
      <author>Mastaneh </author>
      <description>&quot;Tom Lane&quot; &amp;lt;tlane@mathworks.com&amp;gt; wrote in message &amp;lt;gjg2ic$foi$1@fred.mathworks.com&amp;gt;...&lt;br&gt;
&amp;gt; &amp;gt; I'm solving a problem regarding finding the chi-square goodness of fit, &lt;br&gt;
&amp;gt; &amp;gt; and I was wondering if I was making the same mistake.&lt;br&gt;
&amp;gt; &amp;gt; I have a data sample which is autoscaled (normalized) up to the 2nd order. &lt;br&gt;
&amp;gt; &amp;gt; First, I find a fit for the data using normfit and normpdf functions. &lt;br&gt;
&amp;gt; &amp;gt; Next, I use the first 4 moments of the sample to find a Pearson &lt;br&gt;
&amp;gt; &amp;gt; estimation. Now, if I want to compare the goodness of fit for each &lt;br&gt;
&amp;gt; &amp;gt; method, should I use the two-sample Chi-square test to compare the pdf &lt;br&gt;
&amp;gt; &amp;gt; functions? I mean, once to compare the pdf of the autoscaled sample with &lt;br&gt;
&amp;gt; &amp;gt; that of the normpdf, and then pdf of the autoscaled sample with that of &lt;br&gt;
&amp;gt; &amp;gt; the Pearson-generated sample?&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; Mastaneh, when you compare the sample to the normal distribution, that is a &lt;br&gt;
&amp;gt; one-sample test.  You could use a chi-square test or any of several other &lt;br&gt;
&amp;gt; tests.&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; For the Pearson comparison, it sounds like you really want to do a &lt;br&gt;
&amp;gt; one-sample test again, comparing the observed sample with expected values &lt;br&gt;
&amp;gt; under the Pearson distribution.  The Statistics Toolbox, though, has a &lt;br&gt;
&amp;gt; function for generating random Pearson values but not for computing the cdf &lt;br&gt;
&amp;gt; of this distribution.  Is that the issue?&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; You may be able to poke around at the code for pearsrnd and figure out how &lt;br&gt;
&amp;gt; to compute the cdf for some cases.  Alternatively I suppose you could &lt;br&gt;
&amp;gt; generate an enormous number of random values to estimate the expected bin &lt;br&gt;
&amp;gt; proportions, then regard them as fixed.  There is not a function in the &lt;br&gt;
&amp;gt; toolbox for comparing two finite samples to see if they have the same &lt;br&gt;
&amp;gt; distribution via a chi-square test.&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; -- Tom &lt;br&gt;
&amp;gt; &lt;br&gt;
&lt;br&gt;
Thanks Tom. &lt;br&gt;
Yes, with the Pearson function from the Statistics Toolbox I could only generate random values. It's not possible to find the pdf or cdf of that distribution directly, so I used the histogram function to find the frequency counts and bin locations of both samples (my data and the Pearson sample). Then I used the algorithm in &lt;a href=&quot;http://www.itl.nist.gov/div898/software/dataplot/refman1/auxillar/chi2samp.htm&quot;&gt;http://www.itl.nist.gov/div898/software/dataplot/refman1/auxillar/chi2samp.htm&lt;/a&gt; to calculate the test statistic. The page says that the two-sample test is based on binning the data, so I thought it is what I should use anyway. Does that sound right? &lt;br&gt;
I suppose it should be similar to your idea about having fixed bin proportions. Am I right? &lt;br&gt;
&lt;br&gt;
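For reference, the binned two-sample statistic can be sketched as below, in Python rather than MATLAB. This assumes the convention described on the NIST page linked above: counts R and S over common bins, scale factors K1 = sqrt(sum S / sum R) and K2 = sqrt(sum R / sum S), and degrees of freedom equal to the number of non-empty bins, less one when the two sample sizes are equal. The function name is made up for this example.&lt;br&gt;
&lt;pre&gt;
```python
import math

def chi2_two_sample(r, s):
    """Two-sample chi-square statistic for counts r and s over the same bins."""
    n_r, n_s = sum(r), sum(s)
    k1 = math.sqrt(n_s / n_r)
    k2 = math.sqrt(n_r / n_s)
    stat, used = 0.0, 0
    for ri, si in zip(r, s):
        if ri + si > 0:   # bins that are empty in both samples are skipped
            stat += (k1 * ri - k2 * si) ** 2 / (ri + si)
            used += 1
    df = used - (1 if n_r == n_s else 0)
    return stat, df

print(chi2_two_sample([10, 20], [20, 10]))
print(chi2_two_sample([10, 10, 0], [20, 20, 0]))
```
&lt;/pre&gt;
With unequal sample sizes the scale factors put both sets of counts on a common footing, so identical bin proportions give a statistic of zero.&lt;br&gt;
&lt;br&gt;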
Thanks once again,&lt;br&gt;
Mastaneh </description>
    </item>
    <item>
      <pubDate>Mon, 05 Jan 2009 15:30:00 -0500</pubDate>
      <title>Re: chi-square two sample test</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/170718#619910</link>
      <author>Tom Lane</author>
      <description>&amp;gt; Yes, using the Pearson function from the Statistics Toolbox I could only &lt;br&gt;
&amp;gt; generate the random distribution. It's not possible to find the pdf or cdf &lt;br&gt;
&amp;gt; of that distribution directly so I used the histogram function to find the &lt;br&gt;
&amp;gt; frequency counts and bin locations of both samples (my data and pearson). &lt;br&gt;
&amp;gt; Then I used the algorithm in &lt;br&gt;
&amp;gt; &lt;a href=&quot;http://www.itl.nist.gov/div898/software/dataplot/refman1/auxillar/chi2samp.htm&quot;&gt;http://www.itl.nist.gov/div898/software/dataplot/refman1/auxillar/chi2samp.htm&lt;/a&gt; &lt;br&gt;
&amp;gt; to calculate the test statistics. Here it says that the 2-samples test is &lt;br&gt;
&amp;gt; based on the binning of data so I thought it's what I should use anyway. &lt;br&gt;
&amp;gt; Does it sound right?&lt;br&gt;
&amp;gt; I suppose it should be similar to your idea about having fixed bin &lt;br&gt;
&amp;gt; proportions. Am I right?&lt;br&gt;
&lt;br&gt;
Mastaneh, sure, you could use a two-sample test if you want.  It's &lt;br&gt;
introducing extra variability (from the Pearson random sample), so it would &lt;br&gt;
probably be less sensitive than a one-sample test, but I suppose it would be &lt;br&gt;
valid.  You won't be able to use the chi2gof function, though -- that is for &lt;br&gt;
one-sample tests.&lt;br&gt;
&lt;br&gt;
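The extra variability can be seen in a small experiment, sketched here in Python rather than MATLAB: the observed data are held fixed, and only the simulated reference sample changes from run to run, yet the two-sample statistic changes with it. The standard normal generator, sample sizes, bin edges, and function names are all assumptions made for illustration.&lt;br&gt;
&lt;pre&gt;
```python
import bisect
import math
import random

def bin_counts(values, edges):
    """Count values per bin; edges are interior boundaries."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect.bisect_right(edges, v)] += 1
    return counts

def two_sample_stat(r, s):
    """Binned two-sample chi-square statistic for counts r and s."""
    k1 = math.sqrt(sum(s) / sum(r))
    k2 = math.sqrt(sum(r) / sum(s))
    return sum((k1 * a - k2 * b) ** 2 / (a + b) for a, b in zip(r, s) if a + b > 0)

edges = [-1.5, -0.5, 0.5, 1.5]
data_rng = random.Random(7)
observed = bin_counts([data_rng.gauss(0.0, 1.0) for _ in range(300)], edges)

# Same observed data every time; only the reference sample is redrawn.
stats = []
for seed in range(5):
    ref_rng = random.Random(seed)
    reference = bin_counts([ref_rng.gauss(0.0, 1.0) for _ in range(300)], edges)
    stats.append(two_sample_stat(observed, reference))
print(stats)
```
&lt;/pre&gt;
A one-sample statistic computed from fixed expected counts would give a single value for the same data.&lt;br&gt;
&lt;br&gt;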
-- Tom </description>
    </item>
  </channel>
</rss>

