Path: news.mathworks.com!not-for-mail
From: <HIDDEN>
Newsgroups: comp.soft-sys.matlab
Subject: Re: Age distributions, distance and shape
Date: Wed, 4 Nov 2009 16:17:19 +0000 (UTC)
Organization: The MathWorks, Inc.
Lines: 14
Message-ID: <hcs9ef$bvl$1@fred.mathworks.com>
References: <hcrl56$4d3$1@fred.mathworks.com> <hcs0n2$l1s$1@fred.mathworks.com> <hcs3kb$qfv$1@fred.mathworks.com>
Reply-To: <HIDDEN>
NNTP-Posting-Host: webapp-03-blr.mathworks.com
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 8bit
X-Trace: fred.mathworks.com 1257351439 12277 172.30.248.38 (4 Nov 2009 16:17:19 GMT)
X-Complaints-To: news@mathworks.com
NNTP-Posting-Date: Wed, 4 Nov 2009 16:17:19 +0000 (UTC)
X-Newsreader: MATLAB Central Newsreader 1886545
Xref: news.mathworks.com comp.soft-sys.matlab:582431


To: Peter Perkins <Peter.Perkins@MathRemoveThisWorks.com>
I am now facing the problem of practical versus statistical significance (I read a post where you discussed it).

Since I am testing two large datasets with "kstest2" (e.g., for district "x" in year "0" the population totals 1,788,122 units, while the sample totals 693,058), I obtain a p-value of 0.

My datasets are organized as follows:
| age | # people |         
    18       20012
    19       67238
    20        etc...
Both the population and the sample datasets cover the same discrete age range [18-100]. Can I therefore supply "kstest2" with the probability of each age appearing in the dataset, instead of the full vector of replicated ages?
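
To make the question concrete, here is a minimal sketch of what I have in mind: building the two empirical CDFs directly from the count tables and taking the largest gap between them, which is the two-sample KS statistic. It does not call "kstest2" at all. The variable names and all counts other than the first two population values are made up for illustration.

```matlab
% Hypothetical count tables: one row per age. Apart from the first two
% population counts, the numbers are placeholders, not real data.
ages      = (18:22)';
popCounts = [20012; 67238; 70000; 65000; 60000];   % population, people per age
smpCounts = [ 9000; 25000; 27000; 26000; 23000];   % sample, people per age

% Empirical CDFs evaluated at each age, built from the counts alone.
popCdf = cumsum(popCounts) / sum(popCounts);
smpCdf = cumsum(smpCounts) / sum(smpCounts);

% Two-sample KS statistic: largest absolute gap between the two ECDFs.
[ksStat, idx] = max(abs(popCdf - smpCdf));
ageAtMax = ages(idx);

% The sample sizes are kept separately, because the KS p-value depends on them.
n1 = sum(popCounts);
n2 = sum(smpCounts);

fprintf('KS statistic = %.4f at age %d (n1 = %d, n2 = %d)\n', ...
        ksStat, ageAtMax, n1, n2);
```

If I understand correctly, the statistic itself only needs the relative frequencies, but the p-value also depends on the two sample sizes, so passing only the probabilities would lose that information.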

Thanks in advance.