From: <HIDDEN>
Newsgroups: comp.soft-sys.matlab
Subject: Re: k (3) Vectors of ones and zeros "combining" together
Date: Mon, 23 Jan 2012 22:01:10 +0000 (UTC)
Organization: The MathWorks, Inc.

"Matt J" wrote in message <jfjtjr$fij$>...
> Thanks, Roger, although I am a little disquieted that it takes 1e7 realizations to get a decently converged histogram. I would think that with only 150 different possible events, the number of required samples would be far less...
> ........
- - - - - - - - - - -
  It's not so surprising, Matt.  For N = 5, assuming that all 150 events have equal probability p = 1/150, the count of any one of these events in n samples follows a simple binomial distribution with mean n*p and variance n*p*(1-p).  The right quantity to consider for the histogram is the relative spread of that count, the ratio sqrt(variance)/mean, which is equal to sqrt((1-p)/(n*p)).  As n increases, this ratio drops only as the reciprocal of the square root of n, and the small size of p makes it worse, since for small p the ratio is approximately 1/sqrt(n*p).  For larger N the number of events increases, and therefore the number of samples needed gets even larger.  Also, one needs to keep in mind that with a larger number of events in the histogram there are more opportunities for outliers to appear, even for a given value of the above ratio.
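
  To put numbers on the ratio above, here is a minimal sketch, assuming 150 equally likely events as in the argument. The variable names and the choice of sample sizes are illustrative only, and the empirical check at the end uses uniform draws from randi as a stand-in for the actual simulation discussed in this thread.

```matlab
% Relative spread of one bin count, assuming 150 equally likely events.
p = 1/150;                           % probability of any single event (assumed)
n = 10.^(3:7);                       % sample sizes to compare (illustrative)
ratio = sqrt((1-p)./(n*p));          % sqrt(variance)/mean of a binomial count
fprintf('n = %8d   ratio = %.4f\n', [n; ratio]);

% Rough empirical check at one sample size, using uniform random events
% as a stand-in for the real simulation (an assumption, not the thread's code).
nTrial = 1e5;                              % number of samples drawn
events = randi(150, nTrial, 1);            % uniform draws from 1..150
counts = accumarray(events, 1, [150 1]);   % count of each event
empirical = std(counts)/mean(counts);      % spread of the counts across bins
fprintf('empirical ratio at n = %d: %.4f (predicted %.4f)\n', ...
        nTrial, empirical, sqrt((1-p)/(nTrial*p)));
```

  The predicted ratio is about 0.39 at n = 1e3 and about 0.0039 at n = 1e7, so each factor of 100 in n buys only a factor of 10 in relative accuracy. The empirical value varies from run to run but should land close to the prediction.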

Roger Stafford