Thread Subject: Calculating cumulative probability

Subject: Calculating cumulative probability

From: Omkar Palsule-Desai

Date: 14 Feb, 2008 16:35:04

Message: 1 of 7


Hi,

Say X and Y are two independent continuous random variables
with known distribution functions. Both X and Y range over
(-inf, +inf). I need to find the following probability:

P(X-Y <= k, Y >= 0, X > Y)

Since I have two random variables and three conditions on
their joint distribution, I cannot use traditional
probability concepts.

I guess I can calculate probability by measuring volume
under the surfaces defined by X-Y<=k, Y>=0 and X>Y. Is
this a correct procedure?

If it is, then how do I calculate the volume using matlab?
Can someone help me with this?

Thank you

Omkar

Subject: Re: Calculating cumulative probability

From: Randy Poe

Date: 14 Feb, 2008 16:42:50

Message: 2 of 7

On Feb 14, 11:35 am, "Omkar Palsule-Desai" <omkar...@iimahd.ernet.in>
wrote:
> Hi,
>
> Say X and Y are two independent continuous random variables
> with known distribution functions. Both X and Y range over
> (-inf, +inf). I need to find the following probability:
>
> P(X-Y <= k, Y >= 0, X > Y)
>
> Since I have two random variables and three conditions on
> their joint distribution, I cannot use traditional
> probability concepts.

What?

This isn't a conditional probability as you stated it.
You defined three "events", and you are asking for
the probability of (A and B and C). These events
are dependent. There is nothing in traditional probability
theory that prevents you from considering the concept
P(A and B and C).

> I guess I can calculate probability by measuring volume
> under the surfaces defined by X-Y<=k, Y>=0 and X>Y. Is
> this a correct procedure?

Yes. That covers the region of (X,Y) where (A and B and C)
is true, and the integral of the density over that region
is the probability of (A and B and C).

> If it is, then how do I calculate the volume using matlab?

Numerically or symbolically?

Draw the region defined by these three conditions on
an x-y axis (on paper). See if you can express that as
a double integral over x and y. That will give you limits
which you can use in an integration routine, either numeric
(like dblquad) or symbolic (I'm not very knowledgeable about
that).

                - Randy

Subject: Re: Calculating cumulative probability

From: Roger Stafford

Date: 14 Feb, 2008 19:32:01

Message: 3 of 7

  Since you know X and Y to be independent random variables, your desired
probability can be expressed as the iterated integral:

 P(k) = integral, 0 to inf, (integral, y to y+k, f(x)*g(y) dx) dy

where f(x) and g(y) are the respective probability density functions of X and Y.
That is: the integral from 0 to infinity of the integral from y to y+k of the
product f(x)*g(y), first with respect to x and then with respect to y. (I would
not have used the term "cumulative probability" to describe this particular
probability, however.) Unfortunately, matlab's 'dblquad' is unable to evaluate
a double integral in this form where the inner limits of integration depend on
the outer variable.
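
For readers on a newer release, here is a minimal sketch of the iterated integral exactly as written above. It assumes 'integral2' is available; that function is newer than the ones named in this thread and, unlike 'dblquad', lets the inner limits be function handles of the outer variable. The standard normal densities, the value k = 1 and the finite upper limit of 10 are illustrative assumptions, not part of the original question.

```matlab
% Assumed example: X and Y are both standard normal; k is arbitrary.
k = 1;
f = @(x) exp(-x.^2 / 2) / sqrt(2*pi);   % density of X (assumption)
g = @(y) exp(-y.^2 / 2) / sqrt(2*pi);   % density of Y (assumption)

% integral2 takes the outer variable first, so the integrand is written
% as a function of (y, x); the inner limits y and y+k are handles of y.
integrand = @(y, x) f(x) .* g(y);
yUpper = 10;                            % finite stand-in for +inf
P = integral2(integrand, 0, yUpper, @(y) y, @(y) y + k);

fprintf('P(k = %g) is approximately %.6f\n', k, P);
```

Swap in your own f and g; the only structural requirement is that both handles work element-wise on arrays.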

  I can see two possibilities for numerically calculating this using matlab.
First, if you already know the cumulative distribution of X as a function, F(x),
the above can then be directly expressed as the single integral:

 integral, 0 to inf, (F(y+k)-F(y))*g(y) dy

by direct substitution of F(y+k)-F(y) for the inner integral after factoring out
the g(y). Any of the matlab single quadrature functions can be used for this:
'quad', 'quadl', etc.
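
A minimal sketch of this single-integral form, assuming both X and Y are standard normal and k = 1 (illustrative choices only). It uses 'integral', a newer function than those named in this thread, which accepts Inf as a limit; the CDF is built from 'erfc' so that no toolbox is needed.

```matlab
% Assumed example distributions: X and Y both standard normal.
k = 1;                                     % threshold in X - Y <= k
g = @(y) exp(-y.^2 / 2) / sqrt(2*pi);      % density of Y (assumption)
F = @(x) 0.5 * erfc(-x / sqrt(2));         % CDF of X (assumption)

% The inner integral over x collapses to F(y+k) - F(y).
integrand = @(y) (F(y + k) - F(y)) .* g(y);
P = integral(integrand, 0, Inf);

fprintf('P(k = %g) is approximately %.6f\n', k, P);
```

With 'quad' or 'quadl' the same integrand can be used, but the upper limit would have to be a finite number.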

  The other possibility is to make a change of variable. Define

 u = x - y

Then the iterated integral can be converted to the form

 integral, 0 to inf, integral, 0 to k, f(u+y)*g(y) du dy

(Note that the Jacobian in this case is just 1.) This is in a form that can be
used by 'dblquad' since the inner limits are independent of y values. This is
the method you should use if you know only the two probability density
functions.
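
The change of variable can be sketched with 'dblquad' as follows. The standard normal densities, k = 1 and the finite upper limit of 10 on y are assumptions made for illustration. Note that 'dblquad' integrates over its first argument in the inner loop, so u is passed first and y second.

```matlab
% Assumed example: X and Y are both standard normal; k is arbitrary.
k = 1;
f = @(x) exp(-x.^2 / 2) / sqrt(2*pi);   % density of X (assumption)
g = @(y) exp(-y.^2 / 2) / sqrt(2*pi);   % density of Y (assumption)

% After u = x - y the region is the rectangle 0 <= u <= k, 0 <= y < inf.
% dblquad calls the integrand with a vector u and a scalar y.
h = @(u, y) f(u + y) .* g(y);
yUpper = 10;                            % finite stand-in for +inf
P = dblquad(h, 0, k, 0, yUpper);

fprintf('P(k = %g) is approximately %.6f\n', k, P);
```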

  Note that each of these forms requires a separate numerical integration for
each desired value of k. There is always the possibility that, depending on
the functions, f(x) and g(y), there may exist an analytic solution to these
integrals using the 'int' function of the Symbolic Toolbox, but the odds are
heavily against this I would think.

Roger Stafford

Subject: Re: Calculating cumulative probability

From: Roger Stafford

Date: 14 Feb, 2008 21:53:02

Message: 4 of 7

  I should have warned you that the matlab numerical quadrature functions
mentioned above ('quad', 'dblquad' and the like) cannot actually accept
infinite limits. In your problem you will have to use a finite limit that is
sufficiently large to encompass all the areas with significant probability
densities, but not so large as to confuse the routine into taking too few
samples of the integrands in the critical areas to attain the needed
accuracy. These routines don't seem to be particularly robust in this regard.
You'll have to experiment with them a bit.
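
One way to do that experimenting is to repeat the integration for several upper limits and see where the estimate settles. The sketch below assumes standard normal X and Y and k = 1, and the list of limits is arbitrary; it makes no claim about which limits will misbehave for other densities.

```matlab
% Assumed example distributions: X and Y both standard normal.
k = 1;
g = @(y) exp(-y.^2 / 2) / sqrt(2*pi);      % density of Y (assumption)
F = @(x) 0.5 * erfc(-x / sqrt(2));         % CDF of X (assumption)
integrand = @(y) (F(y + k) - F(y)) .* g(y);

limits = [2 5 10 100 1000];                % candidate stand-ins for +inf
est = zeros(size(limits));
for i = 1:numel(limits)
    est(i) = quad(integrand, 0, limits(i));
    fprintf('upper limit %6g: P is approximately %.6f\n', limits(i), est(i));
end
```

If the estimates stop changing as the limit grows, the limit is large enough; if they start to drift or collapse again for very large limits, the routine is probably under-sampling the region where the density lives.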

Roger Stafford

Subject: Re: Calculating cumulative probability

From: NZTideMan

Date: 14 Feb, 2008 22:50:14

Message: 5 of 7


Another way is to hit it with the sledge hammer: Monte Carlo
simulation.
Sample, say, a million from each distribution, then do a 2-D histogram
on the results.
Repeat this many times and use allstats (from the File Exchange) to
calculate the statistics for each bin in the histogram.
When the standard errors for each bin in the histogram reduce to an
acceptable level, you're done.
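
A minimal Monte Carlo sketch for this particular probability. It is a simplification of the procedure described above: instead of building a 2-D histogram it just counts the fraction of samples that satisfy all three conditions. Standard normal X and Y, k = 1 and the sample size are assumptions for illustration; for other distributions replace 'randn' with a suitable sampler.

```matlab
% Assumed example: X and Y are independent standard normals.
N = 1e6;                 % number of simulated (X, Y) pairs
k = 1;                   % threshold in X - Y <= k
X = randn(N, 1);
Y = randn(N, 1);

% Indicator of the event (X - Y <= k) and (Y >= 0) and (X > Y).
hit = (X - Y <= k) & (Y >= 0) & (X > Y);

P  = mean(hit);                    % estimated probability
se = sqrt(P * (1 - P) / N);        % binomial standard error of the estimate

fprintf('P is approximately %.4f (standard error %.1e)\n', P, se);
```

The standard error shrinks only as 1/sqrt(N), so each extra digit of accuracy costs about a hundred times more samples.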

Subject: Re: Calculating cumulative probability

From: Roger Stafford

Date: 15 Feb, 2008 00:49:01

Message: 6 of 7

NZTideMan <mulgor@gmail.com> wrote in message
<53ab9842-75a9-4c46-82f2-f752beba4735@c4g2000hsg.googlegroups.com>...
> Another way is to hit it with the sledge hammer: Monte Carlo
> simulation.
> Sample, say, a million from each distribution, then do a 2-D histogram
> on the results.
> Repeat this many times and use allstats (from the File Exchange) to
> calculate the statistics for each bin in the histogram.
> When the standard errors for each bin in the histogram reduce to an
> acceptable level, you're done.
------------
  Why on earth should Omkar go to all the trouble of generating pseudo-
random variables with the given distributions when their densities are already
known and moreover known to be continuous, as has been stated? The idea
of using a Monte Carlo method here sounds completely defeatist to me,
NZTideMan, and I regard it as poor advice!

  Given the stated continuity of the densities, even if an analytic method of
integration is not available, the number of samples of these densities which is
necessary to arrive at some given degree of accuracy with numerical
integration is bound to be far, far smaller than the number of Monte Carlo
trials that would achieve that same accuracy.

  Ask how many times one has to flip a coin to determine empirically the
probability of heads (assuming we don't already know it) to an accuracy of
one part in a million, and you will find that it is of the order of a trillion
tosses! Monte Carlo methods are only to be used when a statistical situation is
not sufficiently well understood for probabilities to be calculated directly.
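
The coin-toss figure can be checked with a short calculation. The sketch assumes a fair coin and reads "accuracy" as the standard error of the sample proportion, sqrt(p*(1-p)/n); both are assumptions about what was meant.

```matlab
p      = 0.5;     % true probability of heads (fair coin assumed)
target = 1e-6;    % desired standard error: one part in a million

% Solve sqrt(p*(1-p)/n) = target for the number of tosses n.
n = p * (1 - p) / target^2;
fprintf('tosses for 1e-6 accuracy: about %.3g\n', n);

% The same formula with a looser target of one part in a thousand.
n_loose = p * (1 - p) / (1e-3)^2;
fprintf('tosses for 1e-3 accuracy: about %.3g\n', n_loose);
```

Under these assumptions n comes out at a few hundred billion tosses, which is within an order of magnitude of the trillion quoted above; asking for a tighter confidence level than one standard error pushes it higher still.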

Roger Stafford

Subject: Re: Calculating cumulative probability

From: NZTideMan

Date: 15 Feb, 2008 03:18:25

Message: 7 of 7


Well, for engineering purposes, one part in a thousand is probably
more than enough, but in any case I have this wonderful machine called
a computer, with wonderful software called Matlab that simply LOVES
Monte Carlo simulations. And for an engineer like me, it's so
intuitive. But I concede that for known continuous PDFs, it's
overkill. That's why I called it a sledge-hammer approach.
Nevertheless, it is an alternative.
