The geometric distribution models the number of failures before one success in a series of independent trials, where each trial results in either success or failure, and the probability of success in any individual trial is constant. For example, if you toss a coin, the geometric distribution models the number of tails observed before getting a heads. The geometric distribution is discrete, existing only on the nonnegative integers.

The geometric distribution uses the following parameter.

Parameter | Description |
---|---|
$$0\le p\le 1$$ | Probability of success |

The probability distribution function (pdf) of the geometric distribution is

$$y=f(x|p)=p(1-p)^{x}\,;\quad x=0,1,2,\dots,$$

where *p* is the probability of success, and *x* is
the number of failures before the first success. The result *y* is
the probability of observing exactly *x* failures before
the first success, when the probability of success in any given trial is *p*.
For discrete distributions, the probability distribution function
is also known as the probability mass function (pmf).
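As a minimal sketch, the pmf formula can be evaluated directly in base MATLAB, without any toolbox function. The value `p = 0.25` and the variable names `y` and `total` are illustrative choices, and truncating the infinite sum at 1000 terms is an assumption made only so the sum is computable.

```matlab
% Evaluate the geometric pmf directly from its formula.
p = 0.25;                 % probability of success (illustrative value)
x = 0:10;                 % number of failures before the first success
y = p .* (1 - p).^x;      % pmf values: p*(1-p)^x

% The pmf summed over all nonnegative integers equals 1, so a long
% finite sum should be very close to 1.
total = sum(p .* (1 - p).^(0:1000));
```

The first element of `y` is the probability of zero failures, which equals *p* itself.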

This plot shows how changing the value of the probability parameter *p* alters
the shape of the pdf. Use `geopdf` to compute the pdf at *x* equal to 1 through
10, for three different values of *p*. Then plot
all three pdfs on the same figure for a visual comparison.

    x = [1:10];
    y1 = geopdf(x,0.1);   % For p = 0.1
    y2 = geopdf(x,0.25);  % For p = 0.25
    y3 = geopdf(x,0.75);  % For p = 0.75
    figure;
    plot(x,y1,'kd')
    hold on
    plot(x,y2,'ro')
    plot(x,y3,'b+')
    legend({'p = 0.1','p = 0.25','p = 0.75'})
    hold off

In this plot, the value of *y* is the probability
of observing exactly *x* failures before the first success.
When the probability of success *p* is large, *y* decreases
rapidly as *x* increases, and the probability of
observing a large number of failures before a success quickly becomes
small. But when the probability of success *p* is
small, *y* decreases slowly as *x* increases.
The probability of observing a large number of failures before a success
still decreases as *x* increases, but at a much slower
rate.
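One way to see why the rate of decrease depends on *p* is to compare consecutive values of the pdf. This short worked step uses only the pdf formula and assumes $$0<p<1$$:

$$\frac{f(x+1|p)}{f(x|p)}=\frac{p(1-p)^{x+1}}{p(1-p)^{x}}=1-p.$$

Each additional failure multiplies the probability by $$1-p$$: by 0.25 when *p* is 0.75, but by 0.9 when *p* is 0.1.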

A random number generated from a geometric distribution represents
the number of failures observed before a success in a single experiment,
given the probability of success *p* for each independent
trial. Use `geornd` to generate
random numbers from the geometric distribution. For example, the following
generates a random number from a geometric distribution with probability
of success *p* equal to 0.1.

    p = 0.1;
    r = geornd(p)

    r = 1

The returned random number represents the number of failures observed before a success in a series of independent trials. Because the value is random, the result can differ from run to run.
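To illustrate what such a random number represents, the following sketch simulates the trials one at a time using only the base MATLAB function `rand`. It is an illustration of the definition, not a description of how `geornd` is implemented; the variable name `failures` is an assumption introduced here.

```matlab
p = 0.1;             % probability of success in each trial
failures = 0;        % number of failures observed so far

% rand returns a uniform value in (0,1), so rand < p occurs with
% probability p (a success). Keep counting while the trial fails.
while rand >= p
    failures = failures + 1;
end

failures             % one simulated draw: failures before the first success
```

The loop stops at the first success, so `failures` is always a nonnegative integer.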

The geometric distribution is a special case of the negative
binomial distribution, with the specified number of successes
parameter *r* equal to 1.
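This reduction can be checked in one line, assuming the usual parameterization of the negative binomial pmf in which *x* counts failures before the *r*-th success (an assumption, since this section does not define that pmf):

$$f(x|r,p)=\binom{r+x-1}{x}p^{r}(1-p)^{x}\quad\Rightarrow\quad f(x|1,p)=\binom{x}{x}p(1-p)^{x}=p(1-p)^{x}.$$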

The cumulative distribution function (cdf) of the geometric distribution is

$$y=F(x|p)=1-(1-p)^{x+1}\,;\quad x=0,1,2,\dots,$$

where *p* is the probability of success, and *x* is
the number of failures before the first success. The result *y* is
the probability of observing at most *x* failures before
the first success, when the probability of success in any given trial is *p*.
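The closed form follows from summing the pdf as a finite geometric series. A short derivation, assuming $$0<p\le 1$$:

$$F(x|p)=\sum_{k=0}^{x}p(1-p)^{k}=p\cdot\frac{1-(1-p)^{x+1}}{1-(1-p)}=1-(1-p)^{x+1}.$$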

This plot shows how changing the value of the parameter *p* alters
the shape of the cdf. Use `geocdf` to compute the cdf values at *x* equal to 1 through 10,
for three different values of *p*. Then plot all
three cdfs on the same figure for a visual comparison.

    x = [1:10];
    y1 = geocdf(x,0.1);   % For p = 0.1
    y2 = geocdf(x,0.25);  % For p = 0.25
    y3 = geocdf(x,0.75);  % For p = 0.75
    figure;
    plot(x,y1,'kd')
    hold on
    plot(x,y2,'ro')
    plot(x,y3,'b+')
    legend({'p = 0.1','p = 0.25','p = 0.75'})
    hold off

In this plot, the value of *y* is the probability
of observing at most *x* failures before the first success. When
the probability of success *p* is large, *y* increases
rapidly as *x* increases. The probability of observing
a success quickly becomes very high, even for a small number of trials.
But when the probability of success *p* is small, *y* increases
slowly as *x* increases. The probability of observing
a success still increases as the number of trials increases, but at
a much slower rate.

The inverse cdf of a geometric distribution determines the value
of *x* that corresponds to a cumulative probability *y*, that is, the
number of failures *x* for which the probability of observing at most *x*
failures before the first success reaches *y*.
Use `geoinv` to compute the inverse
cdf of the geometric distribution. For example, the following returns
the smallest integer *x* such that the geometric
cdf evaluated at *x* is greater
than or equal to 0.1, when the probability of success for each independent
trial *p* is 0.03.

    y = 0.1;
    p = 0.03;
    x = geoinv(y,p)

    x = 3
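The same value can be recovered by solving the cdf formula for *x*. The sketch below is a base MATLAB cross-check, not a description of how `geoinv` works internally; the names `xClosed`, `cdfAt`, `reaches`, and `earlierFalls` are illustrative, and the closed form assumes 0 < *y* < 1 and 0 < *p* < 1 (floating-point rounding could matter when *y* lies exactly on a cdf step).

```matlab
y = 0.1;    % target cumulative probability
p = 0.03;   % probability of success in each trial

% Solve 1 - (1-p)^(x+1) >= y for the smallest integer x:
% x + 1 >= log(1-y)/log(1-p), so round up and subtract 1.
xClosed = max(0, ceil(log(1 - y) / log(1 - p)) - 1);

% Confirm against the cdf formula: the cdf at xClosed reaches y,
% and the cdf one step earlier does not.
cdfAt = @(k) 1 - (1 - p).^(k + 1);
reaches = cdfAt(xClosed) >= y;
earlierFalls = cdfAt(xClosed - 1) < y;
```

Here `log(0.9)/log(0.97)` is about 3.46, which rounds up to 4, so `xClosed` is 3.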

The mean of the geometric distribution is

$$\text{mean}=\frac{1-p}{p},$$

and the variance of the geometric distribution is

$$\mathrm{var}=\frac{1-p}{p^{2}},$$

where *p* is the probability of success.

Use `geostat` to compute
the mean and variance of a geometric distribution. For example, the
following computes the mean *m* and variance *v* of
a geometric distribution with probability parameter *p* equal
to 0.25.

    p = 0.25;
    [m,v] = geostat(p)

    m = 3
    v = 12
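As a rough numerical check of these formulas, the mean and variance can also be approximated as probability-weighted sums over a truncated range of *x*. This sketch uses base MATLAB only; the cutoff at 1000 and the names `mSum`, `vSum`, `mFormula`, and `vFormula` are assumptions chosen for illustration.

```matlab
p = 0.25;                        % probability of success
x = 0:1000;                      % truncated support (the tail is negligible here)
f = p .* (1 - p).^x;             % pmf values p*(1-p)^x

mSum = sum(x .* f);              % mean as a probability-weighted sum
vSum = sum((x - mSum).^2 .* f);  % variance as a weighted squared deviation

mFormula = (1 - p) / p;          % closed-form mean
vFormula = (1 - p) / p^2;        % closed-form variance
```

For smaller values of *p* the tail decays more slowly, so a larger cutoff would be needed for the sums to agree with the closed forms.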

Suppose the probability that a car with a five-year-old battery does not start in cold weather is 0.03. What is the probability that the car starts on 25 consecutive days during a long cold snap?

Model the scenario using a geometric distribution. In this case, the "failure" event is the car starting, and the "success" event is the car not starting. We want to determine the probability of observing 25 failures (the car starting) without observing a single success (the car not starting). The probability of success for each trial (the car not starting in any single attempt) is `p = 0.03`.

To solve, note that the car starts on each of the first 25 attempts exactly when the number of failures before the first success is at least 25. First compute the cumulative distribution function (cdf) at `x = 24`. This returns the probability of observing at most 24 failures before the first success, that is, the probability that the car does not start on at least one of the first 25 attempts. Then subtract this result from `1` to determine the probability of observing no success in the first 25 attempts, in other words, the probability that the car starts at every one of the 25 attempts.

    pstart = 1 - geocdf(24,0.03)

    pstart = 0.4670

The returned result `pstart = 0.4670` is the probability that the car will start every day for 25 days in a row during a cold snap.
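As a cross-check on the complement, the cdf formula gives $$1-F(x|p)=(1-p)^{x+1}$$, the probability that none of the first $$x+1$$ attempts is a success. With $$p=0.03$$:

$$1-F(24|0.03)=0.97^{25}\approx 0.4670,\qquad 1-F(25|0.03)=0.97^{26}\approx 0.4530.$$

So evaluating the cdf at `x = 24` covers 25 consecutive attempts, while evaluating it at `x = 25` covers 26.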

This plot of the cdf for this scenario shows that, as the number of trials (`x`) increases, the probability of having observed a success (`y`) also increases. In the context of this example, it means that the more times you attempt to start the car, the greater the probability that it does not start on at least one of those occasions.

    figure;
    x = 0:25;
    y = geocdf(x,0.03);
    stairs(x,y)
