The probability density function for the generalized Pareto
distribution with shape parameter *k* ≠ *0*,
scale parameter *σ*, and threshold parameter *θ*,
is

$$y = f(x \mid k,\sigma,\theta) = \left(\frac{1}{\sigma}\right){\left(1+k\frac{(x-\theta)}{\sigma}\right)}^{-1-\frac{1}{k}}$$

for *θ* < *x* when *k* > 0, or for *θ* < *x* < *θ* − *σ*/*k* when *k* < 0.

For *k* = 0, the density is

$$y = f(x \mid 0,\sigma,\theta) = \left(\frac{1}{\sigma}\right){e}^{-\frac{(x-\theta)}{\sigma}}$$

for *θ* < *x*.
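As a quick check, the two densities can be evaluated by hand and compared with `gppdf`. This is a minimal sketch: the parameter values are illustrative assumptions, and the *k* = 0 case uses the exponential form (1/*σ*)exp(−(*x* − *θ*)/*σ*).

```matlab
% Sketch: evaluate the generalized Pareto pdf directly from the formulas
% and compare with gppdf. Parameter values are illustrative assumptions.
sigma = 2;
theta = 1;
x = linspace(theta + 0.01, theta + 5, 200);   % points inside the support

% Case k ~= 0 (here k > 0, so the support is x > theta)
k = 0.5;
fManual = (1/sigma) * (1 + k*(x - theta)/sigma).^(-1 - 1/k);
errNonzero = max(abs(fManual - gppdf(x, k, sigma, theta)));

% Case k = 0: exponential-type density shifted by theta
fManual0 = (1/sigma) * exp(-(x - theta)/sigma);
errZero = max(abs(fManual0 - gppdf(x, 0, sigma, theta)));

fprintf('max abs difference, k = 0.5: %g\n', errNonzero);
fprintf('max abs difference, k = 0:   %g\n', errZero);
```

Both differences should be at the level of floating-point round-off.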

If *k* = 0 and *θ* =
0, the generalized Pareto distribution is equivalent to the exponential
distribution. If *k* > 0 and *θ* = *σ*/*k*,
the generalized Pareto distribution is equivalent to the Pareto distribution
with a scale parameter equal to *σ*/*k* and
a shape parameter equal to 1/*k*.
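The two special cases can be checked numerically. The sketch below assumes illustrative parameter values and writes the classical Pareto density out by hand (scale `xm`, shape `alpha`, both names chosen here for illustration) rather than relying on a dedicated Pareto function.

```matlab
% Sketch: numerical check of the exponential and Pareto special cases.
sigma = 2;                                  % illustrative scale

% k = 0, theta = 0: exponential distribution with mean sigma
x = linspace(0.01, 10, 200);
errExp = max(abs(gppdf(x, 0, sigma, 0) - exppdf(x, sigma)));

% k > 0, theta = sigma/k: Pareto with scale sigma/k and shape 1/k
k = 0.5;
xm = sigma/k;                               % Pareto scale (minimum value)
alpha = 1/k;                                % Pareto shape
xp = linspace(xm + 0.01, xm + 20, 200);     % points above the Pareto minimum
paretoPdf = alpha * xm^alpha ./ xp.^(alpha + 1);
errPareto = max(abs(gppdf(xp, k, sigma, xm) - paretoPdf));

fprintf('exponential case, max abs difference: %g\n', errExp);
fprintf('Pareto case, max abs difference:      %g\n', errPareto);
```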

Like the exponential distribution, the generalized Pareto distribution
is often used to model the tails of another distribution. For example,
you might have washers from a manufacturing process. If random influences
in the process lead to differences in the sizes of the washers, a
standard probability distribution, such as the normal, could be used
to model those sizes. However, while the normal distribution might
be a good model near its mode, it might not be a good fit to real
data in the tails and a more complex model might be needed to describe
the full range of the data. On the other hand, only recording the
sizes of washers larger (or smaller) than a certain threshold means
you can fit a separate model to those tail data, which are known as *exceedences*.
You can use the generalized Pareto distribution in this way, to provide
a good fit to extremes of complicated data.

The generalized Pareto distribution allows a continuous range of possible shapes that includes both the exponential and Pareto distributions as special cases. You can use either of those distributions to model a particular dataset of exceedences. The generalized Pareto distribution allows you to "let the data decide" which distribution is appropriate.

The generalized Pareto distribution has three basic forms, each corresponding to a limiting distribution of exceedence data from a different class of underlying distributions.

Distributions whose tails decrease exponentially, such as the normal, lead to a generalized Pareto shape parameter of zero.

Distributions whose tails decrease as a polynomial, such as Student's *t*, lead to a positive shape parameter.

Distributions whose tails are finite, such as the beta, lead to a negative shape parameter.
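One way to explore the three forms is to fit exceedances drawn from one distribution of each class. The sketch below is illustrative only: the sample size, the specific distributions, and the 95th-percentile threshold are all assumptions. Estimates at a finite threshold are noisy, so they approach the limiting behavior only approximately, and the sign of the estimate for the normal sample in particular may fall on either side of zero.

```matlab
% Sketch: fit a generalized Pareto distribution to upper-tail exceedances
% from three classes of underlying distributions.
rng default   % for reproducibility
n = 20000;
samples = {randn(n,1), trnd(3,n,1), betarnd(2,4,n,1)};
names   = {'normal', 'Student''s t (3 dof)', 'beta(2,4)'};

ests = zeros(numel(samples), 2);    % each row: [shape k, scale sigma]
for i = 1:numel(samples)
    s = samples{i};
    u = quantile(s, 0.95);          % threshold: empirical 95th percentile
    exceed = s(s > u) - u;          % exceedances measured from the threshold
    ests(i,:) = gpfit(exceed);
    fprintf('%-22s k = %7.4f, sigma = %7.4f\n', names{i}, ests(i,1), ests(i,2));
end
```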

The generalized Pareto distribution is used in the tails of
distribution fit objects of the `paretotails` class.
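A minimal sketch of such an object follows. The data and the 10% and 90% tail cutoffs are assumptions chosen for illustration.

```matlab
% Sketch: piecewise fit with generalized Pareto tails and an empirical center.
rng default   % for reproducibility
t = trnd(5, 5000, 1);               % illustrative heavy-tailed sample

obj = paretotails(t, 0.1, 0.9);     % GP tails below 10% and above 90%
[p, q] = boundary(obj);             % boundary probabilities and quantiles
upperEst = upperparams(obj);        % [shape, scale] of the upper-tail fit
lowerEst = lowerparams(obj);        % [shape, scale] of the lower-tail fit

pTail = 1 - cdf(obj, 4);            % tail probability from the fitted object
fprintf('upper tail: k = %.4f, sigma = %.4f\n', upperEst(1), upperEst(2));
fprintf('estimated P(T > 4) = %.4g\n', pTail);
```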

If you generate a large number of random values from a Student's *t* distribution
with 5 degrees of freedom, and then discard everything less than 2,
you can fit a generalized Pareto distribution to those exceedences.

    rng default % For reproducibility
    t = trnd(5,5000,1);
    y = t(t > 2) - 2;
    paramEsts = gpfit(y)

    paramEsts =
        0.1445    0.7225

Notice that the shape parameter estimate (the first element)
is positive, which is what you would expect based on exceedences from
a Student's *t* distribution.

    hist(y+2,2.25:.5:11.75);
    h = findobj(gca,'Type','patch');
    h.FaceColor = [.8 .8 1];
    xgrid = linspace(2,12,1000);
    % Scale the fitted pdf by bin width (0.5) times the number of exceedances
    line(xgrid,.5*length(y)*...
        gppdf(xgrid,paramEsts(1),paramEsts(2),2));
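The fitted tail model can also be used to estimate probabilities far beyond the threshold. The sketch below repeats the fit so that it is self-contained, then combines the empirical exceedance rate with the fitted generalized Pareto cdf. The evaluation point `x0 = 4` is an arbitrary choice, and the comparison with `tcdf` is possible only because the data are simulated.

```matlab
% Sketch: estimate P(T > x0) from the fitted generalized Pareto tail.
rng default   % for reproducibility
t = trnd(5, 5000, 1);
u = 2;                              % threshold used for the exceedances
y = t(t > u) - u;
paramEsts = gpfit(y);               % [shape, scale]

pExceed = mean(t > u);              % empirical probability of exceeding u
x0 = 4;                             % illustrative point in the upper tail
pTailGP = pExceed * (1 - gpcdf(x0, paramEsts(1), paramEsts(2), u));
pTailTrue = 1 - tcdf(x0, 5);        % exact value, known for simulated data

fprintf('GP tail estimate of P(T > %g): %.4g\n', x0, pTailGP);
fprintf('exact value from tcdf:         %.4g\n', pTailTrue);
```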

Compute the pdf of three generalized Pareto distributions. The first has shape parameter `k = -0.25`, the second has `k = 0`, and the third has `k = 1`.

    x = linspace(0,10,1000);
    y1 = gppdf(x,-.25,1,0);
    y2 = gppdf(x,0,1,0);
    y3 = gppdf(x,1,1,0);

Plot the three pdfs on the same figure.

    figure;
    plot(x,y1,'-', x,y2,'--', x,y3,':')
    legend({'K < 0' 'K = 0' 'K > 0'});
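For the negative shape parameter the density has a finite upper endpoint at *θ* − *σ*/*k*, which is 4 for `k = -0.25`, `sigma = 1`, and `theta = 0`. The following sketch checks that the pdf is zero beyond that point and integrates to approximately 1 over the support. The evaluation points are arbitrary choices.

```matlab
% Sketch: finite support of the generalized Pareto pdf when k < 0.
k = -0.25;
sigma = 1;
theta = 0;
upper = theta - sigma/k;            % upper endpoint of the support (4 here)

inside  = gppdf(upper - 0.5, k, sigma, theta);   % positive density
outside = gppdf(upper + 0.5, k, sigma, theta);   % zero beyond the endpoint
totalMass = integral(@(x) gppdf(x, k, sigma, theta), theta, upper);

fprintf('upper endpoint: %g\n', upper);
fprintf('pdf inside: %g, pdf outside: %g\n', inside, outside);
fprintf('integral over the support: %.6f\n', totalMass);
```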
