The probability density function for the generalized Pareto distribution with shape parameter k ≠ 0, scale parameter σ, and threshold parameter θ, is
$$y = f(x \mid k,\sigma,\theta) = \left(\frac{1}{\sigma}\right)\left(1 + k\frac{(x-\theta)}{\sigma}\right)^{-1-\frac{1}{k}}$$
for θ < x when k > 0, or for θ < x < θ − σ/k when k < 0.
For k = 0, the density is
$$y = f(x \mid 0,\sigma,\theta) = \left(\frac{1}{\sigma}\right)e^{-\frac{(x-\theta)}{\sigma}}$$
for θ < x.
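As a rough check on the two formulas above, the following sketch codes them by hand and compares the results with `gppdf`. The helper names `gpdens` and `gpdens0` are hypothetical and are not toolbox functions, and the parameter values are arbitrary choices for illustration.

```matlab
% Hand-coded generalized Pareto densities, written directly from the
% formulas above. gpdens and gpdens0 are hypothetical helper names.
gpdens  = @(x,k,sigma,theta) (1/sigma) .* (1 + k.*(x - theta)./sigma).^(-1 - 1/k);
gpdens0 = @(x,sigma,theta)   (1/sigma) .* exp(-(x - theta)./sigma);

k = -0.25; sigma = 1; theta = 0;   % arbitrary illustrative values
% For k < 0 the support is theta < x < theta - sigma/k, here (0, 4),
% so keep the grid strictly inside that interval.
x = linspace(0.01, 3.9, 200);

% Largest absolute difference between the hand-coded densities and gppdf
errNonzero = max(abs(gpdens(x,k,sigma,theta) - gppdf(x,k,sigma,theta)));
errZero    = max(abs(gpdens0(x,sigma,theta)  - gppdf(x,0,sigma,theta)));

% For a very small shape parameter, the k ~= 0 formula should be close
% to the k = 0 (exponential-type) formula.
errLimit = max(abs(gpdens(x,1e-6,sigma,theta) - gpdens0(x,sigma,theta)));
```

All three differences should be small. The last one is limited by how close the chosen shape value is to zero rather than by rounding error.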
If k = 0 and θ = 0, the generalized Pareto distribution is equivalent to the exponential distribution. If k > 0 and θ = σ/k, the generalized Pareto distribution is equivalent to the Pareto distribution.
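The two special cases can be checked numerically. The sketch below assumes the usual Pareto density with shape α = 1/k and scale x_m = σ/k, that is, f(x) = α·x_m^α / x^(α+1) for x > x_m. That parameterization and the variable names are assumptions made for this illustration.

```matlab
sigma = 2;                          % arbitrary scale for illustration

% Case 1: k = 0 and theta = 0 should match the exponential density.
% exppdf is parameterized by its mean, which corresponds to sigma here.
x = linspace(0.5, 10, 200);
errExp = max(abs(gppdf(x,0,sigma,0) - exppdf(x,sigma)));

% Case 2: k > 0 and theta = sigma/k should match the Pareto density.
k = 0.5;
theta = sigma/k;                    % threshold required for the equivalence
alpha = 1/k;                        % assumed Pareto shape
xm = sigma/k;                       % assumed Pareto scale (minimum value)
xp = linspace(theta + 0.1, 20, 200);
paretoPdf = alpha .* xm.^alpha ./ xp.^(alpha + 1);
errPareto = max(abs(gppdf(xp,k,sigma,theta) - paretoPdf));
```

With these values both densities in the second case reduce to 32/x^3, so the two differences should be at the level of rounding error.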
Like the exponential distribution, the generalized Pareto distribution is often used to model the tails of another distribution. For example, you might have washers from a manufacturing process. If random influences in the process lead to differences in the sizes of the washers, a standard probability distribution, such as the normal, could be used to model those sizes. However, while the normal distribution might be a good model near its mode, it might not be a good fit to real data in the tails and a more complex model might be needed to describe the full range of the data. On the other hand, only recording the sizes of washers larger (or smaller) than a certain threshold means you can fit a separate model to those tail data, which are known as exceedences. You can use the generalized Pareto distribution in this way, to provide a good fit to extremes of complicated data.
The generalized Pareto distribution allows a continuous range of possible shapes that includes both the exponential and Pareto distributions as special cases. You can use either of those distributions to model a particular dataset of exceedences. The generalized Pareto distribution allows you to "let the data decide" which distribution is appropriate.
The generalized Pareto distribution has three basic forms, each corresponding to a limiting distribution of exceedence data from a different class of underlying distributions.
Distributions whose tails decrease exponentially, such as the normal, lead to a generalized Pareto shape parameter of zero.
Distributions whose tails decrease as a polynomial, such as Student's t, lead to a positive shape parameter.
Distributions whose tails are finite, such as the beta, lead to a negative shape parameter.
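To see the three cases side by side, one possible experiment is to fit `gpfit` to the upper tail of a sample from each class. The sample size, the 95% threshold, and the beta(2,3) parameters below are arbitrary assumptions. The estimates are noisy, and for the normal distribution in particular the estimate at a finite threshold can differ noticeably from the limiting value of zero.

```matlab
rng default                      % for reproducibility
n = 20000;                       % sample size (an arbitrary choice)
pThresh = 0.95;                  % treat the top 5% of each sample as the tail

samples = {randn(n,1), trnd(5,n,1), betarnd(2,3,n,1)};
names   = {'normal', 'Student t (5 dof)', 'beta(2,3)'};
kEst = zeros(1,3);
for i = 1:3
    s = samples{i};
    u = quantile(s, pThresh);    % empirical threshold for this sample
    exceed = s(s > u) - u;       % strictly positive exceedances over u
    est = gpfit(exceed);         % est = [shape scale]
    kEst(i) = est(1);
    fprintf('%-18s shape estimate k = %6.3f\n', names{i}, kEst(i));
end
```

Raising the threshold moves the fit closer to the limiting regime but leaves fewer points, so the shape estimates become more variable.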
The generalized Pareto distribution is used in the tails of distribution fit objects of the paretotails class.
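A minimal sketch of that usage, assuming Student's t data and tail fractions of 10% on each side (both arbitrary choices for illustration):

```matlab
rng default                        % for reproducibility
t = trnd(5,5000,1);

% Fit generalized Pareto tails below the 10% and above the 90% quantiles;
% the center of the distribution is described by the empirical cdf.
obj = paretotails(t,0.1,0.9);

[p,q] = boundary(obj);             % cumulative probabilities and quantiles at the segment boundaries
lowerTail = lowerparams(obj);      % [shape scale] of the lower-tail fit
upperTail = upperparams(obj);      % [shape scale] of the upper-tail fit

% Evaluate the fitted distribution, including points far out in the tails
xq = icdf(obj,[0.01 0.5 0.99]);
```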
If you generate a large number of random values from a Student's t distribution with 5 degrees of freedom, and then discard everything less than 2, you can fit a generalized Pareto distribution to those exceedences.
    rng default % For reproducibility
    t = trnd(5,5000,1);
    y = t(t > 2) - 2;
    paramEsts = gpfit(y)
    paramEsts =
        0.1445    0.7225
Notice that the shape parameter estimate (the first element) is positive, which is what you would expect based on exceedences from a Student's t distribution.
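    % Histogram of the tail data, shifted back to the original scale
    hist(y+2,2.25:.5:11.75);
    h = findobj(gca,'Type','patch');
    h.FaceColor = [.8 .8 1];
    % Overlay the fitted density, scaled by bin width (.5) times the
    % number of exceedances so that it matches the histogram counts
    xgrid = linspace(2,12,1000);
    line(xgrid,.5*length(y)*...
        gppdf(xgrid,paramEsts(1),paramEsts(2),2));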
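One way such a fit might be used is to estimate a probability far out in the tail. The sketch below repeats the fit so that it is self-contained, then combines the empirical exceedance rate with the fitted generalized Pareto cdf and compares the result with the exact Student's t value. The evaluation point `x0 = 4` and the variable names are assumptions made for this illustration.

```matlab
rng default                        % for reproducibility
t = trnd(5,5000,1);
u = 2;                             % threshold used for the fit
y = t(t > u) - u;                  % exceedances over the threshold
paramEsts = gpfit(y);              % [shape scale]

pExceed = mean(t > u);             % empirical estimate of P(T > u)
x0 = 4;                            % point in the tail (arbitrary choice)

% P(T > x0) = P(T > u) * P(T - u > x0 - u | T > u)
pTailGP = pExceed * (1 - gpcdf(x0 - u,paramEsts(1),paramEsts(2)));

% Exact value from the Student's t distribution, for comparison
pTailExact = 1 - tcdf(x0,5);
fprintf('GP-based estimate: %.4g, exact: %.4g\n', pTailGP, pTailExact);
```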
Compute the pdf of three generalized Pareto distributions. The first has shape parameter k = -0.25, the second has k = 0, and the third has k = 1.
    x = linspace(0,10,1000);
    y1 = gppdf(x,-.25,1,0);
    y2 = gppdf(x,0,1,0);
    y3 = gppdf(x,1,1,0);
Plot the three pdfs on the same figure.
    figure;
    plot(x,y1,'-', x,y2,'--', x,y3,':')
    legend({'K < 0' 'K = 0' 'K > 0'});