Probability Distributions

What Is a Probability Distribution?

Probability distributions are mathematical models that assign probabilities to the possible values of a random variable. They can be used to model experimental or historical data in order to generate predictions, or to analyze a large number of simulated outcomes, as in Monte Carlo simulations.

There are two main types of probability distributions: parametric and nonparametric.

Parametric distributions are probability distributions that can be described using an equation with a finite set of parameters. For a specified parametric distribution, the parameters are estimated by fitting to data. Some common parametric distributions include:

  • Normal (or Gaussian) distribution
  • Weibull distribution
  • Generalized extreme value (GEV) distribution
  • Logistic distribution
  • Copulas (multivariate distributions)
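As a minimal sketch of estimating parameters by fitting to data, the following Python example computes the maximum likelihood estimates for a normal distribution. The function name fit_normal_mle and the synthetic data are assumptions made for illustration; they are not part of any toolbox.

```python
import math
import random


def fit_normal_mle(samples):
    """Return the maximum likelihood estimates (mu, sigma) for a normal model."""
    n = len(samples)
    mu = sum(samples) / n
    # The MLE of the variance divides by n, not by n - 1.
    variance = sum((x - mu) ** 2 for x in samples) / n
    return mu, math.sqrt(variance)


# Synthetic data for illustration only: true mean 10, true standard deviation 2.
rng = random.Random(0)
data = [rng.gauss(10.0, 2.0) for _ in range(5000)]

mu_hat, sigma_hat = fit_normal_mle(data)
print(f"estimated mean = {mu_hat:.3f}, estimated std = {sigma_hat:.3f}")
```

For the normal distribution the estimates have a closed form, so no numerical optimizer is needed. Other families, such as the Weibull distribution, generally require iterative maximization of the likelihood.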

Nonparametric distributions are probability distributions that estimate the probability density function directly from sample data, without assuming a particular functional form. They are preferred when the data cannot be accurately described by a parametric distribution. Some common nonparametric probability distributions include:

  • Kernel distribution
  • Empirical cumulative distribution
  • Piecewise linear distribution
  • Piecewise distribution with Pareto tails
  • Triangular distribution
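To illustrate what estimating a distribution purely from sample data can look like, here is a rough Python sketch of two of the estimators listed above: an empirical cumulative distribution and a kernel density estimate with a Gaussian kernel. The function names, the bimodal synthetic data, and the hand-picked bandwidth are all assumptions for the example.

```python
import math
import random


def empirical_cdf(samples, x):
    """Fraction of samples less than or equal to x."""
    return sum(1 for s in samples if s <= x) / len(samples)


def gaussian_kde(samples, x, bandwidth):
    """Kernel density estimate at x using a Gaussian kernel."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm


# Synthetic bimodal data for illustration only; a single normal would fit it poorly.
rng = random.Random(1)
data = [rng.gauss(-2.0, 0.5) for _ in range(500)]
data += [rng.gauss(3.0, 1.0) for _ in range(500)]

bandwidth = 0.4  # hand-picked assumption; in practice it is chosen from the data
for x in (-2.0, 0.5, 3.0):
    density = gaussian_kde(data, x, bandwidth)
    cumulative = empirical_cdf(data, x)
    print(f"x = {x:5.1f}  density = {density:.4f}  ecdf = {cumulative:.4f}")
```

The bandwidth controls how smooth the estimated density is: smaller values follow the samples more closely, while larger values blur the two modes together.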

Parametric distributions can be fit to data using maximum likelihood estimation. The fitted distributions can then be used for further analysis: computing summary statistics, evaluating the probability density function (PDF) and cumulative distribution function (CDF), and assessing how well the distribution fits the data.
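The workflow described above can be sketched in Python with the standard library's statistics.NormalDist class. The synthetic data and the helper ks_statistic (a Kolmogorov-Smirnov style distance used here as one possible measure of fit) are assumptions for illustration, not a prescribed method.

```python
import random
import statistics

# Synthetic data for illustration only.
rng = random.Random(2)
data = [rng.gauss(50.0, 5.0) for _ in range(2000)]

# pstdev divides by n, which matches the maximum likelihood estimate for a normal model.
fitted = statistics.NormalDist(statistics.fmean(data), statistics.pstdev(data))

# Summary statistics of the fitted distribution.
print(f"mean = {fitted.mean:.3f}, std = {fitted.stdev:.3f}")
print(f"median = {fitted.inv_cdf(0.5):.3f}")

# Evaluate the PDF and CDF at chosen points.
print(f"pdf(50) = {fitted.pdf(50.0):.5f}")
print(f"cdf(55) = {fitted.cdf(55.0):.5f}")


def ks_statistic(samples, cdf):
    """Largest gap between the empirical CDF of the samples and a model CDF."""
    ordered = sorted(samples)
    n = len(ordered)
    gap = 0.0
    for i, x in enumerate(ordered):
        model = cdf(x)
        # The empirical CDF jumps from i/n to (i + 1)/n at x; check both sides.
        gap = max(gap, abs((i + 1) / n - model), abs(model - i / n))
    return gap


ks = ks_statistic(data, fitted.cdf)
print(f"KS distance = {ks:.4f}")
```

A small distance suggests that the fitted CDF tracks the data closely. Turning the distance into a formal test would need critical values that account for the parameters having been estimated from the same data.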

For more information on types of distributions, distribution fitting, visualizing distributions, and generating random numbers, see Statistics and Machine Learning Toolbox™ for use with MATLAB®.

See also: Statistics and Machine Learning Toolbox, machine learning, random number, data fitting, data analysis, mathematical modeling, MANOVA