version 1.1.0.0 (2.25 KB) by
Dahua Lin

The function is to draw samples from an arbitrary discrete distribution.

There are a lot of cases that you might need to sample from a discrete distribution in Monte Carlo simulations. Here are some typical examples that this function may help

(1) you want to sample from a discrete distribution over finitely many categories(labels). However, the pmf is not easy to directly sample from. Then you can just compute p(k) for each category, and then use this function to do the sampling.

(2) For a general (possibly non-parametric) continuous distribution defined in a 1D/2D space, you may discretized the sample space into small regions, and compute the probability mass for these regions, and then use this function to do the sampling.

(3) In many models, the probability distribution is expressed as a weighted sum of several "modes". To sample from this kind of distributions, you may need to first choose which mode to generate the sample according to their weights, then this function may fit in.

In sum, this function can be used directly in simple sampling, and may also be used as a building block of complicated simulation procedure.

The use of this function is easy:

x = discretesample(p, n).

You just input the probability mass, and tell the function how many sample you want to sample, then it returns the samples in form of a 1 x n vector.

Dahua Lin (2021). Sampling from a discrete distribution (https://www.mathworks.com/matlabcentral/fileexchange/21912-sampling-from-a-discrete-distribution), MATLAB Central File Exchange. Retrieved .

Created with
R2008a

Compatible with any release

**Inspired:**
Fast Sampling From A Discrete Distribution, Lynx MATLAB Toolbox

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!Create scripts with code, output, and formatted text in a single executable document.

kammoLuis PelaezAlexander SoderlundPoulomi GanguliHello, I have quantiles from a discrete distribution in an increasing order, q = [0.005, 0.01, 0.05, 0.1, 0.2, 0.167, 0.5, 0.833, 0.95, 0.99, 0.995, 0.999], I want to generate continous version of this distribution from the given quantile values. Will it be possible from this function? I couldn't locate any example given in overview [2], "For a general (possibly non-parametric) continuous distribution defined in a 1D/2D space, you may discretized the sample space into small regions, and compute the probability mass for these regions, and then use this function to do the sampling", which seems very similar to my case.

Rani BarbaraHi,

I have a joint probability distribution of 4 variables. and I want to sample from it using this function. when I pass my 4D array to the function it seems it collapses the 4D array into a one long vector. I get the indexes of the sampled but I am unable to figure out how to take the indexes back to the 4D form to know which combination is sampled.

alternatively I thought to sample the marginals and each 4 samples of the marginals will be one single sample from the joint. is the latter equivalent? or maybe you have some other advise.

Thank you

Narayanan RengaswamyFMTo further Mo Chen's approach: histogram(discretize(rand(1,n),[0;cumsum(p(:))/sum(p)]))

KevinExcellent product. Thank you.

JustinVery helpful! Thx!

ChristosVery useful, saves a lot of time when it comes in large scale simulation problems.

Woojae KimMo Chen, that is brilliant! That should be the fastest way to get the job done in MATLAB. Thanks!

Thomas ClarkVery useful, a good utility function which saves some though when implementing MC models.

Reza FathzadehMo Chenyour function seems to complicate the problem a little bit.

The following line is enough to do the jod

[~,x] = histc(rand(1,n),[0;cumsum(p(:))/sum(p)]);

Carlos BaizDahua LinJos,

I have checked with randp. Though they seem offering similar functionalities, however, the efficiency is drastically different, especially in very large scale monte carlo simulation, say you need to draw thousands or millions of samples from a distribution over thousands or even millions of states, which is not unusual in real engineering applications.

With randp, it would be incur obvious latency when you want to draw thousands of samples from thousands of states, or even run out of memory (thus resulting an empty matrix), as the algorithm implemented by randp is of complexity O(k n), where k is the number of states in the sample space. However, even million-state-level sampling can be accomplished by this function within milliseconds, as its complexity is only O(n logk).

Hope my explanation can clarify the differences between these two files.

Jos (10584)if you'd searched before submitting you would have found RANDP on the FEX. What does this function add?

Dahua LinThere's a bug in the initial submission, which has been fixed, and the new version has been uploaded, which may come out in a day.