File Exchange

## Random Sample from Discrete PDF

version 1.0.0.0 (2.07 KB) by Joshua Stough

### Joshua Stough (view profile)

Inverse transform sampling to generate random sample from pdf given by domain x and p(x).

Updated 01 Aug 2012

Generate random samples given a pdf. The x and p(x) are specified in the
arguments. See inverse transform sampling, gaussdis, gammadis.

Examples:
X = pdfrnd(0:.1:20, gammadis(3,2, 0:.1:20), 100000);
figure, hist(X, 100);
%X will be 100000 samples from gamma distribution with a/k = 3, b/Theta =
%2.

X = pdfrnd(-10:.1:10, gaussdis(2, 3, -10:.1:10), 100000);
figure, hist(X, 100);
%X will be 100000 samples from a gaussian distribution with mean 2, var 3.
%Should be roughly equivalent to X = sqrt(3)*randn(100000, 1) + 2;

### Cite As

Joshua Stough (2020). Random Sample from Discrete PDF (https://www.mathworks.com/matlabcentral/fileexchange/37698-random-sample-from-discrete-pdf), MATLAB Central File Exchange. Retrieved .

Are Mjaavatten

### Are Mjaavatten (view profile)

The cdf(1)>0 problem noted by Chavas can lead to significant errors in some cases. Try e.g,:
x = 0:.1:5; X = pdfrnd(x, gammadis(1,1, x), 1e5); hist(X,100);
Here all random samples less that cdf(1) = 0.0957 give X = 0.
I recommend instead the File Exhange entry RANDRAW by Alex Bar-Guy. For comparison, try:
X = randraw('gamma',[0,1,1],1e5);hist(X(X<5),100);

Martin Eisemann

### Martin Eisemann (view profile)

Maybe I am mistaken here, so please correct me if I am wrong, but wouldn't this help?

cdf = cumsum(px);
cdf = cdf - min(cdf);
cdf = cdf/max(cdf);

rnd = rand(sampleSize, 1);
X = interp1(cdf, x, rnd, 'linear', 0);

Joshua Stough

### Joshua Stough (view profile)

Response to Chavas:
Thank you so much for your comment. The problem as I see it is that with the assumption of 0, it is possible to generate samples from outside of the provided domain x. Even for the examples I provide, cdf(1) > 0 and so invalid samples can be generated. However, your proposed solution could also generate such samples. Perhaps to account for binning in x, x(1) - dx/2 should be possible as a minimum, and x(end) + dx/2 as a max (assuming equal sampling of the domain). The other option is to assume the user provides precisely the x(1) and x(end) they want to sample within. I will post new code soon. Thank you again.

Dan Chavas

### Dan Chavas (view profile)

Whoops, there is a problem: if you input a pdf (px) such that px(1)>0, then cdf(1)>0. However it is possible for rnd < cdf(1) ; in such a case, interp1 returns NaN, which you've forced to be 0. This is wrong.

Consider adding this code to ensure the cdf begins at 0:

%%Account for non-zero probability at start
cdf = cumsum(px);
if(cdf(1)>0)
cdf = [0 cdf];
dx = x(2) - x(1);
x0 = x(1) - dx;
x = [x0 x];
end

rnd = rand(sampleSize, 1);

X = interp1(cdf, x, rnd, 'linear');

Dan Chavas

### Dan Chavas (view profile)

Just wrote this code myself to apply specifically to a power-law distribution. Then I found this file for arbitrary pdfs, tried it out and they match! Simple but very useful, this should be included as a standard MATLAB function.

##### MATLAB Release Compatibility
Created with R2010a
Compatible with any release
##### Platform Compatibility
Windows macOS Linux