# Using a sample PDF to generate random numbers

I have generated a sample probability density function using:
[PDF,x] = ksdensity(dataset);
where 'dataset' represents de-trended noise that does not follow any obvious distribution.
I can plot my PDF using the plot function:
plot(x,PDF)
and this produces a nice curve showing my probability density function.
I would now like to use this PDF as the basis for a random number: i.e. I want to be able to make a vector of random numbers that follow this probability density function.
I have been trying to do this in a brute force way using the unifrnd(a,b) function, but with no luck as yet. Is there a more elegant way to do this?

Walter Roberson on 21 Jan 2011
There have been a number of discussions in the newsgroup recently about this, but for continuous distributions.
The solution for the continuous version is to integrate the PDF to get a CDF, then find the inverse of the CDF and evaluate it at uniformly distributed random values.
Based upon this, I would suggest you use cumsum() to produce the discrete CDF from your discrete PDF (normalized so that it ends at 1), and then invert it with interp1(): pass the CDF values as the sample points, the x values the PDF was sampled at as the corresponding values, and your array of rand() numbers as the query points. You will need to think a bit about what kind of interpolation is best suited to the situation.

TideMan on 21 Jan 2011
Before interpolating, you will need to run unique on the CDF to remove duplicate values, because the interpolation requires distinct sample points.
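The inverse-CDF approach described above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the `dataset` values are a made-up stand-in for the de-trended noise, and `cumtrapz` is used in place of the suggested `cumsum` so that the CDF starts at exactly 0 and every `rand` value falls inside the interpolation range.

```matlab
% Hypothetical stand-in for the de-trended noise in the question.
rng(0);                                   % fixed seed, only for reproducibility
dataset = [randn(500,1); 3 + 0.5*randn(500,1)];

[PDF, x] = ksdensity(dataset);            % density estimate on a grid of x values

% Integrate the sampled PDF to get a CDF, then normalize so it ends at 1.
% (cumtrapz is an assumption here; the thread suggests cumsum.)
CDF = cumtrapz(x, PDF);
CDF = CDF / CDF(end);

% interp1 needs distinct sample points, so drop repeated CDF values.
[CDFu, iu] = unique(CDF);
xu = x(iu);

% Inverse-transform sampling: look up x at uniformly distributed CDF levels.
N = 10000;
u = rand(N,1);
samples = interp1(CDFu, xu, u, 'linear');
```

Plotting `ksdensity(samples)` over the original curve is a quick visual check. Note that the samples are confined to the range of `x` returned by `ksdensity`, so the far tails are truncated.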
I would really appreciate it if you could write out an example showing the right way to use the code. I have data called 'apr13' and I need to generate random numbers from it using a kernel distribution. It would really help to see how this is done.
Tom Lane on 10 Jul 2013
If you have a relatively recent release of MATLAB with the Statistics Toolbox, check out the fitdist function. Then call the random function on the result.
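As a rough sketch of that suggestion, assuming the Statistics Toolbox is available: the `apr13` values below are a made-up placeholder for the real data, and `N` is an arbitrary sample count.

```matlab
% Hypothetical placeholder for the poster's 'apr13' data.
rng(0);                          % fixed seed, only for reproducibility
apr13 = 10 + 2*randn(200,1);

% Fit a kernel distribution object to the data.
pd = fitdist(apr13, 'Kernel');

% Draw N random numbers from the fitted kernel distribution.
N = 1000;
r = random(pd, N, 1);
```

The fitted object `pd` can also be passed to `pdf` or `cdf` to compare the result against a histogram of the data.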

Kiran Ps on 25 Jan 2019
One line solution would be:
randomNumbers = ksdensity(dataset, rand(N,1), 'function', 'icdf');

#### 1 Comment

Atmaram Muraleedharan on 1 May 2019
Thanks a lot Kiran

Tom Lane on 21 Jan 2011
A kernel smooth density is a little normal or similar curve centered at each data point, with all of them summed up. The simplest way to generate random values from this density is to randomly select one of your data points, then add a little random noise to it using the kernel and bandwidth from the kernel smooth density.
If you create the density using the fitdist function to create a probability distribution object, then use the random method for that object, the result is computed in a way that essentially is what I described in the previous paragraph.

#### 1 Comment

Walter Roberson on 21 Jan 2011
What distribution of random noise would you have to add in order to adequately match the distribution of selecting "between" data points?

Tom Lane on 23 Jan 2011
The distribution is the "kernel" you used. By default it is normal. Draw the noise from that kernel and multiply it by the bandwidth, which is the third output from ksdensity.
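A minimal sketch of this resample-and-perturb idea, assuming the default normal kernel; the `dataset` values are a hypothetical stand-in and `N` is an arbitrary sample count.

```matlab
% Hypothetical stand-in for the de-trended noise in the question.
rng(0);                                   % fixed seed, only for reproducibility
dataset = [randn(500,1); 3 + 0.5*randn(500,1)];

% The third output of ksdensity is the bandwidth it chose.
[~, ~, bw] = ksdensity(dataset);

% Pick data points at random (with replacement), then add kernel noise:
% standard normal draws scaled by the bandwidth.
N = 10000;
idx = randi(numel(dataset), N, 1);
samples = dataset(idx) + bw * randn(N,1);
```

Unlike interpolating a gridded CDF, this draws from the kernel density directly, so the samples are not limited to the range of the evaluation grid.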