## Using a sample PDF to generate random numbers

on 21 Jan 2011

I have generated a sample probability density function using:

```
[PDF,x] = ksdensity(dataset);
```

where 'dataset' represents de-trended noise that does not follow any obvious distribution.

I can plot my PDF using the plot function:

```
plot(x,PDF)
```

and this produces a nice curve showing my probability density function.

I would now like to use this PDF as the basis for a random number generator, i.e. I want to be able to make a vector of random numbers that follow this probability density function.

I have been trying to do this in a brute force way using the unifrnd(a,b) function, but with no luck as yet. Is there a more elegant way to do this?

### Walter Roberson

on 21 Jan 2011

There have been a number of discussions in the newsgroup recently about this, but for continuous distributions.

The solution for the continuous version is to integrate the PDF to get a CDF, then find the inverse of the CDF and evaluate that at the random value.

Based on this, I would suggest you use cumsum() to produce the discrete CDF from your discrete PDF, normalized so that its last value is 1. Then invert it with interp1(): pass the CDF values as the sample points and the x values at which the PDF was sampled as the corresponding values, and interpolate at your array of rand() numbers. You will need to think a bit about what kind of interpolation is best suited to the situation.

### TideMan

on 21 Jan 2011

Before interpolating, you will need to run unique on the CDF to remove duplicate values.
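
The cumsum / unique / interpolation steps described above could be sketched roughly as follows. This is only an illustration: the bimodal `dataset` is made-up stand-in data, `interp1` with linear interpolation is one possible choice, and dropping the few draws that fall below the first CDF value is a simplification rather than part of the original suggestion.

```matlab
% Sketch: inverse-CDF sampling from a ksdensity estimate.
% 'dataset' is synthetic stand-in data (an assumption for illustration).
rng(0);                                   % reproducible example
dataset = [randn(500,1); 3 + 0.5*randn(500,1)];

[PDF, x] = ksdensity(dataset);            % density evaluated on a grid of x values

CDF = cumsum(PDF);                        % discrete CDF (not yet normalized)
CDF = CDF / CDF(end);                     % scale so the CDF ends at 1

[CDF, idx] = unique(CDF);                 % interp1 needs distinct sample points
x = x(idx);                               % keep the matching x values

u = rand(10000, 1);                       % uniform random draws
samples = interp1(CDF, x, u, 'linear');   % evaluate the inverse CDF at u
samples = samples(~isnan(samples));       % drop draws below CDF(1), outside the table
```

Comparing `hist(samples, 50)` against `plot(x, PDF)` is a quick sanity check that the draws follow the estimated density.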

on 9 Jul 2013

I would really appreciate it if you could write out an example showing the right way to use the code. I have data called 'apr13' and I need to generate random numbers from it using a kernel distribution. It would really help to see how this is done.

### Tom Lane

on 10 Jul 2013

If you have a relatively recent release of MATLAB with the Statistics Toolbox, check out the fitdist function. Then call the random function on the result.
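
A minimal sketch of that route, assuming the Statistics Toolbox is available. The contents of `apr13` here are placeholder values invented for the example, not the poster's actual data, and the data is assumed to be a column vector.

```matlab
% Sketch: fit a kernel distribution object and draw random numbers from it.
% 'apr13' is filled with placeholder data (an assumption for illustration).
rng(0);                                          % reproducible example
apr13 = [randn(500,1); 3 + 0.5*randn(500,1)];    % column vector of observations

pd = fitdist(apr13, 'Kernel');   % kernel distribution (normal kernel by default)
r  = random(pd, 1000, 1);        % 1000-by-1 vector of draws from the fitted density
```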

### Tom Lane

on 21 Jan 2011

A kernel-smoothed density is built by centering a little normal (or similar) curve at each data point and summing them all up. The simplest way to generate random values from this density is to randomly select one of your data points, then add a little random noise to it using the kernel and bandwidth from the kernel-smoothed density.

If you create the density using the fitdist function to create a probability distribution object, then use the random method for that object, the result is computed in a way that essentially is what I described in the previous paragraph.

### Walter Roberson

on 21 Jan 2011

What distribution of random noise would you have to add in order to adequately match the distribution of selecting "between" data points?

### Tom Lane

on 23 Jan 2011

The distribution is the "kernel" you used. By default it is normal. Draw the noise from that kernel and multiply it by the bandwidth, which is the third output from ksdensity.
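
Put together, the resample-and-add-noise approach might look something like the sketch below. It assumes the default normal kernel, and `dataset` is synthetic stand-in data made up for the example.

```matlab
% Sketch: sample from a kernel density by resampling the data and adding
% kernel noise scaled by the bandwidth (default normal kernel assumed).
rng(0);                                           % reproducible example
dataset = [randn(500,1); 3 + 0.5*randn(500,1)];   % stand-in data (assumption)

[~, ~, bw] = ksdensity(dataset);      % third output is the bandwidth

n = 10000;
idx = randi(numel(dataset), n, 1);    % pick data points at random, with replacement
samples = dataset(idx) + bw * randn(n, 1);   % add normal noise scaled by the bandwidth
```

Unlike interpolating a tabulated CDF, this needs no grid, so the draws are not limited to the range of x values returned by ksdensity.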