This function extracts a scalar/vector/matrix of random numbers with discrete Probability Distribution Function.
The PDF is specified by the user as a input vector.
This function is designed to be fast, and it is implemented within a .mex file
Following Olivier B. comments (that I acknowledge for his comments), I performed cross-comparisons with randp. gDiscrPdfRnd is faster with a ratio that increases with the number of number, i.e. for about 3 times faster for 10^6 numbers to over 40 times faster for 10^7 numbers.
Moreover, for large random arrays, randp seriously surcharges the RAM memory, whereas gDiscrPdfRnd limits thememory use to what is essential (tanksto the coding). In what follows the details of thecomparison are given.
>> tic;R = randp([1 3 2],1000000,1);toc
>> tic;R = gDiscrPdfRnd([1 3 2],1000000,1);toc
>> tic;R = randp([1 3 2],10000000,1);toc
>> tic;R = gDiscrPdfRnd([1 3 2],10000000,1);toc
Thanks for sharing this file. I suggest you replace the hardcoded 32767 by RAND_MAX... Otherwise it works like a charm.
I've tried this small experiment:
N = 100000; % sample size
w = 1:100; % quite steep distribution
w = w / sum(w); % distribution (pdf)
Y_std = sqrt(w .* (1 - w)); % Standard deviations associated to each probability
Y1 = randp(w, N, 1); % Try Jos function
Y1_w = histc(Y1, [1:100]); % absolute frequencies table
Y1_w = Y1_w / N; % transform into relative frequencies
Y1_w_std = sqrt(Y1_w .* (1 - Y1_w)); % Empirical standard deviations associated to each probability
Y2 = gDiscrPdfRnd(w, N, 1); % Try Gianluca Dorini's function
Y2_w = histc(Y2, [1:100]); % absolute frequencies table
Y2_w = Y2_w / N; % transform into relative frequencies
Y2_w_std = sqrt(Y2_w .* (1 - Y2_w)); % Empirical standard deviations associated to each probability
% Check if empirical standard deviations are coherent with theretical ones
plot(Y_std - Y1_w_std', '.b') % plot empirical differences for Jos function
plot(Y_std - Y2_w_std', '.r') % plot empirical differences for Gianluca Dorini's function
It looks as though both functions statistically behave very well. They are both excellent functions. Thank you.
Nice, works pretty fast. I will use it for MC simulation based on empirical distribution.
Look at RANDP for a fast more matlabish approach:
There is no c file ?