Performance of random number generator

I'm trying to generate a lognormal random number that is truncated, and the only way that I've seen that I can do that is by first fitting the distribution to the data: pd = fitdist(data,'lognormal');
then truncating it: t = truncate(pd,minw,maxw);
and then creating the random numbers with: r = random(t,1,10000);
The problem I have is that I'm doing this for 100 different values of mu-sigma-pairs, and doing 10,000 simulations, and it is taking forever. I started running the code this morning (10 am) in Matlab R2014b in a server and 8 hours later it is still not done.
Before, I was doing this without truncating the values by using R = lognrnd(mu,sigma), and it only took 1.5 to 2 hours to run the exact same code.
What can I do to make it faster? Is there a way to truncate the random number while using lognrnd?
Thanks for your help

7 Comments

It would be helpful if you could post your code, or at least a distilled version that helps us understand the slow part.
Hi the cyclist
You can see a distilled version of my code below, it is self contained except for the arrays "data", "Data2" and "lambda" which I load into Matlab. "data" is a (500,1) vector of wind speeds, "Data2" is a (n,2) matrix with a linear trend that specifies how the parameters of the Wind distribution shift over time, and "lambda" is a (i,1) vector with different values of lambda for a Poisson distribution.
The performance problem is in populating the last Array "W" with the random number generator. I had to restart my code last night and it's been 14 hours and it is still running.
Using the previous version of populating "W" (last 7 lines of the code) it only used to take 1.5 to 2 hours to run the code.
I just need to find a faster way to obtain a lognormal random number that is truncated.
Thanks
Sebastian
%%Setting the random number stream seed for reproducibility of results
stream0 = RandStream('mt19937ar','Seed',1);
RandStream.setGlobalStream(stream0);
%%General Parameters
n = 96; % Time period
runs = 10000; % Number of simulations
minw = 34; % Minimum
maxw = 220; % Maximum
gamma = 0.035;
i = 20;
% S array from Poisson distribution
S = zeros(n,1,i);
for j = 1:i;
S(:,1,j) = poissrnd(lambda(j,1),n,1);
end;
S = repmat(S,1,runs);
% Lognormal fitting
% mu = col1 | sigma = col2
pd = fitdist(data,'lognormal');
[mp, np] = size(pd.Params);
Wparams = zeros(n,np);
Wparams(1,:) = pd.Params;
for j = 2:n;
Wparams(j,1) = Wparams(1,1)*(1+gamma*Data2(j,2));
Wparams(j,2) = Wparams(1,2)*(1+gamma*Data2(j,2));
end;
% Truncate the W Distribution
t = truncate(pd,minw,maxw);
%Array for W
Ms = max(max(max(S)));
W = zeros(n,Ms,runs,i); %W
%THIS IS THE PROBLEM
for z = 2:n;
for j = 1:runs;
for l = 1:i;
t.mu = Wparams(z,1);
t.sigma = Wparams(z,2);
W(z,1:S(z,j,l),j,l) = random(t,1,S(z,j,l));
end;
end;
end;
%This is what I used before (without truncation and it was way faster)
for z = 2:n;
for j = 1:runs;
for l = 1:i;
W(z,1:S(z,j,l),j,l) = lognrnd(Wparams(z,1),Wparams(z,2),1,Storms(z,j,l));
end;
end;
end
What do you mean by truncation? Are you simply trying to toss out all random draws that are, say, above the 95th percentile or below the 5th percentile?
Or are you trying to truncate the numerical values themselves, like truncating 2.41223 to 2.4?
Either way this should be pretty easy to accomplish after making your random draws from the lognormal distribution.
R=lognrnd(mu,sigma,1,10000);
Hi Kirby,
What I'm trying to do is have a lognormal random number that falls between 34 and 220. I agree that is easy to replace any number that is bigger than 220 by 220 by doing something like
W(W > maxw) = maxw;
The problem is that then results in too many outcomes with the value 220, which is not what I want. I could also replace them with zero but that would also result in a lower mean than expected which is also not what I need.
The correct way is to truncate the distribution a priori so that any draw can only fall within the limit. The problem is that this approach is too slow.
Any suggestions?
Thanks
Sebastian
Instead of assigning a value to numbers outside of your desired range, redraw for those.
replaceIdx = (W > maxw)|(W < minw);
while sum(replaceIdx)>0,
W(replaceIdx) = lognrnd(mu,sigma,1,sum(replaceIdx));
replaceIdx = (W > maxw)|(W < minw);
end
Another solution is to draw way more than you need and keep a random subset that is inside the truncation range.
Hi Kirby,
Thank for this suggestion, I think this might work, I'm gonna give it a try. :)
Thanks
Sebastian
Kirby,
Thanks a lot, your suggestion worked very well, I just needed to tweak it a bit for my 4-D array.
Thanks for the help
Sebastian

Sign in to comment.

Answers (1)

total_samples_needed = 10000;
have_samples = [];
while true
num_needed_now = total_samples_needed - length(have_samples);
if num_needed_now <= 0; break; end
current_samples = RANDOMGENERATOR(1,num_needed_now);
current_samples(current_samples < Lower_Bound | current_samples > Upper_bound) = [];
have_samples = horzcat(have_samples, current_samples);
end
If you want to make it more efficient, you can ask it to generate 1.1 (or as appropriate) times num_needed_now and at the end discard any unneeded ones you generated. An appropriate multiplication factor would be 1 divided by the cdf between Lower_Bound and Upper_Bound.
RANDOMGENERATOR would be replaced by the appropriate call for your purposes.

2 Comments

Hi Walter,
Thanks for taking the time to help me. I think what you are suggesting would work in some other cases, but in my case the parameters mu and sigma of the distribution change over time so I have to adjust for that. Bu you've given me some idea of how to move forward.
Thanks
Sebastian
Just have the current mu and sigma passed to RANDOMGENERATOR . As long as they do not have to change within a run of total_samples_needed there is no problem.

Sign in to comment.

Categories

Find more on Random Number Generation in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!