how to generate random numbers with constraints?

Question

alex brown on 29 Apr 2019

0
Link

Direct link to this question

https://www.mathworks.com/matlabcentral/answers/459104-how-to-generate-random-numbers-with-constraints

Commented: alex brown on 30 Apr 2019

Hi, I want to generate 24 random numbers whose sum is limited between [3550 3650]. in addition each number must be between [5 800].

here is my try:

ub=800*ones(1,24);
lb=5*ones(1,24);
a=lb+rand(1,24).*(ub-lb);
while sum(a)>3650 || sum(a)<3550
    a=lb+rand(1,24).*(ub-lb);
end

but it stucks in the loop forever. please help me...

6 Comments
Show 4 older commentsHide 4 older comments

John D'Errico on 29 Apr 2019

The data that I generated IS uniform. It is uniform over the domain of the triangular region where the data lives. It is uniform over the set that satifies the sum constraint.

The marginal distribution has a triangular PDF. It is NOT a gamma. There is no way that you can have data that is BOTH uniform over the constraint domain, as well as being uniform in any of the individual variables.

If you are looking for data that is truly uniform in each of the variables, yet has a sum that falls within some bounds? You can't have that.

Consider the example I gave. I chose two variables, x and y, uniform over the interval [0,1]. Now, look at the sum x+y. Can the variables be truly independently uniform over the domain, yet have a sum that is less than 1? So I'm not sure what you are asking to produce anymore. What is the final goal here?

alex brown on 30 Apr 2019

I tested so many conditioned random numbers. I think you are right.

In fact when we choose the first number we limit the second number interval ,etc. Therfore the next number distribution can't be uniform

thank you

Sign in to comment.

Sign in to answer this question.

Answer 1

John D'Errico on 29 Apr 2019

2
Link

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/459104-how-to-generate-random-numbers-with-constraints#answer_372739

Edited: John D'Errico on 29 Apr 2019

Open in MATLAB Online

This is not simple, if you want to find a truly uniform set. In 24 dimensions, things get fairly complicated. And the domain the numbers can live in is actually pretty wide, thus the interval [5,800].

So, if you have all numbers at or near their mininum,

5*24
ans =
   120
>> 800*24
ans =
       19200
>> (5 + 800)/2*24
ans =
        9660

The point being that it will be a quite rare event that 24 numbers, chosen randomly from that interval, will have a sum in the interval [3550,3650].

The law of large numbers tells us that the sum of that set will be effectively normally distributed. And, in fact, this gives us a solution to the problem!

Lets see. Each member of that set will be uniform, on the interval [5,800]. That means the mean of x_i will be

a = 5;
b = 800;
(a + b)/2
ans =
        402.5

And the variance (of each number in the set) will be

(b-a)^2/12
ans =
     52668.75

So now the sum of 24 of them will be fairly accurately normally distributed, with mean, variance, and standard deviation:

sumMean = (a+b)/2*24
xmean =
        9660
sumVar = (b-a)^2/12*24
xvar =
     1264050
sumStd = sqrt(sumVar)
xstd =
          1124.29978208661

So, now what are the odds that you will find a sum in the range from [3550,3650]? That would be found from a cumulative normal.

sumLimits = [3550,3650];
normcdf(sumLimits,sumMean,sumStd)
ans =
      2.74761306010529e-08      4.50716095152589e-08
      
diff(ans)
ans =
       1.7595478914206e-08

Hmm. So you will probably need to generate roughly 1e8 sets of 24 uniform numbers before you find a single set that lives in the interval [3550,3650].

But! As I said, this actually gives you a near trivial solution! How? We break the problem down into two simpler problems. Hard problems are hard. Easy problems are way easier. So lets create two simpler problems.

First, can we sample from a truncated normal distributiuon? That part is easy.

norminv(sumNormProbLimits(1) + rand()*diff(sumNormProbLimits),sumMean,sumStd)
ans =
          3637.00633112408

As you can see, that is a number that has atuncated normal distribution, with the desired parameters. So now we can just use randfixedsum. I'll put it all together here:

a = 5;
b = 800;
ndim = 24;
sumMean = (a+b)/2*ndim;
sumVar = (b-a)^2/12*ndim;
sumStd = sqrt(sumVar);
sumLimits = [3550,3650];
sumNormProbLimits = normcdf(sumLimits,sumMean,sumStd);
S = norminv(sumNormProbLimits(1) + rand()*diff(sumNormProbLimits),sumMean,sumStd);
randfixedsum(24,1,S,a,b)
ans =
          32.6666351919248
          226.171574450247
          26.4486244673284
          59.5273902733376
          105.140425145937
          198.085619814426
          313.588691531256
          125.527412419838
          222.108869160278
          78.6355991625287
          116.832965448386
          25.2315920517267
          10.7762636314948
          295.321051677379
          339.225232904641
          91.7004913938697
          29.5858150543207
          184.827862803465
          37.7525635267701
          115.866539562388
          512.657889835195
          372.544944227602
          13.6087483202661
           58.868132270981

So, a set of 24 uniformly distributed random numbers that have their sum in the desired interval. The numbers really are uniformly distributed to lie in the interval [5,800], but the condition that the sum be so relatively small, makes it unlikely that any of them are near the maximum. On this random set, the max was as large as 512. That seems reasonable to me, under the conditional sampling that was required.

sum(ans)
ans =
          3592.70093432559

It was really pretty easy to do. I did need randfixedsum, which you can get from the File Exchange for free download.

https://www.mathworks.com/matlabcentral/fileexchange/9700-random-vectors-with-fixed-sum

2 Comments
Show NoneHide None

alex brown on 29 Apr 2019

thank you for your time John D'Errico, it was really helpful. I had doubt about randfixedsum because I didn't know about the solutions distribution. I need the answers to be uniformly distributed. I have to run the code for 100 times and analyze the distribution of the 24*100 answers to find out the distribution of them.

John D'Errico on 29 Apr 2019

Ok, then you are a little better off using my solution, though the difference is not that huge.

For example, in my solution, you should expect to see more samples with a sum near 3650 than near 3550. The differential (based on the PDF at those points) should be not quite twice as often. In Rik's solution, the two ends of that interval on the sum will be equally likely. Is this s huge difference? Perhaps not. It very much depends on what you are doing with the results. Since you are actually testing things to look at the distributions, you will prefer the more technically correct solution.

A very important thing to note however, is that a sample size of only 100 such sets is tiny. TINY. MINISCULE. In fact, when you are looking at things like this, I would be looking at millions of such samples at the least.

Sign in to comment.

Answer 2

Rik on 29 Apr 2019

2
Link

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/459104-how-to-generate-random-numbers-with-constraints#answer_372738

Edited: Rik on 29 Apr 2019

Open in MATLAB Online

The reason for your infinite loop is that you need an average random value of 150, which is far from the expected value of 402.5. That means you need quite an extreme situation to land on your required vector.

You can use randfixedsum from the FEX. This function takes a more mathematical approach, which means that it doesn't need to reject vectors that don't satisfy the constraints.

%output matrix size
n=24;m=1;
%generate a random sum between [3550 3650]
s_bounds=[3550 3650];s=rand*diff(s_bounds)+s_bounds(1);
%bounds of values themselves
a=5;b=800;
%generate the values
x = randfixedsum(n,m,s,a,b);
%transpose to move from 24x1 to 1x24
x=x';

6 Comments
Show 4 older commentsHide 4 older comments

Rik on 29 Apr 2019

Thank you for clarifying. Statistics aren't my strong suit, so questions like this reach my limit.

alex brown on 29 Apr 2019

Thank you Rik for the answer. Now I am looking for the distribution of the 24 answers generated by the randfixedsum. I think its distribution is not uniform. And it is a problem.

Sign in to comment.

how to generate random numbers with constraints?

6 Comments
Show 4 older commentsHide 4 older comments

Accepted Answer

2 Comments
Show NoneHide None

More Answers (1)

6 Comments
Show 4 older commentsHide 4 older comments

See Also

Categories

Tags

Community Treasure Hunt

how to generate random numbers with constraints?

6 Comments Show 4 older commentsHide 4 older comments

Accepted Answer

2 Comments Show NoneHide None

More Answers (1)

6 Comments Show 4 older commentsHide 4 older comments

See Also

Categories

Tags

Community Treasure Hunt

6 Comments
Show 4 older commentsHide 4 older comments

2 Comments
Show NoneHide None

6 Comments
Show 4 older commentsHide 4 older comments