Sum of the random array cannot reach the upper limit sum

lb=[56 65 21 21 16 7 31 32 35 32 17 40 79 114 104 69 56 41 16 16 11 9 10 19 35 36 19 29 34 14 19];
ub=[1171 1303 440 440 330 87 675 650 720 640 350 220 2200 2100 2100 1400 1139 837 405 220 240 269 220 730 710 400 600 661 150 105.5 330];
randomArray = lb + rand(1,31).*(ub-lb);
intArray=floor(randomArray);
Pgen=sum(intArray);
I am using this code to produce a random array. The problem is that the sum of intArray tops out at about 13000: I have run the code many times and the sum never gets higher than that. Why is this happening? I would like to produce a random array whose sum can reach 18000 to 21000, since the total of the upper limits is about 21000.

Accepted Answer

John D'Errico on 30 Nov 2018
Edited: John D'Errico on 30 Nov 2018
Let me try to explain. You are trying to do this:
lb=[56 65 21 21 16 7 31 32 35 32 17 40 79 114 104 69 56 41 16 16 11 9 10 19 35 36 19 29 34 14 19];
ub=[1171 1303 440 440 330 87 675 650 720 640 350 220 2200 2100 2100 1400 1139 837 405 220 240 269 220 730 710 400 600 661 150 105.5 330];
randomArray = lb + rand(1,31).*(ub-lb);
So think of it as a hyper rectangle, in 31 dimensions. You are trying to sample numbers from that essentially rectangular domain. Then you will form the sum, and you are surprised that it never seems to happen that you approach the upper limit.
sum(ub)
ans =
21842
So, it seems entirely possible that occasionally, it just might happen that the sum of those numbers is close to the sum of the upper bounds. But this is where you are failing to understand the mathematics.
In order for that sum to be near the sum of the upper bounds, what must happen? Let's try this in two dimensions first; then we can try to understand what happens when we move into higher dimensions.
Suppose I asked for the sum of two uniform random numbers that each lie in the interval [0,1]. Pretty easy, right? What is the probability that the sum lies close to 2? More exactly, what is the probability that the sum is between 1.9 and 2, so greater than 1.9?
What must happen for that event to transpire? THINK ABOUT IT! We would need the point (x,y) to lie in the region of the unit square where x+y>=1.9. Let's try it first, and then think again.
xy = rand(1000000,2);
sum(sum(xy,2)>1.9)
ans =
4916
So, in a sample size of 1 million events, only 4916 of them sum to close to the sum of the upper bounds. And that is only in 2 dimensions. What happened? Why?
xy = rand(10000,2);
k = sum(xy,2) > 1.9;
plot(xy(:,1),xy(:,2),'b.',xy(k,1),xy(k,2),'ro')
axis square
A picture can sometimes be worth a lot. For the sum to exceed 1.9, we need a point to lie above the line x+y=1.9. That region is a tiny triangle in the upper corner of the unit square.
What is the area of that triangle? It is an isosceles right triangle with legs of 0.1, so it has area 0.1^2/2.
0.1^2/2
ans =
0.005
And since the area of the entire square is 1, the probability that a point will lie in the upper triangle is 0.005/1. So in a million samples, I would expect only about 5000 of them to exceed 1.9 as a sum. In the large sample I ran, I came pretty close to that, with 4916.
Ok, but now what happens in 31 dimensions? It gets worse. In fact, hellishly so. The curse of dimensionality hits hard here.
Suppose we asked what the odds are that the sum of 3 random numbers, all within [0,1], lies above, say, 2.9? The corner region is now a small tetrahedron, and its relative volume gives a probability of only
0.1^3/6
ans =
0.00016667
So pretty small. Your problem has upper and lower bounds that are not all the same, so it gets a little messier. But suppose you ask a similar question about a set of points in a unit 31-dimensional hyper-cube?
I'll even relax things a bit first. What is the probability that the sum of 10 numbers in a 10-dimensional hyper-cube exceeds 9?
sum(sum(rand(1e6,10),2)>9)
ans =
0
sum(sum(rand(1e7,10),2)>9)
ans =
2
Gosh, in a sample size of 1e7, only 2 of them fell in that corner? As it turns out, the hyper-volume of the corresponding simplex is only
1/factorial(10)
ans =
2.7557e-07
So my expectation is that, indeed, only about 2 or 3 events will happen in a sample size of 1e7.
Likewise, in 31 dimensions, that hyper-volume gets really small.
1/factorial(31)
ans =
1.2161e-34
So the probability that a sum of 31 numbers, all in [0,1] will exceed 30 is a massively tiny number.
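The corner volumes quoted so far all follow one pattern. For n independent uniform numbers on [0,1] and a gap t no larger than 1, the region where the sum exceeds n - t is a simplex with n legs of length t, so its volume (and hence the probability) is t^n/n!. A minimal sketch, where the helper name cornerProb is made up for illustration and is not a built-in MATLAB function:

```matlab
% Probability that the sum of n independent U(0,1) numbers exceeds n - t.
% Valid only for 0 <= t <= 1, so that the corner simplex fits inside the cube.
% cornerProb is a hypothetical helper name, not a MATLAB built-in.
cornerProb = @(n, t) t.^n ./ factorial(n);

p2  = cornerProb(2, 0.1)    % triangle in the unit square: 0.005
p3  = cornerProb(3, 0.1)    % tetrahedron in the unit cube: 0.1^3/6
p10 = cornerProb(10, 1)     % sum of 10 numbers exceeds 9: 1/factorial(10)
p31 = cornerProb(31, 1)     % sum of 31 numbers exceeds 30: 1/factorial(31)
```

Multiplying any of these by the sample size gives the expected number of hits, for example 1e7*p10 for the 10-dimensional run.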
Even if we massively relax the sum required, this is just not a common event that you are looking to find.
For example, suppose I relax the restriction a bit.
sum(sum(rand(1e7,31),2)>25)
ans =
0
So in a sample size of 1e7, the sum of 31 uniform unit random numbers never did exceed 25. How big does it get for a sample of that size? Perhaps surprisingly low.
max(sum(rand(1e7,31),2))
ans =
23.378
Actually not a surprise to me, not in the slightest.
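One way to see why, sketched here with a normal approximation that is an added assumption and not part of the runs above: the sum of n independent U(0,1) numbers has mean n/2 and variance n/12, so for n = 31 the sum is centered at 15.5 with a standard deviation of about 1.6. The approximation is crude this far out in the tail, so treat the result as an order of magnitude only.

```matlab
n  = 31;
mu = n/2;                      % mean of the sum of 31 U(0,1) numbers: 15.5
sd = sqrt(n/12);               % standard deviation of that sum, about 1.61
z  = (25 - mu)/sd;             % number of standard deviations from the mean to 25
pTail = 0.5*erfc(z/sqrt(2));   % normal-approximation estimate of P(sum > 25)
expectedHits = 1e7*pTail       % expected count in 1e7 samples, far below 1
```

By the same measure, the observed maximum of 23.378 sits roughly 4.9 standard deviations above the mean, which is about what a sample of 1e7 should produce.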
Now, let's try a large random sample given your bounds. Again, I'll create 10 million such events.
sample = lb + (ub - lb).*rand(1e7,31);
S = sort(sum(sample,2),'descend');
S(1:10)
ans =
18349
17607
17488
17488
17471
17437
17377
17373
17363
17362
So not very large, and that was for 10 million such samples. It gets even worse: you are then taking the floor of each number, which makes the sum a bit smaller yet.
Is it possible to get a sum near the top end? Yes, but it is not very likely. As long as the sampling is uniform, you would need massively huge sample sizes to get sums near the top end. If the sampling is not done uniformly, things change: you might choose a biased sampling scheme, where each variable is more likely to land near its upper bound. Then such a sum is entirely possible. This gets a bit more complex, of course.
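As an illustration only, here is one possible biased scheme; it is an assumption for the sake of example, not a specific method recommended in this answer. Raising a uniform number to the power 1/k gives a value on [0,1] with mean k/(k+1), so k can be chosen to make the expected sum hit a chosen target. The names target, f, k, u and biasedArray are hypothetical.

```matlab
lb=[56 65 21 21 16 7 31 32 35 32 17 40 79 114 104 69 56 41 16 16 11 9 10 19 35 36 19 29 34 14 19];
ub=[1171 1303 440 440 330 87 675 650 720 640 350 220 2200 2100 2100 1400 1139 837 405 220 240 269 220 730 710 400 600 661 150 105.5 330];

target = 18000;                         % desired average sum; must lie between sum(lb) and sum(ub)
f = (target - sum(lb))/sum(ub - lb);    % fraction of each range needed on average, 0 < f < 1
k = f/(1 - f);                          % chosen so that mean(rand.^(1/k)) = k/(k+1) = f
u = rand(1,numel(lb)).^(1/k);           % still in [0,1], but pushed toward 1 when k > 1
biasedArray = lb + u.*(ub - lb);        % every element stays within [lb, ub]
intArray = floor(biasedArray);
Pgen = sum(intArray)                    % scatters around target, slightly lower because of floor
```

The trade-off is that the elements are no longer uniformly distributed between lb and ub. Changing target to 12000 or 21000 shifts the typical sum accordingly, although individual runs will still vary around it.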

More Answers (1)

Torsten on 30 Nov 2018
Just copy the answer I gave correctly:
randomArray = lb + rand(1,31).*(ub-lb);
  3 Comments
Torsten on 30 Nov 2018
Edited: Torsten on 30 Nov 2018
The value you should expect for the sum of randomArray is sum(0.5*(lb+ub)) since you select random numbers between lb and ub with equal probability. This sum equals 11473.
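A rough sketch of how far a given target lies from that expected value, assuming the elements are independent and uniform as in the code above. The spread of the sum comes from the variance (ub-lb).^2/12 of each element; the helper name zFor is made up for illustration.

```matlab
lb=[56 65 21 21 16 7 31 32 35 32 17 40 79 114 104 69 56 41 16 16 11 9 10 19 35 36 19 29 34 14 19];
ub=[1171 1303 440 440 330 87 675 650 720 640 350 220 2200 2100 2100 1400 1139 837 405 220 240 269 220 730 710 400 600 661 150 105.5 330];

mu = sum(0.5*(lb + ub));             % expected sum, 11472.75
sd = sqrt(sum((ub - lb).^2)/12);     % standard deviation of the sum, roughly 1.4e3
zFor = @(target) (target - mu)/sd;   % hypothetical helper: distance from the mean in standard deviations
z = zFor([13000 18000 21000])
```

A sum near 13000 is only about one standard deviation above the mean, so it shows up regularly, while 18000 and 21000 are several standard deviations out and essentially never occur with uniform sampling.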
Afiqah Ismail on 30 Nov 2018
Can you explain more? I do not understand it. I want a random array whose sum is higher than 14000, because I need to create 3 arrays whose sums are 12000, 18000 and 21000. Is it possible?
