Thread Subject:
how to generate random variable with constraint?

Subject: how to generate random variable with constraint?

From: jay

Date: 20 Jul, 2010 01:53:03

Message: 1 of 33

I need to generate 5 random variables between [0,1], let's say a, b, c, d, e. The constraint is that a<b<c<d<e. How to make this happen? Please advise. thanks

Subject: how to generate random variable with constraint?

From: someone

Date: 20 Jul, 2010 02:05:06

Message: 2 of 33

"jay " <ssjzdl@gmail.com> wrote in message <i22vhv$frn$1@fred.mathworks.com>...
> I need to generate 5 random variables between [0,1], let's say a, b, c, d, e. The constraint is that a<b<c<d<e. How to make this happen? Please advise. thanks

doc rand
doc sort

It seems to me that the only way 5 random numbers wouldn't
satisfy that constraint is if 2 or more of them are equal.
A very unlikely event.
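
The rand-plus-sort suggestion above can be written out in a few lines. This is only a minimal sketch: the names v and isStrict are illustrative and not from the thread, and it assumes the five values are meant to be uniform on [0,1], which the question does not state.

```matlab
% Draw five independent uniform values on [0,1] and sort them ascending.
v = sort(rand(1,5));

% Deal the sorted values out to the five names used in the question.
a = v(1); b = v(2); c = v(3); d = v(4); e = v(5);

% Ties are extremely unlikely, but checking for strict ordering costs little.
isStrict = all(diff(v) > 0);

fprintf('a=%g b=%g c=%g d=%g e=%g (strictly increasing: %d)\n', ...
        a, b, c, d, e, isStrict);
```

If isStrict ever came back false, redrawing would be the simplest remedy.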

Subject: how to generate random variable with constraint?

From: Walter Roberson

Date: 20 Jul, 2010 04:00:45

Message: 3 of 33

someone wrote:
> "jay " <ssjzdl@gmail.com> wrote in message
> <i22vhv$frn$1@fred.mathworks.com>...
>> I need to generate 5 random variables between [0,1], let's say a, b,
>> c, d, e. The constraint is that a<b<c<d<e. How to make this happen?
>> Please advise. thanks
>
> doc rand
> doc sort
>
> It seems to me that the only way 5 random numbers wouldn't satisfy that
> constraint is if 2 or more of them are equal. A very unlikely event.

Roger Stafford has discussed some limitations on generating random
variables with constraints -- if it is done incorrectly, the values tend
to cluster towards the middle instead of uniformly distributed. I think
sorting was okay, but I'd have to find the previous threads to be sure.

Subject: how to generate random variable with constraint?

From: Roger Stafford

Date: 20 Jul, 2010 07:24:04

Message: 4 of 33

Walter Roberson <roberson@hushmail.com> wrote in message <Ob91o.39684$YX3.15357@newsfe18.iad>...
> Roger Stafford has discussed some limitations on generating random
> variables with constraints -- if it is done incorrectly, the values tend
> to cluster towards the middle instead of uniformly distributed. I think
> sorting was okay, but I'd have to find the previous threads to be sure.
- - - - - - - - - -
  I believe that "someone"'s indicated sort solution is perfectly valid, Walter. I'll admit that I paused a moment before coming to that conclusion, having the problem of random variables with a fixed sum in mind. However, I am convinced that both the solution to that problem and the sort method in this current problem do abide by the same basic principle.

  That principle is that when generating a combination of random variables with some constraint placed on them, the result should be distributed as though the same random variables, with the same statistical relationships, had been generated without constraints and every combination not satisfying the constraint had then been rejected. A simple enough principle.

  In the current problem, doing a sort(rand(5,1)) and then dealing them out in ascending order to a, b, c, d, and e is the perfect equivalent of (though more efficient than) rejecting all values of rand(5,1) which are not already in sorted order. That is because, if the five variables of rand(5,1) were separately rearranged in sorted order in each of the 5! = 120 simplexes of the five-dimensional [0,1]^5 cube and reassigned to a, b, c, d, and e, these would each then have identical statistical distributions throughout that simplex as far as these latter variables are concerned. The 119 that are rejected, if rearranged, are statistically equivalent to the one that is accepted.

  I won't attempt to explain in detail what is involved in the fixed sum random variable generation except to say that one first imagines a small tolerance within which the sum is allowed to vary. One can envision a distribution within these tight limits which would simulate the rejection procedure without doing any rejecting. Finally, if this tolerance is allowed to approach zero, the required distribution can be shown to also approach a limit, and it is this limit that can be used in generating such random variables. In that sense this fixed sum generation also adheres to the above principle.

Roger Stafford
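
The equivalence between sorting and rejection described above can be checked empirically, at least as far as the means go. The following is a rough sketch, not a proof: N and the variable names are arbitrary choices, and it compares only column means against the known mean k/6 of the k-th smallest of 5 independent uniform values.

```matlab
N = 600000;                                   % number of raw 5-tuples (arbitrary)
X = rand(N,5);                                % unconstrained draws, one tuple per row

% Rejection method: keep only the rows that already satisfy a<b<c<d<e.
alreadySorted = all(diff(X,1,2) > 0, 2);
meanRejection = mean(X(alreadySorted,:), 1);

% Sort method: reorder every row instead of throwing rows away.
meanSort = mean(sort(X,2), 1);

% The k-th order statistic of 5 uniform values has mean k/(5+1).
meanTheory = (1:5)/6;

disp([meanRejection; meanSort; meanTheory]);  % the three rows should roughly agree
```

The rejection row is noisier simply because only about one row in 120 survives.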

Subject: how to generate random variable with constraint?

From: John D'Errico

Date: 20 Jul, 2010 11:17:20

Message: 5 of 33

"Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <i23iuk$lng$1@fred.mathworks.com>...
> Walter Roberson <roberson@hushmail.com> wrote in message <Ob91o.39684$YX3.15357@newsfe18.iad>...
> > Roger Stafford has discussed some limitations on generating random
> > variables with constraints -- if it is done incorrectly, the values tend
> > to cluster towards the middle instead of uniformly distributed. I think
> > sorting was okay, but I'd have to find the previous threads to be sure.
> - - - - - - - - - -
> I believe that "someone"'s indicated sort solution is perfectly valid, Walter. I'll admit that I paused a moment before coming to that conclusion, having the problem of random variables with a fixed sum in mind.

I'd agree with Roger here, although I too had to think
about it to be confident. The constraint here is not a
difficult one to satisfy, and the sort is an adequate
solution.

You can view this problem differently from finding a
list of 5 values that are sorted in increasing order.
Instead, you can view it as finding a SINGLE point,
the coordinates of which represent a point that lives
in a specific 5-dimensional simplex. In the end though,
the sort is valid.

John

Subject: how to generate random variable with constraint?

From: Matt J

Date: 20 Jul, 2010 15:20:08

Message: 6 of 33

"jay " <ssjzdl@gmail.com> wrote in message <i22vhv$frn$1@fred.mathworks.com>...
> I need to generate 5 random variables between [0,1], let's say a, b, c, d, e. The constraint is that a<b<c<d<e. How to make this happen? Please advise. thanks
==========

A=cumsum(rand(1,5))/5;

Subject: how to generate random variable with constraint?

From: Walter Roberson

Date: 20 Jul, 2010 15:23:20

Message: 7 of 33

Matt J wrote:
> "jay " <ssjzdl@gmail.com> wrote in message
> <i22vhv$frn$1@fred.mathworks.com>...
>> I need to generate 5 random variables between [0,1], let's say a, b,
>> c, d, e. The constraint is that a<b<c<d<e. How to make this happen?
>> Please advise. thanks
> ==========
>
> A=cumsum(rand(1,5))/5;

I'm relatively sure that Roger showed in an earlier thread that that
approach produces biased results.

Subject: how to generate random variable with constraint?

From: Matt J

Date: 20 Jul, 2010 15:35:19

Message: 8 of 33

Walter Roberson <roberson@hushmail.com> wrote in message <Jbj1o.16614$Bh2.16201@newsfe04.iad>...

> >
> > A=cumsum(rand(1,5))/5;
>
> I'm relatively sure that Roger showed in an earlier thread that that
> approach produces biased results.
================

Did the OP require that there not be bias?

Subject: how to generate random variable with constraint?

From: Matt J

Date: 20 Jul, 2010 15:58:07

Message: 9 of 33

Walter Roberson <roberson@hushmail.com> wrote in message <Jbj1o.16614$Bh2.16201@newsfe04.iad>...

> >
> > A=cumsum(rand(1,5))/5;
>
> I'm relatively sure that Roger showed in an earlier thread that that
> approach produces biased results.
=======

Another possibility. Still biased?

A=cumsum(rand(1,6));
A=A/sum(A);
A(end)=[];

Subject: how to generate random variable with constraint?

From: Walter Roberson

Date: 20 Jul, 2010 16:00:55

Message: 10 of 33

Matt J wrote:
> Walter Roberson <roberson@hushmail.com> wrote in message
> <Jbj1o.16614$Bh2.16201@newsfe04.iad>...
>
>> > > A=cumsum(rand(1,5))/5;
>>
>> I'm relatively sure that Roger showed in an earlier thread that that
>> approach produces biased results.
> ================
>
> Did the OP require that there not be bias?

You are correct, the OP placed no such restriction in the question,
including not requiring that the random numbers be drawn from a uniform
random distribution.

In terms of what the OP demonstrably asked for, the following would also
be valid:

((0:4) + rand(1,5)) ./ 5

There was, however, some ambiguity in the OP's phrasing.
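
The expression above can be checked numerically. In this minimal sketch the names W, N, alwaysOrdered and maxFirst are illustrative only. It confirms that every row comes out strictly increasing, and it also shows that the first value never leaves the interval (0, 0.2), so this is not a uniform sample of all ordered 5-tuples.

```matlab
N = 100000;                               % number of 5-tuples (arbitrary)

% One random value inside each fifth of [0,1]: ((0:4) + rand) / 5 per row.
W = bsxfun(@plus, 0:4, rand(N,5)) ./ 5;

% Each column stays inside its own fifth, so rows are always increasing.
alwaysOrdered = all(all(diff(W,1,2) > 0));

% The first variable can never reach 0.2, although a<b<c<d<e would allow it.
maxFirst = max(W(:,1));

fprintf('always ordered: %d, largest first value: %g\n', alwaysOrdered, maxFirst);
```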

Subject: how to generate random variable with constraint?

From: Matt J

Date: 20 Jul, 2010 16:17:04

Message: 11 of 33

Walter Roberson <roberson@hushmail.com> wrote in message <YKj1o.92592$Lj2.82402@newsfe05.iad>...

> You are correct, the OP placed no such restriction in the question,
> including not requiring that the random numbers be drawn from a uniform
> random distribution.
===========

Assuming he was, though, I'm still wondering if the following would do it. For larger numbers of variables, it would be good to have a way of doing this without using sort().

A=cumsum(rand(1,6)); A=A/sum(A); A(end)=[];

Subject: how to generate random variable with constraint?

From: Walter Roberson

Date: 20 Jul, 2010 16:24:10

Message: 12 of 33

Matt J wrote:
> Walter Roberson <roberson@hushmail.com> wrote in message
> <YKj1o.92592$Lj2.82402@newsfe05.iad>...
>
>> You are correct, the OP placed no such restriction in the question,
>> including not requiring that the random numbers be drawn from a
>> uniform random distribution.
> ===========
>
> Assuming he was, though, I'm still wondering if the following would do
> it. For larger numbers of variables, it would be good to have a way of
> doing this without using sort().
>
> A=cumsum(rand(1,6)); A=A/sum(A); A(end)=[];

I would tend to doubt that that would work to generate well-distributed
points on the simplex. The fundamental problem with using the sum
approach is that even though any one A(K) value is independent, as you
add them together, the sum approaches the normal distribution, as per
the Central Limit Theorem, and so the generated A vectors would tend to
cluster towards the centroid of the simplex. I don't see at the moment
how generating an extra value and discarding would resolve that problem.

Subject: how to generate random variable with constraint?

From: someone

Date: 20 Jul, 2010 16:37:04

Message: 13 of 33

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <i24i60$gfp$1@fred.mathworks.com>...
> Walter Roberson <roberson@hushmail.com> wrote in message <YKj1o.92592$Lj2.82402@newsfe05.iad>...
>
> > You are correct, the OP placed no such restriction in the question,
> > including not requiring that the random numbers be drawn from a uniform
> > random distribution.
> ===========
>
> Assuming he was, though, I'm still wondering if the following would do it. For larger numbers of variables, it would be good to have a way of doing this without using
> sort().
>
> A=cumsum(rand(1,6)); A=A/sum(A); A(end)=[];

Wow, I have to admit that I didn't put a lot of thought into my initial solution.
I simply reasoned that the constraint that a<b<c<d<e was really no constraint at all.
Using sort was (in my mind) just a way of "relabeling" the a, b, c, d, & e variables.
The only "gotcha" would be if rand returned an equality (which seemed like
a pretty unlikely event with an "easy" fix). Did I miss something?

Subject: how to generate random variable with constraint?

From: Matt J

Date: 20 Jul, 2010 16:59:04

Message: 14 of 33

Walter Roberson <roberson@hushmail.com> wrote in message <K4k1o.93283$Lj2.50698@newsfe05.iad>...

> I would tend to doubt that that would work to generate well-distributed
> points on the simplex. The fundamental problem with using the sum
> approach is that even though any one A(K) value is independent, as you
> add them together, the sum approaches the normal distribution, as per
> the Central Limit Theorem, and so the generated A vectors would tend to
> cluster towards the centroid of the simplex. I don't see at the moment
> how generating an extra value and discarding would resolve that problem.
==================

I'm not really seeing that argument. We have A/sum(A).
The central limit theorem says that A(end) ---> (randn+.5)*sqrt(N)
The law of large numbers says that sum(A)---> 0.5*N

This means that the A(end)/sum(A)--->0 as N-->inf, as we would expect it to.

Subject: how to generate random variable with constraint?

From: John D'Errico

Date: 20 Jul, 2010 17:10:20

Message: 15 of 33

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <i24i60$gfp$1@fred.mathworks.com>...
> Walter Roberson <roberson@hushmail.com> wrote in message <YKj1o.92592$Lj2.82402@newsfe05.iad>...
>
> > You are correct, the OP placed no such restriction in the question,
> > including not requiring that the random numbers be drawn from a uniform
> > random distribution.
> ===========
>
> Assuming he was, though, I'm still wondering if the following would do it. For larger numbers of variables, it would be good to have a way of doing this without using
> sort().
>
> A=cumsum(rand(1,6)); A=A/sum(A); A(end)=[];

This is indeed massively biased! To convince yourself
that it does not produce a random sampling, or even
the correct sampling of the required domain, try it in
2 dimensions!

n = 10000;
A = cumsum(rand(n,3),2);
A = bsxfun(@rdivide,A,sum(A,2));
A(:,3) = [];
plot(A(:,1),A(:,2),'.')

The domain of interest here SHOULD be a triangle,
but not the one shown. Instead, try this:

B = sort(rand(n,2),2);
plot(B(:,1),B(:,2),'.')

I don't even see the sort as more complex, nor does
MATLAB. Try this:

n = 1000000;
tic
A = cumsum(rand(n,6),2);
A = bsxfun(@rdivide,A,sum(A,2));
A(:,6) = [];
toc
Elapsed time is 0.219178 seconds.

tic
B = sort(rand(n,5),2);
toc
Elapsed time is 0.163978 seconds.

See that the sort took LESS time than the cumsum,
and the sort is verifiably correct.

John

Subject: how to generate random variable with constraint?

From: Roger Stafford

Date: 20 Jul, 2010 17:53:04

Message: 16 of 33

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <i24i60$gfp$1@fred.mathworks.com>...
> Assuming he was, though, I'm still wondering if the following would do it. For larger numbers of variables, it would be good to have a way of doing this without using
> sort().
>
> A=cumsum(rand(1,6)); A=A/sum(A); A(end)=[];
- - - - - - - - -
  Matt, I'm in agreement with John and Walter on this. Your two solutions are not good ones to give as an answer even if Jay did not give specific details as to the desired distribution.

  For example in your second method with five variables it is impossible for the fifth one to exceed 1/2 even though doing so would easily be compatible with Jay's constraint.

  However, just as important, the distribution within the domain which is used is exceedingly non-uniform. If you were to show Jay a plot for the two variable case, I am sure he would not be happy with it. The area used is actually less than half of that which would be allowed by the constraint, with everything having to lie below the line a+2*b = 1 (the discarded third value, 1-a-b, can never be smaller than b), and the density drops off to zero at the [0,0] corner.

Roger Stafford
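
The point about the fifth variable can be seen in a quick experiment. This is a sketch under the assumption that "second method" means cumsum(rand(1,6)) divided by sum(A) with the last element dropped, as posted earlier in the thread. N and the variable names are arbitrary.

```matlab
N = 100000;                              % number of 5-tuples (arbitrary)

% Second method from the thread: running sums, divided by the sum of the
% running sums, with the sixth column discarded.
A = cumsum(rand(N,6), 2);
A = bsxfun(@rdivide, A, sum(A,2));
A(:,6) = [];

% Sort method for comparison.
B = sort(rand(N,5), 2);

% The divisor is at least A5 + A6 >= 2*A5, so the fifth value stays below 1/2.
maxFifthCumsum = max(A(:,5));
maxFifthSort   = max(B(:,5));

fprintf('largest e: cumsum method %g, sort method %g\n', ...
        maxFifthCumsum, maxFifthSort);
```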

Subject: how to generate random variable with constraint?

From: Matt J

Date: 20 Jul, 2010 18:06:04

Message: 17 of 33

"John D'Errico" <woodchips@rochester.rr.com> wrote in message <i24l9s$auq$1@fred.mathworks.com>...

> I don't even see the sort as more complex, nor does
> MATLAB. Try this:
===========

That's only because you're working in 2 dimensions. In higher dimensions, complexity theory of sorting vs. summation guarantees that sorting will do worse. Try this modification:


n = 100;
data=rand(n,60000);
tic
A = cumsum(data,2);
A = bsxfun(@rdivide,A,sum(A,2));
A(:,end) = [];
toc
%Elapsed time is 0.141225 seconds.


data=rand(n,59999);
tic
B = sort(data,2);
toc
%Elapsed time is 0.493895 seconds.

Subject: how to generate random variable with constraint?

From: Matt J

Date: 20 Jul, 2010 18:23:20

Message: 18 of 33

"Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <i24nq0$oat$1@fred.mathworks.com>...

> - - - - - - - - -
> Matt, I'm in agreement with John and Walter on this. Your two solutions are not good ones to give as an answer even if Jay did not give specific details as to the desired distribution.
>
> For example in your second method with five variables it is impossible for the fifth one to exceed 1/2 even though doing so would easily be compatible with Jay's constraint.
=========

Roger- You're right. I had a mistake. What I really meant to give was this:

A=cumsum(rand(1,6)); A=A/A(end); A(end)=[];

I reran John's 2D test on this and find that it covers the correct triangular area, though slightly less uniformly than the sorting method.

Again, though, for me, this was all just an exercise in seeing if we could get something nearly as good using cheaper summations instead of sorting.

Subject: how to generate random variable with constraint?

From: Roger Stafford

Date: 20 Jul, 2010 18:29:21

Message: 19 of 33

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <i24oic$env$1@fred.mathworks.com>...
> "John D'Errico" <woodchips@rochester.rr.com> wrote in message <i24l9s$auq$1@fred.mathworks.com>...
>
> > I don't even see the sort as more complex, nor does
> > MATLAB. Try this:
> ===========
>
> That's only because you're working in 2 dimensions. In higher dimensions, complexity theory of sorting vs. summation guarantees that sorting will do worse. Try this modification:
>
>
> n = 100;
> data=rand(n,60000);
> tic
> A = cumsum(data,2);
> A = bsxfun(@rdivide,A,sum(A,2));
> A(:,end) = [];
> toc
> %Elapsed time is 0.141225 seconds.
>
>
> data=rand(n,59999);
> tic
> B = sort(data,2);
> toc
> %Elapsed time is 0.493895 seconds.
- - - - - - - - -
  Matt, you shouldn't give people solutions that are distinctly inferior just because their code would run faster. That is putting too much emphasis on speed. Quality counts too. Both John and I have already shown you how far off the mark the plot of the two-variable case would be for your solution. It doesn't get any better as the number of variables increases where the speed difference would become significant.

Roger Stafford

Subject: how to generate random variable with constraint?

From: John D'Errico

Date: 20 Jul, 2010 18:40:34

Message: 20 of 33

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <i24oic$env$1@fred.mathworks.com>...
> "John D'Errico" <woodchips@rochester.rr.com> wrote in message <i24l9s$auq$1@fred.mathworks.com>...
>
> > I don't even see the sort as more complex, nor does
> > MATLAB. Try this:
> ===========
>
> That's only because you're working in 2 dimensions. In higher dimensions, complexity theory of sorting vs. summation guarantees that sorting will do worse. Try this modification:
>

No, if you bothered to look at my test, it was done
in 5 dimensions. Yes, if you solve a very different
problem from that which was asked about, the time
will be different.

John

Subject: how to generate random variable with constraint?

From: Matt J

Date: 20 Jul, 2010 18:50:20

Message: 21 of 33

"John D'Errico" <woodchips@rochester.rr.com> wrote in message <i24qj2$pi6$1@fred.mathworks.com>...

>
> No, if you bothered to look at my test, it was done
> in 5 dimensions. Yes, if you solve a very different
> problem from that which was asked about, the time
> will be different.
========

Then we have no disagreement.

In case my earlier posts weren't clear, I'm no longer all that concerned with the specific case raised by the OP (no offense, Jay). I'm more interested in how we might do this more cheaply if we wanted to do it in higher dimensions (and why wouldn't we?).

Since the OP already has been given at least one solution that will work, I think it's not unfair to allow the thread to stray to related tangents...

Subject: how to generate random variable with constraint?

From: Roger Stafford

Date: 20 Jul, 2010 19:06:19

Message: 22 of 33

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <i24pio$k5f$1@fred.mathworks.com>...
> Roger- You're right. I had a mistake. What I really meant to give was this:
>
> A=cumsum(rand(1,6)); A=A/A(end); A(end)=[];
>
> I reran John's 2D test on this and find that it covers the correct triangular area, though slightly less uniformly than the sorting method.
>
> Again, though, for me, this was all just an exercise in seeing if we could get something nearly as good using cheaper summations instead of sorting.
- - - - - - - - - - -
  Matt I checked out the two-dimensional plot for your revised code:

 A=cumsum(rand(1,6)); A=A/A(end); A(end)=[];

It does cover the correct triangle. However it is grossly inaccurate to say that it is only "slightly less uniform than the sorting method." The probability area density actually drops down to zero at each corner of that triangle, whereas in a good solution it ought to be a uniform plateau throughout the entire triangle's area. This disparity would continue to worsen as the number of variables increases, though unfortunately it is difficult to illustrate this fact with plots.

Roger Stafford
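
One rough way to see the thinning near a corner without a plot is to count how many points land close to [0,0] in the two-variable case. This sketch assumes the revised method is cumsum(rand(1,3)) divided by its last element. The corner size 0.1, N, and the variable names are arbitrary choices.

```matlab
N = 200000;                                   % number of points (arbitrary)

% Revised method in two variables: divide running sums by the last one.
C = cumsum(rand(N,3), 2);
A = bsxfun(@rdivide, C(:,1:2), C(:,3));       % columns are a and b

% Sort method for comparison.
B = sort(rand(N,2), 2);

% Since a<b, the condition b < corner selects the region next to [0,0].
corner = 0.1;
fracCumsum = mean(A(:,2) < corner);
fracSort   = mean(B(:,2) < corner);           % uniform triangle gives about 0.01

fprintf('fraction near [0,0]: cumsum %g, sort %g\n', fracCumsum, fracSort);
```

A uniform density over the triangle puts about 1% of the points in that corner. The cumsum-based version should come out noticeably lower.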

Subject: how to generate random variable with constraint?

From: Matt J

Date: 20 Jul, 2010 19:09:06

Message: 23 of 33

"Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <i24pu1$d60$1@fred.mathworks.com>...

> Matt, you shouldn't give people solutions that are distinctly inferior just because their code would run faster.
================

Roger, see my revision in Message #18. It seems to be a better contender.

In any case, yes, I would hate for Jay to walk away without knowing the limitations of the solutions we propose, but I'm still feeling my way through it myself.

Even if I haven't figured out exactly how, it seems distinctly intuitive that we should be able to derive this with cumsum, because the jumps between a<b<c<d, etc. form a positive-valued Markov process (like cumsum(rand(1,N))). So you would think it possible to derive the solution from this.

Should I be brainstorming out loud on the NG? Debatable, but I've seen lots of people here do it...

Subject: how to generate random variable with constraint?

From: Matt J

Date: 20 Jul, 2010 19:40:22

Message: 24 of 33

"Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <i24s3b$58o$1@fred.mathworks.com>...

> It does cover the correct triangle. However it is grossly inaccurate to say that it is only "slightly less uniform than the sorting method."
==============

I was going by an eyeball assessment of John's plots. Those plots don't give a full picture of the distribution, but only salt-and-pepper sampling patterns (which were slightly more salty than peppery for the cumsum method).


> The probability area density actually drops down to zero at each corner of that triangle, whereas in a good solution it ought to be a uniform plateau throughout the entire triangle's area. This disparity would continue to worsen as the number of variables increases, though unfortunately it is difficult to illustrate this fact with plots.
===========

So you're saying it's more like a Gaussian distribution over the triangle? That's strange. However, surely this is a much more reasonable contender than my earlier version, considering (a) that Jay never said whether he was interested in a uniform or a Gaussian distribution and (b) that it's more efficient to generate.

 

Subject: how to generate random variable with constraint?

From: Roger Stafford

Date: 20 Jul, 2010 23:49:05

Message: 25 of 33

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <i24u36$g1a$1@fred.mathworks.com>...
> ............
> So you're saying it's more like a Gaussian distribution over the triangle? That's strange. However, surely this is a much more reasonable contender than my earlier version, considering (a) that Jay never said whether he was interested in a uniform or a Gaussian distribution and (b) that it's more efficient to generate.
- - - - - - - - -
  Yes I agree your revised version is a more reasonable contender than the previous one, Matt. I used your earlier code blindly but should have realized that you surely meant something else.

  This later version is rather similar in a sense to the solutions that have been given often in this group for the problem of n random variables with a predetermined sum which I mentioned earlier, where n rand's are taken and then each is divided by their sum and multiplied by the desired sum value. Both techniques tend to concentrate values in the center regions at the expense of the outer regions - that is, disproportionately to the n-dimensional volumes of those regions. And yes, for large n they begin to approach gaussian distributions (the central limit theorem at work again.)

  For that reason such methods don't satisfy the principle I mentioned earlier of generating the variables in such a manner that they are equivalent, statistically speaking, to a process that generates the variables without constraints but then rejects all that don't satisfy the constraints.

Roger Stafford

Subject: how to generate random variable with constraint?

From: Matt J

Date: 21 Jul, 2010 10:02:10

Message: 26 of 33

"Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <i25clh$jhi$1@fred.mathworks.com>...

> This later version is rather similar in a sense to the solutions that have been given often in this group for the problem of n random variables with a predetermined sum which I mentioned earlier, where n rand's are taken and then each is divided by their sum and multiplied by the desired sum value. Both techniques tend to concentrate values in the center regions at the expense of the outer regions - that is, disproportionately to the n-dimensional volumes of those regions. And yes, for large n they begin to approach gaussian distributions (the central limit theorem at work again.)
=======

As I was saying to Walter, I really don't follow that reasoning. If we're now talking about

Z=rand(1,N);
X=Z/sum(Z);

it is quite clear that each X(i)--->0 as N--->inf. Also, X does not result from the summation of i.i.d random vectors for any N, so I don't see how the Central Limit Theorem would imply that the elements X(i) tend to be jointly Gaussian as N-->inf.

Subject: how to generate random variable with constraint?

From: Roger Stafford

Date: 21 Jul, 2010 19:47:04

Message: 27 of 33

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <i26gj2$s3o$1@fred.mathworks.com>...
> "Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <i25clh$jhi$1@fred.mathworks.com>...
>
> > This later version is rather similar in a sense to the solutions that have been given often in this group for the problem of n random variables with a predetermined sum which I mentioned earlier, where n rand's are taken and then each is divided by their sum and multiplied by the desired sum value. Both techniques tend to concentrate values in the center regions at the expense of the outer regions - that is, disproportionately to the n-dimensional volumes of those regions. And yes, for large n they begin to approach gaussian distributions (the central limit theorem at work again.)
> =======
>
> As I was saying to Walter, I really don't follow that reasoning. If we're now talking about
>
> Z=rand(1,N);
> X=Z/sum(Z);
>
> it is quite clear that each X(i)--->0 as N--->inf. Also, X does not result from the summation of i.i.d random vectors for any N, so I don't see how the Central Limit Theorem would imply that the elements X(i) tend to be jointly Gaussian as N-->inf.
- - - - - - - - - - -
  As I understand it, Matt, the argument would go something like this. In your notation, for each individual Z(i), when it is divided by sum(Z) for N > 1, that affects the distribution of the resulting X(i). It no longer possesses its original uniform distribution on [0,1] (or whatever distribution it might have had.) Theoretically it can still range from 0 to 1 but statistically it is crowded more and more closely in towards 0 for increasing N. Its theoretical mean and variance can be calculated as a function of N. If we were to translate and rescale it so as to have mean zero and variance one, its distribution would begin to resemble more and more the bell-shaped curve of a standard normal distribution as N increases - and yes stretching out towards infinity in both plus and minus directions. And this is actually independent of whatever distribution the original Z's possessed, assuming they were independent. So apparently says the mysterious central limit theorem in one of its numerous manifestations (please don't ask me which one.) You will note that this is not in contradiction with the fact that the mean value of X(i) must itself approach zero as N approaches infinity. It just says that if you continue to shift and rescale it so as to match standard normal in mean and variance for each N, then the rest of the distribution curve will also approach normality. The CLT is a remarkable theorem.

Roger Stafford

Subject: how to generate random variable with constraint?

From: Paulo

Date: 22 Jul, 2010 10:44:04

Message: 28 of 33

"jay " <ssjzdl@gmail.com> wrote in message <i22vhv$frn$1@fred.mathworks.com>...
> I need to generate 5 random variables between [0,1], let's say a, b, c, d, e. The constraint is that a<b<c<d<e. How to make this happen? Please advise. thanks


while true % repeat until a draw satisfies the constraint
a=rand();b=rand();c=rand();d=rand();e=rand(); % put random values in the variables
if((a<b) && (b<c) && (c<d) && (d<e)) , break ,end % test if constraint is true
% if constraint is true then stop the loop, otherwise draw again
end
fprintf('a=%g b=%g c=%g d=%g e=%g\n',a,b,c,d,e); % just to show the values

Subject: how to generate random variable with constraint?

From: John D'Errico

Date: 22 Jul, 2010 13:27:07

Message: 29 of 33

"Paulo " <paulojmdsilva@gmail.com> wrote in message <i297dj$bd7$1@fred.mathworks.com>...
> "jay " <ssjzdl@gmail.com> wrote in message <i22vhv$frn$1@fred.mathworks.com>...
> > I need to generate 5 random variables between [0,1], let's say a, b, c, d, e. The constraint is that a<b<c<d<e. How to make this happen? Please advise. thanks
>
>
> while true % repeat until a draw satisfies the constraint
> a=rand();b=rand();c=rand();d=rand();e=rand(); % put random values in the variables
> if((a<b) && (b<c) && (c<d) && (d<e)) , break ,end % test if constraint is true
> % if constraint is true then stop the loop, otherwise draw again
> end
> fprintf('a=%g b=%g c=%g d=%g e=%g\n',a,b,c,d,e); % just to show the values

This is a rejection method, quite an inefficient way to
do the same thing as a sort on this problem. Since
there are 5! = 120 ways to order a set of 5 numbers,
only ONE of which is sorted, this dumps its results into
the bit bucket more than 99% of the time.

119/120
ans =
         0.991666666666667

It makes far more sense to generate 5 numbers (as a
vector, in ONE operation) and then use sort on them.

sort(rand(1,5))

The statistics of this result are the same in the end, yet
it wastes far less cpu time to do the operation.

John
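
The cost of the rejection loop can be estimated by simulation. Five distinct values can be arranged in 120 equally likely orders (5 factorial), so only about one draw in 120 should already be sorted. A minimal sketch, with N and the variable names chosen arbitrarily:

```matlab
N = 1200000;                              % number of trial 5-tuples (arbitrary)
X = rand(N,5);                            % one unconstrained draw per row

% A row is accepted by the rejection loop only if it is already increasing.
accepted = all(diff(X,1,2) > 0, 2);

acceptRate   = mean(accepted);            % observed fraction of accepted rows
expectedRate = 1/factorial(5);            % one ordering out of 120

fprintf('observed %.5f, expected %.5f (about %.0f draws per accepted tuple)\n', ...
        acceptRate, expectedRate, 1/expectedRate);
```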

Subject: how to generate random variable with constraint?

From: Paulo

Date: 22 Jul, 2010 14:43:20

Message: 30 of 33

"John D'Errico" <woodchips@rochester.rr.com> wrote in message <i29gvb$cs2$1@fred.mathworks.com>...
> "Paulo " <paulojmdsilva@gmail.com> wrote in message <i297dj$bd7$1@fred.mathworks.com>...
> > "jay " <ssjzdl@gmail.com> wrote in message <i22vhv$frn$1@fred.mathworks.com>...
> > > I need to generate 5 random variables between [0,1], let's say a, b, c, d, e. The constraint is that a<b<c<d<e. How to make this happen? Please advise. thanks
> >
> >
> > while true % repeat until a draw satisfies the constraint
> > a=rand();b=rand();c=rand();d=rand();e=rand(); % put random values in the variables
> > if((a<b) && (b<c) && (c<d) && (d<e)) , break ,end % test if constraint is true
> > % if constraint is true then stop the loop, otherwise draw again
> > end
> > fprintf('a=%g b=%g c=%g d=%g e=%g\n',a,b,c,d,e); % just to show the values
>
> This is a rejection method, quite an inefficient way to
> do the same thing as a sort on this problem. Since
> there are 5! = 120 ways to order a set of 5 distinct numbers,
> only ONE of which is sorted, this dumps its results into
> the bit bucket more than 99% of the time.
>
> 119/120
> ans =
> 0.991666666666667
>
> It makes far more sense to generate 5 numbers (as a
> vector, in ONE operation) and then use sort on them.
>
> sort(rand(1,5))
>
> The statistics of this result are the same in the end, yet
> it wastes far less cpu time to do the operation.
>
> John

That's a good point, but unless he wants to create those variables many, many times it doesn't matter much. Also, there's no problem when variables get the same value. My code is simple and doesn't require any fancy manipulations or functions.

Subject: how to generate random variable with constraint?

From: someone

Date: 22 Jul, 2010 16:26:04

Message: 31 of 33

"Paulo " <paulojmdsilva@gmail.com> wrote in message <i29le8$2pa$1@fred.mathworks.com>...
> "John D'Errico" <woodchips@rochester.rr.com> wrote in message <i29gvb$cs2$1@fred.mathworks.com>...
> > "Paulo " <paulojmdsilva@gmail.com> wrote in message <i297dj$bd7$1@fred.mathworks.com>...
> > > "jay " <ssjzdl@gmail.com> wrote in message <i22vhv$frn$1@fred.mathworks.com>...
> > > > I need to generate 5 random variables between [0,1], let's say a, b, c, d, e. The constraint is that a<b<c<d<e. How to make this happen? Please advise. thanks
> > >
> > >
> > > while(1) %infinite loop
> > > a=rand();b=rand();c=rand();d=rand();e=rand(); %put random values on variables
> > > if((a<b) && (b<c) && (c<d) && (d<e)) , break ,end %test if constraint is true
> > > %if constraint is true then stop the loop
> > > end
> > > fprintf('a=%g b=%g c=%g d=%g e=%g\n',a,b,c,d,e); % just to show the values
> >
> > This is a rejection method, quite an inefficient way to
> > do the same thing as a sort on this problem. Since
> > there are 5! = 120 ways to order a set of 5 distinct numbers,
> > only ONE of which is sorted, this dumps its results into
> > the bit bucket more than 99% of the time.
> >
> > 119/120
> > ans =
> > 0.991666666666667
> >
> > It makes far more sense to generate 5 numbers (as a
> > vector, in ONE operation) and then use sort on them.
> >
> > sort(rand(1,5))
> >
> > The statistics of this result are the same in the end, yet
> > it wastes far less cpu time to do the operation.
> >
> > John
>
> That's a good point but unless he wants to create those variables many many times it doesn't matter much,

This is a little out of my expertise, but it COULD matter. I believe your code is an example of "indefinite postponement". It is possible you might go through your while loop an infinite number of times before you finally find a solution that breaks out of the loop.

When John gave the fraction of draws that get thrown away (correct me if I'm wrong), I believe that is an average number. There is no guarantee.

> also there's no problem with variables when they got the same value, my code is simple and doesn't require any fancy manipulations or functions.

Do you really think the while loop is simpler than the one-liner solution?

Subject: how to generate random variable with constraint?

From: Walter Roberson

Date: 22 Jul, 2010 18:43:22

Message: 32 of 33

someone wrote:
> "Paulo " <paulojmdsilva@gmail.com> wrote in message
> <i29le8$2pa$1@fred.mathworks.com>...

>> > > while(1) %infinite loop
>> > > a=rand();b=rand();c=rand();d=rand();e=rand(); %put random values on variables
>> > > if((a<b) && (b<c) && (c<d) && (d<e)) , break ,end %test if constraint is true
>> > > %if constraint is true then stop the loop
>> > > end
>> > > fprintf('a=%g b=%g c=%g d=%g e=%g\n',a,b,c,d,e); % just to show the values

> I believe
> your code is an example of "indefinite postponement". It is possible
> you might go through your while loop an infinite number of times before
> you finally find a solution that breaks out of the loop.
> When John gave the fraction of draws that get thrown away (correct me
> if I'm wrong), I believe that is an average number. There is no guarantee.

This is a situation of a geometric distribution with probability of success p
= 1/120 (only one of the 5! = 120 orderings is sorted). The "expected value" of
such a distribution is success on the 1/p'th trial (that is, the 120th trial in
this case), but yes, it does have an infinite tail. An infinite tail will occur
for all rejection methods whose trials are independent and of finite non-unitary
probability.
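
To make the geometric-distribution argument concrete, here is a minimal sketch that repeats the rejection loop many times and compares the average number of trials with 1/p. The repetition count nRuns and the variable names are assumptions chosen for illustration.

```matlab
% Count how many draws the rejection loop needs, repeated nRuns times.
p = 1/factorial(5);          % chance that 5 independent draws come out sorted
nRuns = 2000;                % assumed number of repetitions
trials = zeros(nRuns, 1);
for k = 1:nRuns
    n = 0;
    while true
        n = n + 1;
        x = rand(1, 5);
        if all(diff(x) > 0), break, end   % accept when a<b<c<d<e
    end
    trials(k) = n;
end
fprintf('mean trials = %.1f, expected 1/p = %.1f\n', mean(trials), 1/p);
fprintf('longest run = %d trials\n', max(trials));
```

The mean settles near 1/p, while the longest run is typically several times larger, which is the long tail in practice.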

The sort method would succeed in any situation in which no two random values
were the same. The probability of identical random values within N trials
depends substantially upon the details of the pseudo-random number generator.
The old linear congruential generator could not produce duplicate numbers in
fewer than 2^32 trials. I do not know how to analyze the subtract-with-borrow
generator. The Mersenne Twister generator has been shown to be
equidistributed in up to 623 dimensions (I think the number is); I
calculate that the probability of failure for it would be
99035202923775336283058995157/42535295785889145473997720539375861762
which is about 2.3E-9, leading to an average number of trials of 1 + 2.3E-9
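
The order of magnitude can be reproduced with a birthday-problem style calculation. This is only a sketch under a stated assumption: each draw is taken to be uniform over m = 2^32 equally likely values, which may not match the actual resolution of any particular generator.

```matlab
% Probability that at least two of n draws coincide, given m possible values.
m = 2^32;                               % assumed number of distinct outputs
n = 5;                                  % number of values drawn
k = 0:n-1;
% P(no duplicates) = prod(1 - k/m); use log1p/expm1 to keep precision.
pDup = -expm1(sum(log1p(-k/m)));
pApprox = nchoosek(n, 2)/m;             % first-order estimate: pairs / m
fprintf('P(duplicate) = %.4g (approx %.4g)\n', pDup, pApprox);
```

Under that assumption the result is about 2.3E-9, in line with the figure quoted above.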

Subject: how to generate random variable with constraint?

From: Siddhartha

Date: 27 Mar, 2013 21:09:05

Message: 33 of 33

I have a similar question.

I have five random variables. They are discrete and can take a 'low', 'base' or 'high' value between 15 and 35. The only condition is that their sum never exceeds 98. It is one thing to scale the sum to fit the given range, but how do I simulate the individual variables for this?
