Discover MakerZone

MATLAB and Simulink resources for Arduino, LEGO, and Raspberry Pi

Learn more

Discover what MATLAB® can do for your career.

Opportunities for recent engineering grads.

Apply Today

Thread Subject:
Random sequence with descending probabilities

Subject: Random sequence with descending probabilities

From: Richard

Date: 1 Oct, 2010 04:34:08

Message: 1 of 8

Hi

I want to create small sets of random integers - say 2- 23, and or 20 - 39.

I would like to have the lower number have a higher probability than the higher number,
with some sort of exponentially decreasing curve.

Here's one idea that I tried:

x = rand(1,10000);

y = x .^2;

lower = 5;
upper = 20;

y = lower + (y * (upper - lower));

y = round(y);

mn = min(y);
mx = max(y);

figure,hist(y,(upper - lower) + 1);

I wonder if this approach is robust? Will there be gaps in this histogram for some ranges?

I'm not sure why it seems to work, which is why I guess I'm unsure of it.

Thanks,

Rick

Subject: Random sequence with descending probabilities

From: Roger Stafford

Date: 1 Oct, 2010 05:49:08

Message: 2 of 8

"Richard " <rfrank@dominionsw.com> wrote in message <i83oc0$ann$1@fred.mathworks.com>...
> Hi
>
> I want to create small sets of random integers - say 2- 23, and or 20 - 39.
>
> I would like to have the lower number have a higher probability than the higher number,
> with some sort of exponentially decreasing curve.
>
> Here's one idea that I tried:
>
> x = rand(1,10000);
>
> y = x .^2;
>
> lower = 5;
> upper = 20;
>
> y = lower + (y * (upper - lower));
>
> y = round(y);
>
> mn = min(y);
> mx = max(y);
>
> figure,hist(y,(upper - lower) + 1);
>
> I wonder if this approach is robust? Will there be gaps in this histogram for some ranges?
>
> I'm not sure why it seems to work, which is why I guess I'm unsure of it.
>
> Thanks,
>
> Rick
- - - - - - - -
  I would not use the 'round' function here since that would give an anomalous distribution at the two endpoints. Also what you used was not an exponential curve but a power curve - in this case, the power of 2. For an exponential distribution curve I would do the following. Let U and L be your two integer-valued range extremes.

 y = floor(L*((U+1)/L).^rand(1,n));

The values of y should all be integers and range from L to U.

Roger Stafford

Subject: Random sequence with descending probabilities

From: Roger Stafford

Date: 1 Oct, 2010 06:37:07

Message: 3 of 8

"Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <i83sok$l9n$1@fred.mathworks.com>...
> I would not use the 'round' function here since that would give an anomalous distribution at the two endpoints. Also what you used was not an exponential curve but a power curve - in this case, the power of 2. For an exponential distribution curve I would do the following. Let U and L be your two integer-valued range extremes.
>
> y = floor(L*((U+1)/L).^rand(1,n));
>
> The values of y should all be integers and range from L to U.
>
> Roger Stafford
- - - - - - - -
  My apologies! I was a bit hasty in my description of those curves. Mine actually gives a probability density approximately proportional to the reciprocal of y for y between L and U. Your curve density is approximately proportional to the reciprocal of the square root of y-L for y between L and U, which means in principle that it is infinite at y = L. In both cases, lower values of y certainly do have higher probabilities.

  (That will teach me not to depend too much on thinking clearly late at night.)

Roger Stafford

Subject: Random sequence with descending probabilities

From: Jos (10584)

Date: 1 Oct, 2010 12:27:05

Message: 4 of 8

"Richard " <rfrank@dominionsw.com> wrote in message <i83oc0$ann$1@fred.mathworks.com>...
> Hi
>
> I want to create small sets of random integers - say 2- 23, and or 20 - 39.
>
> I would like to have the lower number have a higher probability than the higher number,
> with some sort of exponentially decreasing curve.
>
> Here's one idea that I tried:
>
> x = rand(1,10000);
>
> y = x .^2;
>
> lower = 5;
> upper = 20;
>
> y = lower + (y * (upper - lower));
>
> y = round(y);
>
> mn = min(y);
> mx = max(y);
>
> figure,hist(y,(upper - lower) + 1);
>
> I wonder if this approach is robust? Will there be gaps in this histogram for some ranges?
>
> I'm not sure why it seems to work, which is why I guess I'm unsure of it.
>
> Thanks,
>
> Rick

Take a look at RANDP which draws indices from any user-defined PDF:
http://www.mathworks.com/matlabcentral/fileexchange/8891

S = 2:23 ; % set of possible numbers
relativePDF = exp(-(S - 2) / 12) ; % change to modify PDF

N = 10000 ; % number of random entries
ix = randp(relativePDF, 1,N) ; % draw indices
val = S(ix) ; % get corresponding values

bar(S,100*histc(val, S) / N) % show distribution
xlabel('value') ; ylabel ('%')

hth
Jos

Subject: Random sequence with descending probabilities

From: Richard

Date: 1 Oct, 2010 13:38:08

Message: 5 of 8

Hi Roger,

Thanks for the help.

I'm not sure that round() isn't ok here...? It seems to be working as I would expect....

There's a couple of things that I think need some tweaks

y = floor(L*((U+1)/L).^rand(1,n));

will always return numbers between 1 and L (because n^0 is 1) - rand here is returning a 1 by n matrix of numbers 0....1

and also L cannot be 0 - I failed to mention that 0 - n is a valid range....
so........ I tweaked things and came up with the code below - one of my concerns
was that there be no numbers in the range that end up (because of rounding or something) with a 0 probability. All integers in the range must have > 0 probability.....

I have to port this code to another language so I can rely on any matlab particular functions (except for just error checking)...

This seems to work in my test cases so far.....let me know what you think...

Thanks!

Rick
clear;

% change for test cases....
lower = 5;
upper = 10;

power = 1;
exponential = 2;

which = exponential;

n = 10000;
x = rand(1,n);
 
if which == exponential
   y = x .^ (1 + rand(1,n));
   y = lower + (y * ((upper - lower)));
   y = round(y);
   
else
   y = x .^2;
   y = lower + (y * ((upper - lower)));
   y = round(y);

end;
figure(1),plot(sort(y));
mn = min(y);
mx = max(y);

figure(2),hist(y,(upper - lower) + 1);

x = lower:1:upper
counts = histc(y,lower:1:upper);
if any(counts <= 0.5)
    msg = 'Error = histogram gap'
else
    msg = 'OK no histogram gap'
end

 csum = cumsum(counts);
 figure(3);
bar(x,csum,'BarWidth',1)



"Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <i83sok$l9n$1@fred.mathworks.com>...
> "Richard " <rfrank@dominionsw.com> wrote in message <i83oc0$ann$1@fred.mathworks.com>...
> > Hi
> >
> > I want to create small sets of random integers - say 2- 23, and or 20 - 39.
> >
> > I would like to have the lower number have a higher probability than the higher number,
> > with some sort of exponentially decreasing curve.
> >
> > Here's one idea that I tried:
> >
> > x = rand(1,10000);
> >
> > y = x .^2;
> >
> > lower = 5;
> > upper = 20;
> >
> > y = lower + (y * (upper - lower));
> >
> > y = round(y);
> >
> > mn = min(y);
> > mx = max(y);
> >
> > figure,hist(y,(upper - lower) + 1);
> >
> > I wonder if this approach is robust? Will there be gaps in this histogram for some ranges?
> >
> > I'm not sure why it seems to work, which is why I guess I'm unsure of it.
> >
> > Thanks,
> >
> > Rick
> - - - - - - - -
> I would not use the 'round' function here since that would give an anomalous distribution at the two endpoints. Also what you used was not an exponential curve but a power curve - in this case, the power of 2. For an exponential distribution curve I would do the following. Let U and L be your two integer-valued range extremes.
>
> y = floor(L*((U+1)/L).^rand(1,n));
>
> The values of y should all be integers and range from L to U.
>
> Roger Stafford

Subject: Random sequence with descending probabilities

From: Thomas Vanaret

Date: 1 Oct, 2010 15:46:04

Message: 6 of 8

If you have the Statistics Toolbox, you could use the "randsample" function.

It returns a k by 1 vector of values sampled in a given population of values, with or without replacement.
A last optionnal argument allow you to specify a vector of probability of each value in the given population.

I think it could fit your need.

Subject: Random sequence with descending probabilities

From: Roger Stafford

Date: 1 Oct, 2010 22:23:05

Message: 7 of 8

"Richard " <rfrank@dominionsw.com> wrote in message <i84o80$4cg$1@fred.mathworks.com>...
> y = floor(L*((U+1)/L).^rand(1,n));
> will always return numbers between 1 and L (because n^0 is 1) - rand here is returning a 1 by n matrix of numbers 0....1

  I disagree with you there. The quantity L*((U+1)/L).^rand(1,n)) will return values from a tiny bit above L to a tiny bit below U+1. These correspond to L*((U+1)/L).^0) = L*1 = L, and L*((U+1)/L).^1) = L*((U+1)/L) = U+1. If L and U are integers, after the 'floor' operations these become integers from L up to and including U.

> I'm not sure that round() isn't ok here...? It seems to be working as I would expect....

  As for your use of 'round', the following is what I referred to. Suppose you had written

x = rand(1,10000);
lower = 5;
upper = 20;
y = lower + (x * (upper - lower));
z = round(y);

Since 'lower' and 'upper' are integers and since 'rand' generates numbers within the real interval 0 to 1, this means that y ranges uniformly over the real numbers from 5 to 20, which is an interval of length 15. One-fifteenth (on average) of the random numbers in y will fall between 5.5 and 6.5, and similarly for intervals 6.5 to 7.5, 7.5 to 8.5, ..., and 18.5 to 19.5, and the corresponding rounded z's will be the integers 6, 7, 8, ..., 19, so each of these have a probability of 1/15. However only one-thirtieth will fall between 5 and 5.5 and also one-thirtieth between 19.5 and 20, which means that these two endpoints, 5 and 20, each have half the probability of the other integers. That is the result of using 'rand' in this way.

  Of course you actually used y = x^2 which as I pointed out would give a theoretical infinite density at x = 0 and y = 5, so you wouldn't notice an anomaly with the statistics at y = 5, but you should be able to see it for the value y = 20. It should have only about half the probability of the 19 next to it, using the 'round' operation in the way that you have. Try using 'rand' with 10^6 numbers to get more reliable statistics and see if that isn't true.

> and also L cannot be 0 - I failed to mention that 0 - n is a valid range....

  You are right that setting L equal to zero would invalidate the example I gave. However the following (which was developed after a good night's sleep) should be far more useful for your purposes.

  You have said you wanted "some sort of exponentially decreasing curve." You can easily create a random number generation of the integers from 5 to 20, for example, whose probabilities fall off exponentially, using just the 'rand' function. The idea is to decide what the desired cumulative probability distribution (cdf), F(x), is in a corresponding interval of real numbers and then find its inverse function. By feeding the output of 'rand' into this inverse function, you will produce the desired distribution.

  First we deal with real numbers from 5 to 21 (having a later 'floor' in mind.) We desire the probability density to be of the form C*exp(-k*y) for some appropriate decay factor k for y in the range 5 to 21. Then its cdf must be:

 x = F(y) = (exp(-k*y)-exp(-k*5)) / (exp(-k*21)-exp(-k*5))

The inverse of this is easily found to be:

 y = F^(-1)(x) = -1/k*log( exp(-k*5)+x*(exp(-k*21)-exp(-k*5)) )

If we generate values for x from 'rand', then y will have the desired distribution. If you then apply 'floor' to these, the result will be integers from 5 to 20 with the desired decreasing exponential characteristic, the rate of decay depending on the choice of k. You can do all this in one line:

 y = floor(-1/k*log(exp(-k*5)+rand(1,n)*(exp(-k*21)-exp(-k*5))));

  I tried this out for k = 0.2 and an n of three million or so. Its histogram gives a quite smooth exponential-looking curve if plotted.

Roger Stafford

Subject: Random sequence with descending probabilities

From: Richard

Date: 1 Oct, 2010 23:01:20

Message: 8 of 8

"Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <i85n09$9s8$1@fred.mathworks.com>...
> "Richard " <rfrank@dominionsw.com> wrote in message <i84o80$4cg$1@fred.mathworks.com>...
> > y = floor(L*((U+1)/L).^rand(1,n));
> > will always return numbers between 1 and L (because n^0 is 1) - rand here is returning a 1 by n matrix of numbers 0....1
>
> I disagree with you there. The quantity L*((U+1)/L).^rand(1,n)) will return values from a tiny bit above L to a tiny bit below U+1. These correspond to L*((U+1)/L).^0) = L*1 = L, and L*((U+1)/L).^1) = L*((U+1)/L) = U+1. If L and U are integers, after the 'floor' operations these become integers from L up to and including U.
>

Oh I see...yes that's right. I misread the parens....
> > I'm not sure that round() isn't ok here...? It seems to be working as I would expect....
>
> As for your use of 'round', the following is what I referred to. Suppose you had written
>
> x = rand(1,10000);
> lower = 5;
> upper = 20;
> y = lower + (x * (upper - lower));
> z = round(y);
>
> Since 'lower' and 'upper' are integers and since 'rand' generates numbers within the real interval 0 to 1, this means that y ranges uniformly over the real numbers from 5 to 20, which is an interval of length 15. One-fifteenth (on average) of the random numbers in y will fall between 5.5 and 6.5, and similarly for intervals 6.5 to 7.5, 7.5 to 8.5, ..., and 18.5 to 19.5, and the corresponding rounded z's will be the integers 6, 7, 8, ..., 19, so each of these have a probability of 1/15. However only one-thirtieth will fall between 5 and 5.5 and also one-thirtieth between 19.5 and 20, which means that these two endpoints, 5 and 20, each have half the probability of the other integers. That is the result of using 'rand' in this way.
>
Ah I see. Yes thanks.
> Of course you actually used y = x^2 which as I pointed out would give a theoretical infinite density at x = 0 and y = 5, so you wouldn't notice an anomaly with the statistics at y = 5, but you should be able to see it for the value y = 20. It should have only about half the probability of the 19 next to it, using the 'round' operation in the way that you have. Try using 'rand' with 10^6 numbers to get more reliable statistics and see if that isn't true.
>
> > and also L cannot be 0 - I failed to mention that 0 - n is a valid range....
>
> You are right that setting L equal to zero would invalidate the example I gave. However the following (which was developed after a good night's sleep) should be far more useful for your purposes.
>
> You have said you wanted "some sort of exponentially decreasing curve." You can easily create a random number generation of the integers from 5 to 20, for example, whose probabilities fall off exponentially, using just the 'rand' function. The idea is to decide what the desired cumulative probability distribution (cdf), F(x), is in a corresponding interval of real numbers and then find its inverse function. By feeding the output of 'rand' into this inverse function, you will produce the desired distribution.
>
> First we deal with real numbers from 5 to 21 (having a later 'floor' in mind.) We desire the probability density to be of the form C*exp(-k*y) for some appropriate decay factor k for y in the range 5 to 21. Then its cdf must be:
>
> x = F(y) = (exp(-k*y)-exp(-k*5)) / (exp(-k*21)-exp(-k*5))
>
> The inverse of this is easily found to be:
>
> y = F^(-1)(x) = -1/k*log( exp(-k*5)+x*(exp(-k*21)-exp(-k*5)) )
>
> If we generate values for x from 'rand', then y will have the desired distribution. If you then apply 'floor' to these, the result will be integers from 5 to 20 with the desired decreasing exponential characteristic, the rate of decay depending on the choice of k. You can do all this in one line:
>
> y = floor(-1/k*log(exp(-k*5)+rand(1,n)*(exp(-k*21)-exp(-k*5))));
>
> I tried this out for k = 0.2 and an n of three million or so. Its histogram gives a quite smooth exponential-looking curve if plotted.
>
> Roger Stafford


This all makes sense. Thanks! Just to clarify to others....I need to rewrite this code in C++, so I can't use any mathworks specific functions.....

Thanks



Rick

Tags for this Thread

What are tags?

A tag is like a keyword or category label associated with each thread. Tags make it easier for you to find threads of interest.

Anyone can tag a thread. Tags are public and visible to everyone.

Contact us