Thread Subject:
Random number choice according to probability distribution

Subject: Random number choice according to probability distribution

From: Emerson

Date: 6 Mar, 2012 20:45:17

Message: 1 of 8

Hi,
I would appreciate your help on this problem:

I have one vector with 19 positions, like this:
s = [ 1.0000 1.5000 2.0000 2.5000 3.0000 3.5000 4.0000 4.5000 5.0000 5.5000 6.0000 6.5000 7.0000 7.5000 8.0000 8.5000 9.0000 9.5000 10.0000 ]

I also have a vector with 19 values of a probability function. This function is calculated from a weight vector 'w', so

p = w/sum(w)

The resulting vector is:

p = [ 0.0848 0.0843 0.0829 0.0807 0.0776 0.0738 0.0695 0.0646 0.0595 0.0541 0.0487 0.0434 0.0382 0.0332 0.0286 0.0244 0.0205 0.0171 0.0141 ]

So the problem is: I need to randomly choose just one value from vector 's', but the choice must follow the probability vector 'p', i.e., values like 1.0, 1.5 or 2.0 should be more likely to be chosen than values like 9.5 and 10.0.

Which solution would you suggest?

Thanks.
Emerson

Subject: Random number choice according to probability distribution

From: TideMan

Date: 6 Mar, 2012 21:56:24

Message: 2 of 8

You have the PDF.
Now integrate p using cumsum to get the CDF, P, which rises to 1.
Then sample from the inverse CDF, s(P): use rand to pick a random P and interp1 to return the corresponding s.

Search the newsgroup for "empirical CDF" for various posts on this.
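
For what it's worth, here is a minimal sketch of that recipe for the discrete case. It assumes the s and p from the original question. It also swaps interp1 for find, because interp1 with its default linear method would return values between the entries of s rather than one of the 19 listed values. The names P, r, idx and x are only illustrative.

```matlab
% Values and probabilities copied from the original question.
s = 1:0.5:10;
p = [0.0848 0.0843 0.0829 0.0807 0.0776 0.0738 0.0695 0.0646 0.0595 0.0541 ...
     0.0487 0.0434 0.0382 0.0332 0.0286 0.0244 0.0205 0.0171 0.0141];
p = p / sum(p);                   % guard against rounding in the printed values

P = cumsum(p);                    % discrete CDF
P(end) = 1;                       % assumption: pin the last value to exactly 1

r   = rand;                       % one uniform draw on (0,1)
idx = find(r <= P, 1, 'first');   % first entry whose CDF value reaches r
x   = s(idx);                     % the selected value from s
```

Because rand never returns 1 and P(end) is pinned to 1, find always returns an index between 1 and 19.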

Subject: Random number choice according to probability distribution

From: TideMan

Date: 6 Mar, 2012 22:00:00

Message: 3 of 8

Sorry about the mess in my previous post.

I'm struggling with the new version of Google Groups...

Subject: Random number choice according to probability distribution

From: Roger Stafford

Date: 6 Mar, 2012 22:26:18

Message: 4 of 8

Use [0,cumsum(p)] as the "edges" (2nd input) in 'histc' with rand(1,n) as first input. (Fudge the last edge value up by a few 'eps' to avoid getting the final 'edge'.) Then the 2nd output ('bin') can be used as n random indices into s.

  Be sure to read the documentation for 'histc' carefully to get it right.

Roger Stafford
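
A rough sketch of the histc recipe above, drawing n indices at once. The sample count n = 1000 and the use of 4*eps for the fudge are assumptions made for illustration; s and p are the vectors from the original question. Newer MATLAB releases recommend histcounts or discretize over histc, so treat this as a sketch for releases where histc is available.

```matlab
% Values and probabilities copied from the original question.
s = 1:0.5:10;
p = [0.0848 0.0843 0.0829 0.0807 0.0776 0.0738 0.0695 0.0646 0.0595 0.0541 ...
     0.0487 0.0434 0.0382 0.0332 0.0286 0.0244 0.0205 0.0171 0.0141];
p = p / sum(p);                        % make sure the weights sum to 1

n = 1000;                              % hypothetical number of draws

edges = [0, cumsum(p)];                % 20 edges define 19 bins
edges(end) = 1 + 4*eps;                % fudge the last edge up by a few eps

[~, bin] = histc(rand(1, n), edges);   % bin(k) is the bin index of the k-th draw
x = s(bin);                            % n values drawn from s with probabilities p
```

With the last edge nudged just above 1, every draw from rand falls strictly inside one of the 19 bins, so bin never contains 0 or 20.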

Subject: Random number choice according to probability distribution

From: Emerson

Date: 7 Mar, 2012 11:54:20

Message: 5 of 8

Hi Roger,
I've already tried this, following your suggestion from an older message in this forum. My code is like this:

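c = cumsum(p);          % cumulative probabilities (discrete CDF)
r = rand(1,1);          % one uniform random number in (0,1)
e = [0,c];              % bin edges for histc
[~,bin] = histc(r,e);   % index of the bin that contains r
x = s(bin);             % value of s selected by that bin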

But after running the routine several times, I noticed that the distribution of the random choices sometimes doesn't follow the probability distribution. For example, the 2nd or 3rd value is sometimes chosen more frequently than the 1st value of vector 's'. I'm expecting something like

nc_s(1) > nc_s(2) > nc_s(3) > ... > nc_s(19)

but instead I sometimes get an output like

nc_s(2) > nc_s(1) > nc_s(3) > ... > nc_s(19)

(where nc_s(k) is the number of times the k-th value of vector 's' has been chosen).
Am I doing something wrong?

Thanks.

Subject: Random number choice according to probability distribution

From: Torsten

Date: 7 Mar, 2012 12:05:01

Message: 6 of 8

And how big is
sum_{i=1}^{19} nc_s(i)
in your attempt to reproduce the above probability distribution?
It should be quite large in order to see the difference between a
probability of 0.0848 and a probability of 0.0843 ...

Best wishes
Torsten.
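
To put rough numbers on that point, the sketch below tallies nc_s for a hypothetical total of N = 1e5 draws and compares the expected gap between the first two counts with the statistical noise in that gap. N, the 4*eps fudge, and the names gap and noise are assumptions for illustration. The noise formula is the standard deviation of the difference of two multinomial counts.

```matlab
% Values and probabilities copied from the original question.
s = 1:0.5:10;
p = [0.0848 0.0843 0.0829 0.0807 0.0776 0.0738 0.0695 0.0646 0.0595 0.0541 ...
     0.0487 0.0434 0.0382 0.0332 0.0286 0.0244 0.0205 0.0171 0.0141];
p = p / sum(p);

edges = [0, cumsum(p)];
edges(end) = 1 + 4*eps;                % keep every draw inside a bin

N = 1e5;                               % hypothetical total number of draws
[~, bin] = histc(rand(1, N), edges);

% nc_s(k) is how many times s(k) was chosen.
nc_s = accumarray(bin(:), 1, [numel(s) 1]).';

% Expected difference nc_s(1) - nc_s(2), and its standard deviation.
gap   = N * (p(1) - p(2));
noise = sqrt(N * (p(1) + p(2) - (p(1) - p(2))^2));

fprintf('expected gap = %.1f, noise (1 std) = %.1f\n', gap, noise);
```

At this N the expected gap is smaller than one standard deviation of the noise, so seeing nc_s(2) > nc_s(1) in a given run is not a sign of a bug.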

Subject: Random number choice according to probability distribution

From: Roger Stafford

Date: 7 Mar, 2012 20:53:12

Message: 7 of 8

Emerson, you might be surprised at how large your sample size would have to be in order to be reasonably sure that the first element of 's' whose probability is 0.0848 occurs more often than the second element whose probability is a smaller 0.0843 . To simplify the calculation I eliminated the other 17 elements and calculated such a probability with two elements with probabilities in the same ratio: 0.0848/(0.0848+0.0843) and 0.0843/(0.0848+0.0843). After 8190 samples the probability is still about 0.4 that the second, less likely, one will nevertheless be in the majority! With all 19 present you would have to take a far greater number of samples than this to even achieve this same result. That means for it to be fairly unlikely to ever see such a reversal of these particular two elements among your 19 you would have to have hundreds of thousands or possibly millions of samples.

Roger Stafford
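
The post does not say how that probability was computed. As a rough cross-check, the sketch below uses a normal approximation to the binomial (my substitution, not necessarily the method used above) for the two-element simplification. The 5% target, zTarget, and all variable names are assumptions for illustration.

```matlab
p1 = 0.0848;  p2 = 0.0843;     % probabilities of s(1) and s(2) from the thread
q  = p2 / (p1 + p2);           % chance that a draw on one of the two picks s(2)
n  = 8190;                     % number of draws that land on either element

% Normal approximation to Binomial(n, q), with continuity correction:
% probability that s(2) is picked more than n/2 times.
mu    = n * q;
sigma = sqrt(n * q * (1 - q));
z     = (n/2 + 0.5 - mu) / sigma;
probReversal = 0.5 * erfc(z / sqrt(2));

% Rough number of pair draws needed to push that probability down to
% about 5% (z = 1.645), and the matching total with all 19 values present.
zTarget = 1.645;
nPair   = ceil((zTarget * sqrt(q * (1 - q)) / (0.5 - q))^2);
nTotal  = ceil(nPair / (p1 + p2));

fprintf('P(reversal after %d draws) ~ %.2f\n', n, probReversal);
fprintf('pair draws needed ~ %d, total draws needed ~ %d\n', nPair, nTotal);
```

Under these assumptions the reversal probability comes out near the 0.4 quoted above, and the required totals are in line with the "hundreds of thousands or possibly millions" estimate.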

Subject: Random number choice according to probability distribution

From: Germán

Date: 8 Mar, 2012 00:01:42

Message: 8 of 8

If you want to get just ONE random sample drawn from 's' according to the normalized weights in 'p' you can simply use:

drawn = randsample( s, 1, true, p)

But if what you want is a whole new resampled version of s, you can do:

Snew = randsample( s, length(s), true, p)

Another option could be the bootstrap function:

[~,idx] = bootstrp(1, [], 1:length(p), 'Weights', p);
Snew = s(idx)
