Discover MakerZone

MATLAB and Simulink resources for Arduino, LEGO, and Raspberry Pi

Learn more

Discover what MATLAB® can do for your career.

Opportunities for recent engineering grads.

Apply Today

Thread Subject:
Random sampling with constraints

Subject: Random sampling with constraints

From: Siva

Date: 15 Apr, 2012 03:09:17

Message: 1 of 14

Hi -

I have a random [ 1 x m] vector of unique integers P. I need to randomly generate a [ 1 x n] vector S that is a subset of P that satisfies the requirement that the difference between any two integers in S is greater then "d".

For example if I start with a [ 1 x 7] integer vector P= [ 23 2 25 7 15 13 8 16], and "d" is 8, then one possible random [ 1 x 3] integer vector that satisfies my "difference constraint" is [ 23 2 13]. But neither [7 25 2] nor [ 25 23 2] are valid since the "difference constraint" is not satisfied with either one of them.

Here is what I do now:

% Parent random integer vector
P= [ 23 2 25 7 15 13 8 16] ;
d= 8 ;

% Preliminaries
n= 3 ; % desired number of integers in vector S
Q= sort( P) ;
S= Q( 1) ; % select the first integer
nelements= 1 ; % number of integers found so far

% Loop until the desired number of integers are identified
for i= 2:length(Q)
  if ( Q(i)-S(end)) > d
     S= [ S Q( i)] ; nelements= nelements+1 ;
     if nelements >= n, break, end % break out when the desired number of
                                                % elements is found
  end
end

This code generates a solution. But I believe the random vector S is 'biased' because of the sort and subsequent selection of the first element in the sorted vector as the first entry of S. I would appreciate any comments and/or suggestions.

Thanks.
Siva

Subject: Random sampling with constraints

From: Bruno Luong

Date: 15 Apr, 2012 10:24:07

Message: 2 of 14

"Siva " <sivaathome@gmail.com> wrote in message
>
> This code generates a solution. But I believe the random vector S is 'biased' because of the sort and subsequent selection of the first element in the sorted vector as the first entry of S. I would appreciate any comments and/or suggestions.

Yes, those ways of doing are certainly introduce a bias. I suggest you perform the for-loop, during which you should keep track of exclusion intervals on the current list of elements and *randomly* pick an element (among the remaining) outside.

Bruno

Subject: Random sampling with constraints

From: Roger Stafford

Date: 15 Apr, 2012 17:09:07

Message: 3 of 14

"Siva " <sivaathome@gmail.com> wrote in message <jmde4t$hd4$1@newscl01ah.mathworks.com>...
> I have a random [ 1 x m] vector of unique integers P. I need to randomly generate a [ 1 x n] vector S that is a subset of P that satisfies the requirement that the difference between any two integers in S is greater then "d".
- - - - - - - - - -
  If you wish to strictly avoid all bias in your random selection of S, I think in a problem of this nature it might very well be necessary to compute a list of all possible solutions and then randomly select one of them from that list. If P and S are large and d is small, doing so could lead to an immense list of possible solutions, so this would be a rather desperate undertaking. It is important that you perform the initial sorting operation. Having done that I can envision a recursion process that has depth n (the number of elements to be placed in S) in which all possible solutions are found without repetition. Are you sufficiently determined in avoiding bias to undertake such a method, or would some simpler heuristic method be adequate?

Roger Stafford

Subject: Random sampling with constraints

From: Siva

Date: 16 Apr, 2012 02:35:07

Message: 4 of 14

"Bruno Luong" <b.luong@fogale.findmycountry> wrote in message <jme7k7$nn8$1@newscl01ah.mathworks.com>...
> "Siva " <sivaathome@gmail.com> wrote in message
> >
> > This code generates a solution. But I believe the random vector S is 'biased' because of the sort and subsequent selection of the first element in the sorted vector as the first entry of S. I would appreciate any comments and/or suggestions.
>
> Yes, those ways of doing are certainly introduce a bias. I suggest you perform the for-loop, during which you should keep track of exclusion intervals on the current list of elements and *randomly* pick an element (among the remaining) outside.
>
> Bruno

Bruno.

If I understand your idea, on each loop I would randomly select from the elements in P that satisfy the constraint. I believe this should work. I will try it out. Thanks.

Siva

Subject: Random sampling with constraints

From: Siva

Date: 16 Apr, 2012 02:49:06

Message: 5 of 14

"Roger Stafford" wrote in message <jmevbj$o98$1@newscl01ah.mathworks.com>...
> "Siva " <sivaathome@gmail.com> wrote in message <jmde4t$hd4$1@newscl01ah.mathworks.com>...
> > I have a random [ 1 x m] vector of unique integers P. I need to randomly generate a [ 1 x n] vector S that is a subset of P that satisfies the requirement that the difference between any two integers in S is greater then "d".
> - - - - - - - - - -
> If you wish to strictly avoid all bias in your random selection of S, I think in a problem of this nature it might very well be necessary to compute a list of all possible solutions and then randomly select one of them from that list. If P and S are large and d is small, doing so could lead to an immense list of possible solutions, so this would be a rather desperate undertaking. It is important that you perform the initial sorting operation. Having done that I can envision a recursion process that has depth n (the number of elements to be placed in S) in which all possible solutions are found without repetition. Are you sufficiently determined in avoiding bias to undertake such a method, or would some simpler heuristic method be adequate?
>
> Roger Stafford

Roger -

To give you a sense of the scale of the problem, the vector P is [1 x 100000], S is [ 1 x 100], and d is about 50. I need the construction of S to be strictly random. If I understand your idea, with the problem size I am dealing with, identifying the set of all possible solutions may be a challenge even if it has to be created only once. What is your feeling about the tractability for my situation? If I cannot find a tractable solution, I will have to resort to a heuristic. Any ideas?

Thanks.
Siva

Subject: Random sampling with constraints

From: Roger Stafford

Date: 16 Apr, 2012 03:39:07

Message: 6 of 14

"Siva " <sivaathome@gmail.com> wrote in message <jmg1b2$4gd$1@newscl01ah.mathworks.com>...
> To give you a sense of the scale of the problem, the vector P is [1 x 100000], S is [ 1 x 100], and d is about 50.
- - - - - - - -
  With the length of P you quote, the number of possible selections for S will certainly be too large for the method I mentioned. On the other hand, that large size makes the idea Bruno mentioned far more effective than it would be for small sized P. I think you can achieve a reasonably random selection of S using his heuristic method.

Roger Stafford

Subject: Random sampling with constraints

From: David Epstein

Date: 16 Apr, 2012 11:10:06

Message: 7 of 14

"Roger Stafford" wrote in message <jmg48r$ffs$1@newscl01ah.mathworks.com>...
I think you can achieve a reasonably random selection of S using his heuristic method.

Roger: do have a reason for the adjective "heuristic"? Probably I'm missing something, but it seems to me that it should be easy to prove that Bruno's answer is correct, on the basis of the incorrect assumption that Matlab's random number generator is truly random.
David

Subject: Random sampling with constraints

From: Bruno Luong

Date: 16 Apr, 2012 11:21:09

Message: 8 of 14

"David Epstein" <David.Epstein.spam@remove.warwick.ac.uk> wrote in message <jmgume$q8d$1@newscl01ah.mathworks.com>...
>
> Roger: do have a reason for the adjective "heuristic"? Probably I'm missing something, but it seems to me that it should be easy to prove that Bruno's answer is correct, on the basis of the incorrect assumption that Matlab's random number generator is truly random.

Go ahead David. But I'm with Roger, it won't be easy to prove or disprove that my method provides a uniform distribution (it might be good enough for your purpose though).

Bruno

Subject: Random sampling with constraints

From: Siva

Date: 16 Apr, 2012 15:04:06

Message: 9 of 14

"Bruno Luong" <b.luong@fogale.findmycountry> wrote in message <jmgvb5$t2s$1@newscl01ah.mathworks.com>...
> "David Epstein" <David.Epstein.spam@remove.warwick.ac.uk> wrote in message <jmgume$q8d$1@newscl01ah.mathworks.com>...
> >
> > Roger: do have a reason for the adjective "heuristic"? Probably I'm missing something, but it seems to me that it should be easy to prove that Bruno's answer is correct, on the basis of the incorrect assumption that Matlab's random number generator is truly random.
>
> Go ahead David. But I'm with Roger, it won't be easy to prove or disprove that my method provides a uniform distribution (it might be good enough for your purpose though).
>
> Bruno

Thanks Bruno, Roger and David for your input and feedback. I have implemented the approach that Bruno suggested because of the computational effort involved in trying Roger's suggestion for my use case. It is adequate for my purpose.

Thanks once again to everybody for taking the time to give me advice and make suggestions.

Siva

Subject: Random sampling with constraints

From: Roger Stafford

Date: 16 Apr, 2012 16:54:10

Message: 10 of 14

"David Epstein" <David.Epstein.spam@remove.warwick.ac.uk> wrote in message <jmgume$q8d$1@newscl01ah.mathworks.com>...
> "Roger Stafford" wrote in message <jmg48r$ffs$1@newscl01ah.mathworks.com>...
> I think you can achieve a reasonably random selection of S using his heuristic method.
>
> Roger: do have a reason for the adjective "heuristic"? Probably I'm missing something, but it seems to me that it should be easy to prove that Bruno's answer is correct, on the basis of the incorrect assumption that Matlab's random number generator is truly random.
> David
- - - - - - - - - -
  David, I can offer a simple counterexample showing that Bruno's method does not always give equal probabilities to all the possible solutions for S. I do agree with him that his method is very likely to be good enough for Siva's problem, given the size of Siva's arrays.

  As a counterexample, let P = [1 2 3 5], d = 1.5, and n = 2. Only the solutions S = [1 3], [1 5], [2 5], and [3 5] are possible. However the respective probabilities of Bruno selecting these pairs are: 6/24, 5/24, 8/24, and 5/24, which are decidedly unequal. For example, the probability is 1/4 the first one chosen will be 2, but then it is certain that 5 will have to be the next chosen. However if 5 is chosen first there is a further one out of three chance of selecting the 2, giving another 1/12 probability. Hence the total for the pair (2 5] would be 1/4+1/12 = 1/3 and not the 1/4 that would be required for an equal distribution.

  It is a certainty that such inequalities will also be found in larger P, d, and n cases.

Roger Stafford

Subject: Random sampling with constraints

From: Siva

Date: 16 Apr, 2012 18:12:05

Message: 11 of 14

"Roger Stafford" wrote in message <jmhiri$ov4$1@newscl01ah.mathworks.com>...
> "David Epstein" <David.Epstein.spam@remove.warwick.ac.uk> wrote in message <jmgume$q8d$1@newscl01ah.mathworks.com>...
> > "Roger Stafford" wrote in message <jmg48r$ffs$1@newscl01ah.mathworks.com>...
> > I think you can achieve a reasonably random selection of S using his heuristic method.
> >
> > Roger: do have a reason for the adjective "heuristic"? Probably I'm missing something, but it seems to me that it should be easy to prove that Bruno's answer is correct, on the basis of the incorrect assumption that Matlab's random number generator is truly random.
> > David
> - - - - - - - - - -
> David, I can offer a simple counterexample showing that Bruno's method does not always give equal probabilities to all the possible solutions for S. I do agree with him that his method is very likely to be good enough for Siva's problem, given the size of Siva's arrays.
>
> As a counterexample, let P = [1 2 3 5], d = 1.5, and n = 2. Only the solutions S = [1 3], [1 5], [2 5], and [3 5] are possible. However the respective probabilities of Bruno selecting these pairs are: 6/24, 5/24, 8/24, and 5/24, which are decidedly unequal. For example, the probability is 1/4 the first one chosen will be 2, but then it is certain that 5 will have to be the next chosen. However if 5 is chosen first there is a further one out of three chance of selecting the 2, giving another 1/12 probability. Hence the total for the pair (2 5] would be 1/4+1/12 = 1/3 and not the 1/4 that would be required for an equal distribution.
>
> It is a certainty that such inequalities will also be found in larger P, d, and n cases.
>
> Roger Stafford

Roger -
My fault for not making it clear but "d" is an integer. Would that make a difference to your analysis?
Thanks.
Siva

Subject: Random sampling with constraints

From: Bruno Luong

Date: 16 Apr, 2012 18:34:09

Message: 12 of 14

"Siva " <sivaathome@gmail.com> wrote in message <jmhndl$edq$1@newscl01ah.mathworks.com>...
=
>
> Roger -
> My fault for not making it clear but "d" is an integer. Would that make a difference to your analysis?

I answer for Roger, but non it should not make any difference because for 'a' an integer array

{ i,j : | a(i) - a(j) | < 1.5 } = { i,j : | a(i) - a(j) | < 2 }.

So the feasible set is the same whereas d=1.5 or d=2.

Bruno

Subject: Random sampling with constraints

From: Roger Stafford

Date: 16 Apr, 2012 20:03:09

Message: 13 of 14

"Siva " <sivaathome@gmail.com> wrote in message <jmhndl$edq$1@newscl01ah.mathworks.com>...
> My fault for not making it clear but "d" is an integer. Would that make a difference to your analysis?
- - - - - - - - - -
  No, just multiplying P and d by 2 with P = [2 4 6 10] and d = 3 gives the same probabilities. Actually P can be evenly spaced as in P = [2 4 6 8], d = 3, and n = 2. The three possible S vectors are S = [2 6], [4 8] and [2 8], but their respective probabilities are: 3/8, 3/8, and 2/8. Counterexamples are easier to find than those with uniform distributions!

Roger Stafford

Subject: Random sampling with constraints

From: Siva

Date: 16 Apr, 2012 20:20:08

Message: 14 of 14

"Roger Stafford" wrote in message <jmhttt$bm7$1@newscl01ah.mathworks.com>...
> "Siva " <sivaathome@gmail.com> wrote in message <jmhndl$edq$1@newscl01ah.mathworks.com>...
> > My fault for not making it clear but "d" is an integer. Would that make a difference to your analysis?
> - - - - - - - - - -
> No, just multiplying P and d by 2 with P = [2 4 6 10] and d = 3 gives the same probabilities. Actually P can be evenly spaced as in P = [2 4 6 8], d = 3, and n = 2. The three possible S vectors are S = [2 6], [4 8] and [2 8], but their respective probabilities are: 3/8, 3/8, and 2/8. Counterexamples are easier to find than those with uniform distributions!
>
> Roger Stafford

Roger -
Of course! I should have seen that.
Siva

Tags for this Thread

What are tags?

A tag is like a keyword or category label associated with each thread. Tags make it easier for you to find threads of interest.

Anyone can tag a thread. Tags are public and visible to everyone.

Contact us