Discover MakerZone

MATLAB and Simulink resources for Arduino, LEGO, and Raspberry Pi

Learn more

Discover what MATLAB® can do for your career.

Opportunities for recent engineering grads.

Apply Today

Thread Subject:
k (3) Vectors of ones and zeros "combining" together

Subject: k (3) Vectors of ones and zeros "combining" together

From: Apostol

Date: 17 Jan, 2012 23:42:08

Message: 1 of 23

Hello all!
I would like to generate 3 vectors of a specified length, say N, containing ones and zeros with these constraints:

-all 3 vectors have at least a 1
-the sum of these 3 vectors gives ones(1,N), a vector with all ones


*this is the only way I think I have for exploring all the possible combinations of a given set of N elements in 3 groups

Thank you!

Subject: k (3) Vectors of ones and zeros "combining" together

From: Matt J

Date: 17 Jan, 2012 23:55:09

Message: 2 of 23

"Apostol" wrote in message <jf510f$nv5$1@newscl01ah.mathworks.com>...
> Hello all!
> I would like to generate 3 vectors of a specified length, say N, containing ones and zeros with these constraints:
>
> -all 3 vectors have at least a 1
> -the sum of these 3 vectors gives ones(1,N), a vector with all ones
================


A=eye(3,N);
A(3,4:end)=1;
vector1=A(1,:);
vector2=A(2,:);
vector3=A(3,:);

Subject: k (3) Vectors of ones and zeros "combining" together

From: Apostol

Date: 18 Jan, 2012 00:16:09

Message: 3 of 23

"Matt J" wrote in message <jf51ot$q13$1@newscl01ah.mathworks.com>...

>
> A=eye(3,N);
> A(3,4:end)=1;
> vector1=A(1,:);
> vector2=A(2,:);
> vector3=A(3,:);

Hi Matt, thanks for you answer!

but what if I need DIFFERENT combinations? I know for large values of N would be really long/difficult but I would need something that each time generates different combinations so I can evaluate different clusters.

Thanks :)

Subject: k (3) Vectors of ones and zeros "combining" together

From: Roger Stafford

Date: 18 Jan, 2012 02:14:08

Message: 4 of 23

"Apostol" wrote in message <jf5309$a8$1@newscl01ah.mathworks.com>...
> but what if I need DIFFERENT combinations? I know for large values of N would be really long/difficult but I would need something that each time generates different combinations so I can evaluate different clusters.
>
> Thanks :)
- - - - - - - - - -
  This ought to make all possible combinations possible, and they are of equal probability:

 A = zeros(3,n);
 p = randperm(N);
 q = [1:3,ceil(3*rand(1,N-3))];
 A(q+3*(p-1)) = 1;

The three rows of A are your three vectors. Each row has at least one 1 and each column has exactly one 1.

  (Note that each time you restart your matlab application it will generate the same sequence of pseudo random numbers unless you use 'randstream' methods.)

Roger Stafford

Subject: k (3) Vectors of ones and zeros "combining" together

From: Apostol

Date: 18 Jan, 2012 02:36:08

Message: 5 of 23

Thank You very much Roger, I'm trying to use the code you gave me and it seems to work pretty well :)


"Roger Stafford" wrote in message <jf59tg$jbs$1@newscl01ah.mathworks.com>...
> - - - - - - - - - -
> This ought to make all possible combinations possible, and they are of equal probability:
>
> A = zeros(3,n);
> p = randperm(N);
> q = [1:3,ceil(3*rand(1,N-3))];
> A(q+3*(p-1)) = 1;
>
> The three rows of A are your three vectors. Each row has at least one 1 and each column has exactly one 1.
>
> (Note that each time you restart your matlab application it will generate the same sequence of pseudo random numbers unless you use 'randstream' methods.)
>
> Roger Stafford

Subject: k (3) Vectors of ones and zeros "combining" together

From: Roger Stafford

Date: 18 Jan, 2012 07:13:09

Message: 6 of 23

> "Roger Stafford" wrote in message <jf59tg$jbs$1@newscl01ah.mathworks.com>...
> > .......
> > This ought to make all possible combinations possible, and they are of equal probability:
> > .......
- - - - - - - -
  I retract the claim I made earlier that with this algorithm the various possible combinations will occur with equal probabilities. A large sampling showed that they are not actually equally likely, some types of combinations inherently occurring with greater frequency than others. However all are possible and do occur frequently. Hopefully that is all you need in your work.

Roger Stafford

Subject: k (3) Vectors of ones and zeros "combining" together

From: Bruno Luong

Date: 18 Jan, 2012 07:19:08

Message: 7 of 23

"Roger Stafford" wrote in message <jf5re5$9lp$1@newscl01ah.mathworks.com>...

> I retract the claim I made earlier that with this algorithm the various possible combinations will occur with equal probabilities. A large sampling showed that they are not actually equally likely, some types of combinations inherently occurring with greater frequency than others.

Can you elaborate Roger? It can't feel how the non-equal probability would be possible.

Bruno

Subject: k (3) Vectors of ones and zeros "combining" together

From: Roger Stafford

Date: 18 Jan, 2012 09:07:08

Message: 8 of 23

"Bruno Luong" <b.luong@fogale.findmycountry> wrote in message <jf5rpc$ah4$1@newscl01ah.mathworks.com>...
> "Roger Stafford" wrote in message <jf5re5$9lp$1@newscl01ah.mathworks.com>...
>
> > I retract the claim I made earlier that with this algorithm the various possible combinations will occur with equal probabilities. A large sampling showed that they are not actually equally likely, some types of combinations inherently occurring with greater frequency than others.
>
> Can you elaborate Roger? It can't feel how the non-equal probability would be possible.
>
> Bruno
- - - - - - - - - -
  Like you Bruno, initially my intuition was for equal probabilities. However, after recklessly posting the claim that this is so, I wasn't able to think up a rigorous proof of it and gradually began to doubt its validity. Finally as an act of desperation I ran some empirical tests with millions of samples and found out that, sure enough, the claim simply isn't true. Blush!

  For example, for N = 5, there are two ways of partitioning the five ones into three non-empty subsets, 1 + 1 + 3 and 1 + 2 + 2. There are 60 of the first kind and 90 of the second kind. The probability of each one of the 60 seems to be 1/180 and of the second kind 1/135, adding up to 60/180+90/135=1, (or if it isn't these, they are close.) For larger values of N there are more partition types and thus more distinct probability levels, as is quite evident by their distribution curves.

  I am sure that the theoretical justification of this is there to be found but I am too sleepy to work on it any further tonight.

Roger Stafford

Subject: k (3) Vectors of ones and zeros "combining" together

From: Roger Stafford

Date: 18 Jan, 2012 12:29:08

Message: 9 of 23

"Bruno Luong" <b.luong@fogale.findmycountry> wrote in message <jf5rpc$ah4$1@newscl01ah.mathworks.com>...
> Can you elaborate Roger? It can't feel how the non-equal probability would be possible.
- - - - - - - - - -
  Hi Bruno. Now I can say with certainty that the probabilities are _not_ equal. In the case of N = 5, with the algorithm I used, randperm will select each of the 120 permutations with equal probability. However there are only 60 permutations involving the first 3 elements. A permutation (swap) between the last 2 elements does not affect the statistics of the ceil(3*rand(1,N-3)) action, so it would be equivalent to assume that we are using only the first three elements of randperm and taking the remaining 2 in ascending order. For each of these 60 permutations then, there are 9 possible outcomes of ceil(3*rand(1,N-3)), of which 3 choose the same value and 6 choose different values. That means of a total of 60*9 equally likely events, 60*3 will be distributed equally over 60 known partitions of the 1+1+3 type and 60*6 will be distributed equally over 90 known partitions of the 1+2+2
type. This gives a 1/180 probability for each of the first type and 1/135 for each of the second type. The sampling runs I made earlier did in fact indicate the correct probabilities. QED

  For larger N this computation becomes increasingly complicated, so I'll quit now while I'm ahead.

  All of this adds up to the conclusion that this algorithm does not quite achieve the perfection I had in mind for it when I devised it. ;-(

Roger Stafford

Subject: k (3) Vectors of ones and zeros "combining" together

From: Bruno Luong

Date: 18 Jan, 2012 13:02:08

Message: 10 of 23

"Roger Stafford" wrote in message <jf6duk$2gh$1@newscl01ah.mathworks.com>...
> "Bruno Luong" <b.luong@fogale.findmycountry> wrote in message <jf5rpc$ah4$1@newscl01ah.mathworks.com>...
> > Can you elaborate Roger? It can't feel how the non-equal probability would be possible.
> - - - - - - - - - -
> Hi Bruno. Now I can say with certainty that the probabilities are _not_ equal. In the case of N = 5, with the algorithm I used, randperm will select each of the 120 permutations with equal probability. However there are only 60 permutations involving the first 3 elements. A permutation (swap) between the last 2 elements does not affect the statistics of the ceil(3*rand(1,N-3)) action, so it would be equivalent to assume that we are using only the first three elements of randperm and taking the remaining 2 in ascending order. For each of these 60 permutations then, there are 9 possible outcomes of ceil(3*rand(1,N-3)), of which 3 choose the same value and 6 choose different values. That means of a total of 60*9 equally likely events, 60*3 will be distributed equally over 60 known partitions of the 1+1+3 type and 60*6 will be distributed equally over 90 known partitions of the
1+2+2
> type. This gives a 1/180 probability for each of the first type and 1/135 for each of the second type. The sampling runs I made earlier did in fact indicate the correct probabilities. QED

I see. Thank Roger for showing the drawback.

Perhaps that analysis shows the way to fix the original algorithm.

Bruno

Subject: k (3) Vectors of ones and zeros "combining" together

From: Matt J

Date: 18 Jan, 2012 15:05:09

Message: 11 of 23

"Apostol" wrote in message <jf5309$a8$1@newscl01ah.mathworks.com>...
>
> Hi Matt, thanks for you answer!
>
> but what if I need DIFFERENT combinations? I know for large values of N would be really long/difficult but I would need something that each time generates different combinations so I can evaluate different clusters.
===============


Here's my next proposal. I've taken trouble neither to optimize it nor to verify that the outcomes are uniformly distributed, but my instinct tells me it should be.


N=10;
A=zeros(3,N);


%%Gaurantee at least one 1 in each row via sampling w/o replacement
Nset=1:N;
for ii=N:-1:N-2
 Nset(randi(ii,1))=[];
end

jj=setdiff(1:N,Nset);

for ii=1:3
  A(ii,jj(ii))=1;
end

%%Now fill zero-columns arbitrarily


idx=sub2ind([3,N],randi(3,[1,N-3]),Nset);
A(idx)=1;

Subject: k (3) Vectors of ones and zeros "combining" together

From: Matt J

Date: 18 Jan, 2012 18:07:11

Message: 12 of 23

"Matt J" wrote in message <jf6n35$3kn$1@newscl01ah.mathworks.com>...
> "Apostol" wrote in message <jf5309$a8$1@newscl01ah.mathworks.com>...
> >
> > Hi Matt, thanks for you answer!
> >
> > but what if I need DIFFERENT combinations? I know for large values of N would be really long/difficult but I would need something that each time generates different combinations so I can evaluate different clusters.
> ===============
>
>
> Here's my next proposal. I've taken trouble neither to optimize it nor to verify that the outcomes are uniformly distributed, but my instinct tells me it should be.
>
>
=================

Forget it. I now realize that my approach is equivalent to Roger's...

Subject: k (3) Vectors of ones and zeros "combining" together

From: Apostol

Date: 18 Jan, 2012 23:36:10

Message: 13 of 23

"Matt J" wrote in message <jf71oe$bbi$1@newscl01ah.mathworks.com>...
> "Matt J" wrote in message <jf6n35$3kn$1@newscl01ah.mathworks.com>...
> > "Apostol" wrote in message <jf5309$a8$1@newscl01ah.mathworks.com>...
> > >
> > > Hi Matt, thanks for you answer!
> > >
> > > but what if I need DIFFERENT combinations? I know for large values of N would be really long/difficult but I would need something that each time generates different combinations so I can evaluate different clusters.
> > ===============
> >
> >
> > Here's my next proposal. I've taken trouble neither to optimize it nor to verify that the outcomes are uniformly distributed, but my instinct tells me it should be.
> >
> >
> =================
>
> Forget it. I now realize that my approach is equivalent to Roger's...


Thank you very much guys! For small values of N this shouldn't be a problem, given that for my specific problem some groups have the same "value" of others... if it'll get tougher with a higher N perhaps I should consider other options.
Thank You Again! :)

Subject: k (3) Vectors of ones and zeros "combining" together

From: Matt J

Date: 19 Jan, 2012 18:28:10

Message: 14 of 23

"Apostol" wrote in message <jf7l1a$h8i$1@newscl01ah.mathworks.com>...
>
> > Forget it. I now realize that my approach is equivalent to Roger's...
>
>
> Thank you very much guys! For small values of N this shouldn't be a problem, given that for my specific problem some groups have the same "value" of others... if it'll get tougher with a higher N perhaps I should consider other options.
===================

Below is a method which I really do believe should give uniformly distributed results, but for some reason it is giving me some weird outliers in my histogram plots for N=5 and 1e6 samples. Possibly 1e6 is still insufficient...?


sets=nchoosek(1:N-1,2);
M=size(sets,1);
thisSet=sets(randi(M),:);
len1=thisSet(1);
len2=thisSet(2)-thisSet(1);
len3=N-(len1+len2);

A=[repmat([1;0;0],1,len1),...
   repmat([0;1;0],1,len2),...
   repmat([0;0;1],1,len3)];

A=A(:,randperm(N));

Subject: k (3) Vectors of ones and zeros "combining" together

From: Roger Stafford

Date: 22 Jan, 2012 19:36:24

Message: 15 of 23

"Matt J" wrote in message <jf9nbq$b9h$1@newscl01ah.mathworks.com>...
> Below is a method which I really do believe should give uniformly distributed results, but for some reason it is giving me some weird outliers in my histogram plots for N=5 and 1e6 samples. Possibly 1e6 is still insufficient...?
>
> sets=nchoosek(1:N-1,2);
> M=size(sets,1);
> thisSet=sets(randi(M),:);
> len1=thisSet(1);
> len2=thisSet(2)-thisSet(1);
> len3=N-(len1+len2);
> A=[repmat([1;0;0],1,len1),...
> repmat([0;1;0],1,len2),...
> repmat([0;0;1],1,len3)];
> A=A(:,randperm(N));
- - - - - - - - - -
  Matt, if I understand your code correctly, for N = 5 the 'randi' instruction gives the "1+1+3" types equal probability with the "1+2+2" types (that is, three of each) before permutations with 'randperm' occur. Out of 243 possible arrangements of A with one 1 in each column there are only 150 of them that also have at least one 1 in each row. Unfortunately, of these there are only of the 1+1+3 type as against 90 of the 1+2+2 type after permutations are done. Hence the latter are individually less probable than the former. You would need to have somehow made the initial probability assignments for these two types before permutation to be in the ratio of 2 to 3 for it to come out right.

  I think I can see how it might be done in the general case, but it would require a more extensive calculation.

Roger Stafford

Subject: k (3) Vectors of ones and zeros "combining" together

From: Roger Stafford

Date: 22 Jan, 2012 20:45:10

Message: 16 of 23

"Roger Stafford" wrote in message <jfhofo$l7e$1@newscl01ah.mathworks.com>...
> Matt, if I understand your code correctly, for N = 5 the 'randi' instruction gives the "1+1+3" types equal probability with the "1+2+2" types (that is, three of each) before permutations with 'randperm' occur. Out of 243 possible arrangements of A with one 1 in each column there are only 150 of them that also have at least one 1 in each row. Unfortunately, of these there are only of the 1+1+3 type as against 90 of the 1+2+2 type after permutations are done. Hence the latter are individually less probable than the former. You would need to have somehow made the initial probability assignments for these two types before permutation to be in the ratio of 2 to 3 for it to come out right.
- - - - - - - -
  The number '60' got lost in my text. It should read: "Unfortunately, of these there are only 60 of the 1+1+3 type as against 90 of the 1+2+2 type after permutations are done."

Roger Stafford

Subject: k (3) Vectors of ones and zeros "combining" together

From: Matt J

Date: 22 Jan, 2012 21:10:10

Message: 17 of 23

"Roger Stafford" wrote in message <jfhsgm$3bj$1@newscl01ah.mathworks.com>...
>
> The number '60' got lost in my text. It should read: "Unfortunately, of these there are only 60 of the 1+1+3 type as against 90 of the 1+2+2 type after permutations are done."
=====================

I'm failing to see why that's true, Roger. As far as I can tell, every matrix of the 1+1+3 type must be a permutation of the columns of
A113=[1 0 0 0 0; 0 1 0 0 0 ; 0 0 1 1 1]
in order for the rows to sum to 1. Hence there ought to be N! of them

Similarly, every matrix of the 1+2+2 type must be a permutation of the columns of
A122=[1 0 0 0 0; 0 1 1 0 0 ; 0 0 0 1 1]
There ought to be N! of them as well.

Subject: k (3) Vectors of ones and zeros "combining" together

From: Matt J

Date: 22 Jan, 2012 21:44:09

Message: 18 of 23

"Matt J" wrote in message <jfhtvi$7i7$1@newscl01ah.mathworks.com>...
> "Roger Stafford" wrote in message <jfhsgm$3bj$1@newscl01ah.mathworks.com>...
> >
> > The number '60' got lost in my text. It should read: "Unfortunately, of these there are only 60 of the 1+1+3 type as against 90 of the 1+2+2 type after permutations are done."
> =====================
>
> I'm failing to see why that's true, Roger. As far as I can tell, every matrix of the 1+1+3 type must be a permutation of the columns of
> A113=[1 0 0 0 0; 0 1 0 0 0 ; 0 0 1 1 1]
> in order for the rows to sum to 1. Hence there ought to be N! of them
=============

OK, I think I see now why that's not true. Reordering the final 3 columns produces no new distinct matrices. Hmmm....

Subject: k (3) Vectors of ones and zeros "combining" together

From: Matt J

Date: 22 Jan, 2012 23:37:09

Message: 19 of 23

"Roger Stafford" wrote in message <jfhofo$l7e$1@newscl01ah.mathworks.com>...
>
> I think I can see how it might be done in the general case, but it would require a more extensive calculation.
==========

I think this next version does the job, finally. I'm seeing very uniform looking histograms for 1e7 samples.


%This next bit should be computed only once per N

    sets=nchoosek(1:N-1,2);
    Lengths=[sets(:,1), sets(:,2)-sets(:,1), N-sets(:,2)];
    redundancies=1./prod(factorial(Lengths),2);
    cdf=cumsum(redundancies);
      cdf=[0;cdf].'/cdf(end);
 

%Engine

n=find(histc(rand,cdf),1);


len1=Lengths(n,1);
len2=Lengths(n,2);
len3=Lengths(n,3);

A=[repmat([1;0;0],1,len1),...
   repmat([0;1;0],1,len2),...
   repmat([0;0;1],1,len3)];

A=A(:,randperm(N));

Subject: k (3) Vectors of ones and zeros "combining" together

From: Roger Stafford

Date: 23 Jan, 2012 05:30:11

Message: 20 of 23

"Matt J" wrote in message <jfi6j5$2dv$1@newscl01ah.mathworks.com>...
> sets=nchoosek(1:N-1,2);
> Lengths=[sets(:,1), sets(:,2)-sets(:,1), N-sets(:,2)];
> redundancies=1./prod(factorial(Lengths),2);
> cdf=cumsum(redundancies);
> cdf=[0;cdf].'/cdf(end);
> %Engine
> n=find(histc(rand,cdf),1);
> len1=Lengths(n,1);
> len2=Lengths(n,2);
> len3=Lengths(n,3);
> A=[repmat([1;0;0],1,len1),...
> repmat([0;1;0],1,len2),...
> repmat([0;0;1],1,len3)];
> A=A(:,randperm(N));
- - - - - - - - -
  That code looks good to me, Matt. I haven't tried making a histogram of it, but the theory looks perfectly valid. I congratulate you.

  I found it easiest to think of by temporarily allowing zero length counts. Each sequence of length counts a, b, c where a+b+c = N can occur in N!/(a!*b!*c!) ways among the 3^N, presumably equally probable, possibilities, and sum of these will be exactly 3^N. The probability of each possible sequence should therefore be set proportional to 1/(a!*b!*c!) to accomplish this, which is what you have done. Restricting things to non-zero length counts doesn't change that needed proportionality, (even though the above sum would now be a smaller 3^N-3*(2^N-1).)

  About the only problem you might encounter occurs with large N. In those cases it gets a little precarious depending on only one 'rand' call to distinguish between vastly differing probabilities. At the extreme of N = 45 some of the lowest probabilities would amount to only about the value of one bit in a 53-bit floating point number, so they would be highly inaccurate relative to their own size. (Of course happily they wouldn't occur very often.)

Roger Stafford

Subject: k (3) Vectors of ones and zeros "combining" together

From: Matt J

Date: 23 Jan, 2012 15:16:11

Message: 21 of 23

"Roger Stafford" wrote in message <jfir93$11i$1@newscl01ah.mathworks.com>...
>
> That code looks good to me, Matt. I haven't tried making a histogram of it, but the theory looks perfectly valid. I congratulate you.
>
> I found it easiest to think of by temporarily allowing zero length counts. Each sequence of length counts a, b, c where a+b+c = N can occur in N!/(a!*b!*c!) ways among the 3^N, presumably equally probable, possibilities, and sum of these will be exactly 3^N. The probability of each possible sequence should therefore be set proportional to 1/(a!*b!*c!) to accomplish this, which is what you have done. Restricting things to non-zero length counts doesn't change that needed proportionality, (even though the above sum would now be a smaller 3^N-3*(2^N-1).)
=================

Thanks, Roger, although I am a little disquieted that it takes 1e7 realizations to get a decently converged histogram. I would think that with only 150 different possible events, the number of required samples would be far less...


 
> About the only problem you might encounter occurs with large N. In those cases it gets a little precarious depending on only one 'rand' call to distinguish between vastly differing probabilities. At the extreme of N = 45 some of the lowest probabilities would amount to only about the value of one bit in a 53-bit floating point number, so they would be highly inaccurate relative to their own size. (Of course happily they wouldn't occur very often.)
=======

True. But since the OP didn't specify how large N can get, I'm inclined not to let it trouble me.

Subject: k (3) Vectors of ones and zeros "combining" together

From: Apostol

Date: 23 Jan, 2012 21:37:51

Message: 22 of 23

"Matt J" wrote in message <jfjtjr$fij$1@newscl01ah.mathworks.com>...

> True. But since the OP didn't specify how large N can get, I'm inclined not to let it trouble me.


Thank you very much again guys! For a limit-case problem I shouldn't exceed N=20, the code you provided is far better than anything I could have thought of. It's very frustrating that I have to run this many iterations to have a good coverage of all the different combinations, anyway as I said this is perfect for me!

Subject: k (3) Vectors of ones and zeros "combining" together

From: Roger Stafford

Date: 23 Jan, 2012 22:01:10

Message: 23 of 23

"Matt J" wrote in message <jfjtjr$fij$1@newscl01ah.mathworks.com>...
> Thanks, Roger, although I am a little disquieted that it takes 1e7 realizations to get a decently converged histogram. I would think that with only 150 different possible events, the number of required samples would be far less...
> ........
- - - - - - - - - - -
  It's not so surprising, Matt. For N = 5, assuming that all 150 events have equal probability p = 1/150, what you have with n samples is a simple binomial distribution with mean n*p and variance n*p*(1-p) on the count of one of these events. The right quantity to consider in its histogram is the ratio sqrt(variance)/mean which is equal to sqrt((1-p)/(n*p)). As n increases, it drops down only as the reciprocal of the square root of n and the small size of p makes it worse. For larger N the number of events increases and therefore the number of samples needed gets even larger. Also one needs to take into consideration on a histogram that with larger numbers of events, there is an increased opportunity for outliers to appear even for a given value of the above ratio.

Roger Stafford

Tags for this Thread

No tags are associated with this thread.

What are tags?

A tag is like a keyword or category label associated with each thread. Tags make it easier for you to find threads of interest.

Anyone can tag a thread. Tags are public and visible to everyone.

Contact us