Path: news.mathworks.com!newsfeed-00.mathworks.com!solaris.cc.vt.edu!news.vt.edu!news.glorb.com!postnews.google.com!p25g2000hsf.googlegroups.com!not-for-mail
From: Greg Heath <heath@alumni.brown.edu>
Newsgroups: comp.soft-sys.matlab
Subject: Re: eliminate identical raws
Date: Sat, 16 Feb 2008 08:29:28 -0800 (PST)
Organization: http://groups.google.com
Lines: 81
Message-ID: <132e35ff-78fb-4bd8-9f3f-95d17b45db6a@p25g2000hsf.googlegroups.com>
References: <27823314.1203129850043.JavaMail.jakarta@nitrogen.mathforum.org> 
NNTP-Posting-Host: 68.39.14.248
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-Trace: posting.google.com 1203179368 28337 127.0.0.1 (16 Feb 2008 16:29:28 GMT)
X-Complaints-To: groups-abuse@google.com
NNTP-Posting-Date: Sat, 16 Feb 2008 16:29:28 +0000 (UTC)
Complaints-To: groups-abuse@google.com
Injection-Info: p25g2000hsf.googlegroups.com; posting-host=68.39.14.248; 
User-Agent: G2/1.0
X-HTTP-UserAgent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; AT&T 
Xref: news.mathworks.com comp.soft-sys.matlab:451829


On Feb 16, 10:55=A0am, "Roger Stafford"
<ellieandrogerxy...@mindspring.com.invalid> wrote:
> vladimir <rus_lo...@comcast.net> wrote in message
>
> <27823314.1203129850043.JavaMail.jaka...@nitrogen.mathforum.org>...> I hav=
e a probability problem that requires matrix A=3D[1 1 1 2 2] going
>
> through perms(A) which gives me many identical rows. How do I eliminate al=
l
> the identical rowa but one? Answer probably is the easy one, but not for a=

> beginner like myself. Thanks much.> Vladimir T
>
> ---------
> =A0 If you have p ones and q twos, the number of different rows would be
>
> =A0nCp =3D (p+q)!/(p!*q!)
>
> where n =3D p+q. =A0In your example, this would be 5C3 =3D 10 out of a tot=
al of 5!
> =3D 120 produced by 'perms'. =A0With such small numbers, using 'perms' fol=
lowed
> by 'unique' to eliminate repetitions, as was suggested by John, is a good
> method.
>
> =A0 However, for larger values of p and q, this method runs into problems.=
 =A0For
> example, if p and q were both equal to 7, then 'perms' would generate an
> array of some 14! =3D 87 billion rows which would then have to be narrowed=

> down to a much smaller 14C7 =3D 3432 different rows, a rather inefficient
> process even if one had enough memory.
>
> =A0 This would suggest that one might consider using matlab's 'nchoosek'
> function which creates all possible combinations of p things out of a set =
of n
> things. =A0That is, it directly creates the magic number of nCp different
> combinations with nothing that need be eliminated. =A0One way to use
> 'nchoosek' for this purpose might be the following:
>
> =A0n =3D p+q;
> =A0C =3D nchoosek(1:n,p);
> =A0N =3D size(C,1); % (N =3D nCp)
> =A0B =3D repmat(2,N,n); % Start with all twos
> =A0for ix =3D 1:N
> =A0 B(ix,C(ix,:)) =3D 1; % Insert p ones into each row
> =A0end
>
> Array B would then have the desired result. =A0(Actually this for-loop cou=
ld be
> eliminated using 'sub2ind' appropriately.)
>
> =A0 A note of caution. =A0Unfortunately I don't have 'nchoosek' to experim=
ent with
> on my system, and there is one aspect of it that is worrisome. =A0In their=

> documentation MathWorks gives the same "practical" limit on the use of
> 'nchoosek' as it does for 'perms', namely that n ought not to exceed 15. =
=A0One
> would think that 'nchoosek' ought to be able to deal with considerably lar=
ger
> values of n than this without encountering the difficulties of 'perms'. =
=A0This
> would appear to suggest that perhaps 'nchoosek' also does something crude
> like a 'perms' operation followed by a call to 'unique', in which case the=
re
> would be no particular advantage to using 'nchoosek' as given above. =A0A =
huge
> array would be created in either case. =A0Can anyone in this group tell me=
 if that
> is true?
>
> Roger Stafford

Neither PERMS or UNIQUE are used in NCHOOSEK. However,
NCHOOSEK calls internal function COMBS which is somehow
defined with a recursive call to itself.

Hope this helps.

Greg