Path: news.mathworks.com!not-for-mail
From: "Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid>
Newsgroups: comp.soft-sys.matlab
Subject: Re: eliminate identical raws
Date: Sun, 17 Feb 2008 07:16:03 +0000 (UTC)
Organization: The MathWorks, Inc.
Lines: 29
Message-ID: <fp8mvj$n1g$1@fred.mathworks.com>
References: <27823314.1203129850043.JavaMail.jakarta@nitrogen.mathforum.org> <fp7991$656$1@fred.mathworks.com> <fp88eh$7l1$1@fred.mathworks.com> <fp8c8e$ami$1@fred.mathworks.com>
Reply-To: "Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid>
NNTP-Posting-Host: webapp-02-blr.mathworks.com
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 8bit
X-Trace: fred.mathworks.com 1203232563 23600 172.30.248.37 (17 Feb 2008 07:16:03 GMT)
X-Complaints-To: news@mathworks.com
NNTP-Posting-Date: Sun, 17 Feb 2008 07:16:03 +0000 (UTC)
X-Newsreader: MATLAB Central Newsreader 1187260
Xref: news.mathworks.com comp.soft-sys.matlab:451918


"us " <us@neurol.unizh.ch> wrote in message <fp8c8e$ami
$1@fred.mathworks.com>...
\> YES... i KNEW one of the very CSSM seniors( :-) ) would ..........

  There is another aspect of the above (corrected) solution #1 that bothers 
me, Urs, though I admittedly don't have a better alternative ready to put 
forth.  With some of the larger numbers of elements in array 'a', one comes 
up with some discouragingly large arrays for 'x' that have to be greatly 
reduced in size.  Suppose we have a = [1 2 3 4 5 6 7 7 7 7 7 7].  Then the 
number of rows in x before it is reduced would be 7^12, which is a hefty 
1.38e10 rows, whereas the number of valid combinations is only 12!/6!, 
which is a very much smaller 6.65e5 rows.  This is a very severe rejection 
process with only one survivor in every twenty thousand.  In this case, even 
'perms' gives rise to a smaller 12!, which would be only 4.8e8 rows.

  My intuition tells me that for a robust, general solution to this problem 
suitable for all cases for which there is a reasonable number of rows in the 
final solution, it would be best to use an algorithm which does not require 
that any combinations be rejected later on, even if much despised for-loops 
have to be used.  I envision it as a generalization and adaptation of the 
algorithm which is apparently used in the 'nchoosek' function.  John D'Errico 
has written 'loopchoose', which if I understand it correctly, creates only valid 
combinations successively each time it takes a step, with none needing to be 
rejected.  The idea would be to generalize such methods to any number of 
distinct values in the array rather than just the two which, in effect, 
'nchoosek' and 'loopchoose' deal with.

Roger Stafford