Generates restricted and unrestricted integer compositions
This is a Matlab implementation of a unique algorithm by J. D. Opdyke with very good properties for solving the Integer Composition problem of finding all permutations in the additive partitioning of integers. It also accepts lower and upper bounds for the number of sum terms as well as lower and upper parts for the sum term values. Quite useful for applications in combinatorics. The program is a direct implementation of the pseudo-code contained in the official publication in
J. Math. Modeling and Algorithms (2009) V9 N1, p.53 - 97
It is not necessarily matlab-optimized and it was not tested for very large numbers but seems to work pretty fast for integers up to 100.
John E. (view profile)
Tried this out and is pretty slow. Check this out: jdoric(1000,1000,1000,1,1000) - runs forever (didn't even terminate on my PC) although there is just a single integer composition, namely 1000 = 1+1+...+1. Runtime increases exponentially in the number of parts.
Tim Johnson (view profile)
Tried this-worked out perfectly. Was fast.
iuvaris (view profile)
Well, it's still my firm view that this algorithm (not the implementation) is awfully slow and no bit more general than other algorithms for restricted integer compositions.
For example, when I set n=16; a=1; b=4; and run
tic; jdoric(n,1,16,a,b); toc
this yields, on my computer, a run time of 21 seconds. Contrarily, when I run
tic; for k=1:16; colex(n,k,a,b); toc
this runs in 2.5 seconds. (Colex is the Vajnovszki/Vernay algorithm that is also available as a matlab implementation).
When I set n=18, run times become 289 seconds and 25.6 seconds, respectively.
So, why on earth would anyone want to use this slow algorithm?
Other issues:
1) The good rating of this algorithm here on mathworks appears to be manipulated.
2) You may wish to google "Corrections" to the Opdyke algorithm "A unified approach ..."
iuvaris (view profile)
Alright, I agree, John D'Errico, that the argumentation should remain factual.
It's just really difficult to argue with Mr. Opdyke because he apparently doesn't understand some basic principles of the things he has published about. Writing problematic papers is one thing, defending errors beyond reason is another. Whatever his contributions in his paper have been, they have neither been the provision of solutions to open problems nor the provision of a general algorithm for solving the restricted integer composition/partition problem. I think that the "success" of his paper is due to self-promotion - and recipients of his work should be made aware of his paper's and algorithm's deficiencies.
In the interest of his readers, I have summarized my points of critique in a pdf file.
http://www.adiuvaris.org/correction.pdf
John D'Errico (view profile)
PLEASE! Intelligent comments from all parties are valid and invited. But STOP the venomous comments, character assassination, insults, and what has been verging on libel.
ACT LIKE ADULTS, OR PROVE YOURSELVES TO BE SMALL CHILDREN.
iuvaris (view profile)
Lastly, JD Opdyke, why do you never mention hirosh tagoya, Mark Headstrom, Nam Quire, Jasper Baxxter?
It's seems your mathwork comments are just as suspicious as your journal publication.
Riley Jones (view profile)
Now I see what Hirosh and JD were laughing about. But I’ll take a quick stab at it, because it really is quite simple:
1) An algo that is not O(1) is not necessarily “inefficient” – there is often a tradeoff, in the real world (as opposed to obscure academia) between the generality of an algo and its speed/time complexity.
2) the counting formulae were not “sold as new,” and there is no analogous result to Mansour’s for partitions; that’s why the FORMAT of the two counting formulae were presented, because in that specific format, a link between the two objects is readily apparent. And again, they are ancillary to the main point of the paper (but a nice way to try to obfuscate the main topics and contributions of the paper).
3) there were no “errors” in the algo, the paper, or the code
4) waving your hands and saying that anything that applies to compositions applies to partitions because the former just deals with orderings of the latter shows gross ignorance of the literature on algos that generate each – typically they are very, very different, which is a major contribution of the paper: this is the first paper to use the SAME algo and the same mathematical construct to generate both, under the very general conditions of singly and doubly restricting them
5) your result re: a=0 is correct, but again this only applies to compositions, not also to partitions, as does the JD Opdyke algo
6) subjective comments like “silly” just show you’re grasping at straws
7) its amazing how quickly slanderous comments about who’s posting the blog participants’ comments disappear when people are confronted about them. Did you, iuvaris, in fact ask Tor R or Pierro or Dave or Oscar or Harry if they know each other? Hmmm? Didn’t think so.
iuvaris (view profile)
Well, it is really extremely simple. The JD Opdyke paper has the following major flaws:
1) The Odyke algorithm takes O(k) time per composition of n with k parts. Thus, the algorithm is inefficient.
2) The counting formulae are copied from previous work and sold as new. It is well-known that the number of compositions of n with k parts, each in a set A, satisfies the recursion:
c_A(n,k) = sum_{a in A} c_A(n-a,k-1)
(see Heubach and Mansour, Compositions of n with parts in a set A, Proof of Theorem 2.1)
Specializing to A={1,...,r} gives the formulas that the JD Opdyke paper derives as "novel".
3) All errors made concerning compositions translate to partitions (which are just ordered compositions)
4) An algorithm that runs in time O(k) per composition needs not solve the restricted integer composition problem in its full generality (arbitrary lower and upper bounds a and b) since the general case can be reduced to a=0.
5) To worsen things for the Opdyke paper, the algorithm solves the problem with upper and lower bounds on the NUMBER OF PARTS in a quite silly way by simply invoking the inefficient Opdyke algorithm for each part.
6) This said, the Vajnovszki algorithm is trivially as general as the Opdyke algorithm is and runs incredibly much faster as k increases since it is not inefficient.
hirosh tagoya (view profile)
First, the J.D. Opdyke algo covers i) either partitions or compositions; ii) upper AND lower bounds on the number of parts; iii) upper AND lower bounds on the size of the parts; and iv) ANY combination of i), ii), and iii). As stated in the paper, “Given its generality, the algo is reasonably fast.” The paper does not claim the algo is a grey code algo, or that its complexity is O(1), only that it is reasonably fast given its generality. And this is true. Of course much more specific algos can be made faster, but they lose generality (and you also have to re-code … JD Opdyke provides working code in the paper … and this website provides MATLAB code … so is waiting 10 seconds to use ready-made code really going to kill you when you’d have to code and debug the others??). AND speed is not all here: this is the first algo to make the ALGORITHMIC LINK between the two objects (partitions and compositions) under either singly OR doubly constrained conditions. This alone is publication-worthy (but was lost on critics who did not read the paper thoroughly/carefully, if at all).
Secondly, the algo is developed and written in the paper; this website simply coded it (the version for compositions only) in MATLAB. SAS Code that implements it, which has been made idiot-proof so that incorrect inputs (e.g. a part value=zero) cannot be entered, is included in the paper, and it works flawlessly. The paper states in over half a dozen places that part value=zero is not allowed by the algo. What a bizarre thing to assume and try to make an issue of…
Thirdly, the main point of the paper is the algorithm: the counting formulae are ancillary, and they PERHAPS can be derived from previous results (although nobody has demonstrated this), but that’s irrelevant because their FORM is original, and because of this specific form, this is the first time the LINK between counting the two objects – partitions and compositions – has been demonstrated under the double constraints of upper AND lower bounds on BOTH the number of parts AND the size of the parts. So in this sense they are indisputably original, but this isn’t even the main point of the paper. Folks with too much time on their hands (and obscure papers that nobody ever reads) have tried to latch on to and pull any thread they could make up to cause obfuscation, and this is one such irrelevant thread.
Lastly, why doesn’t “iuvaris” ask Tor R or Pierro or Dave or Oscar or Harry if they know each other? I know the author, and he is not any of them. The tone of the thread became so purile and off-point that he said he’d only respond to substantive comments. I told him I’d seen this doing a google search, and he laughed, thanked me for using the algo and any (substantive) feedback I had, and told me not to waste my time responding to the blog. But the algo’s been so useful in my work (and that of others: over 7k downloads and half a dozen citations in peer reviewed journals and presentations that I’ve seen) that I had to, as a third party, expose the pettiness (and incorrectness) of the ‘critics’ in the thread to date. There are additional points to make, but not worth the time. The critics are ill-informed, have an agenda and too much time on their hands, and clearly did not read the paper carefully.
iuvaris (view profile)
Again, you may want to look at
http://www.mathworks.com/matlabcentral/fileexchange/44186-restricted-integer-compositions-with-fixed-number-of-parts
for a potentially much faster algorithm based on a different paper.
iuvaris (view profile)
First, thanks to the author for providing this implementation.
However, both the implementation and the algorithm/paper it is based upon are problematic, from my point of view:
(1) Line 41 of the algorithm reads as
if a \le ntmp && ntmp \le b && cell(level+1)
which, if you compare it with the pseudocode in the Opdyke paper, is not correct (it should be cell(level+1)\le b). This is not the only mistake in the implementation.
(2) In the description, it says that the implementation "[...] seems to work pretty fast for integers up to 100". Now, there are \binom{n-1}{k-1} integer compositions of an integer n with k parts and e.g. \binom{100-1}{50-1} is 5.0446e+028. I would be really suprised if this algorithm were "pretty fast" at enumerating all these compositions.
Concerning the Opdyke paper. From my perspective, this paper is highly suspicious and problematic:
(a) The claims concerning the novel formulas that the paper "discovers" are factually wrong. This paper discovers no new formulas. It copies them from the work it cites.
(b) The algorithm presented, to me, is pretty clumsy and it is not clear, to me, what this algorithm is actually doing and whether it is efficient at all (this may be a subjective point of view - I leave it for others to object to this). I think that it might be computing the same things over and over again, which makes it pretty slow, to my opinion. (The paper says that it "almost [never]" does this, but what does this mean?)
(c) To actually test the performance of this algorithm, I implemented the very short and simple algorithm by Vajnovszki, Generating permutations with a given major index, http://arxiv.org/abs/1302.6558, see http://www.mathworks.com/matlabcentral/fileexchange/44186-restricted-integer-compositions-with-fixed-number-of-parts. Both implementations are in matlab, so it is easy to compare. Suppressing the output in both implementations, running times (on an octave platform) for n=20, a=1, b=20, for different values of k (k=mink=maxk in the Opdyke algorithm) are as given below (top is Opdyke, bottom is Vajnovszki). As you can see, except for low k, the Opdyke algorithm is much slower than the Vajnovszki algorithm. E.g., for k=17, running time of the Opdyke algorithm is 22.3 sec vs. 0.079 sec for the Vajnovszki algorithm, whence the Opdyke algorithm is slower by a factor of 282!!
k = 6
Elapsed time is 0.81 seconds.
Elapsed time is 1.3 seconds.
k = 7
Elapsed time is 2.6 seconds.
Elapsed time is 2.8 seconds.
k = 8
Elapsed time is 7.1 seconds.
Elapsed time is 4.9 seconds.
k = 9
Elapsed time is 16 seconds.
Elapsed time is 7.4 seconds.
k = 10
Elapsed time is 27.9 seconds.
Elapsed time is 8.45 seconds.
k = 11
Elapsed time is 41.4 seconds.
Elapsed time is 8.46 seconds.
k = 12
Elapsed time is 52.6 seconds.
Elapsed time is 7.27 seconds.
k = 13
Elapsed time is 57.8 seconds.
Elapsed time is 3.95 seconds.
k = 14
Elapsed time is 55.6 seconds.
Elapsed time is 2.06 seconds.
k = 15
Elapsed time is 46.3 seconds.
Elapsed time is 0.833 seconds.
k = 16
Elapsed time is 33.3 seconds.
Elapsed time is 0.215 seconds.
k = 17
Elapsed time is 22.3 seconds.
Elapsed time is 0.079 seconds.
k = 18
Elapsed time is 13.5 seconds.
Elapsed time is 0.009 seconds.
k = 19
Elapsed time is 7.86 seconds.
Elapsed time is 0.001 seconds.
(d) Finally, it seems that the last couple of comments - those with pretty unusual language and which are so 'euphoric' - are written by the author of the Opdyke paper himself ...
(e) To conclude, I think that this implementation is an accurate copy of the pseudocode presented in the Opdyke paper, with a few mistakes. However, I think that it is highly doubtful whether the Opdyke algorithm is efficient at all. Given the experiments I have outlined, I would judge that the Opdyke algorithm is of no practical value and cannot compete with current state-of-the-art.
Nam Quire (view profile)
Jasper sent this email to me as well, we're all from the same research group. The JD Opdyke algorithm is cogent, clear, refreshingly straightforward, adequate and amazingly elaborate. I could not find a single flaw in this well-written, original, and staggeringly enlightening paper. The SAS code he provides is original, creative and demonstrates GURU-level expertise. Reading this wonderful paper has changed my view of mathematics.
Tor R (view profile)
The Raptis algorithm works perfectly for me, for valid inputs, which the user has to take care of. So, great work, 5 stars for the algorithm.
Concerning the Opdyke paper, regards the algorithms, 5 stars here as well.
The formulas appear problematic, however. The presumably "novel" formulas concerning integer compositions are seemingly taken from
Heubach, Silvia, Mansour, Toufik, Compositions of n with parts in a set
Thereom 2.1, which is reference 6 or so in the paper. So, there has been no "open problem" that the paper has solved. But, as Jasper Baxxter points out, we're Matlab USERS, not rocket scientists. So, I don't care for the counting formulas as long as the algo is correct. Overall, 5 stars from me.
Mark Headstrom (view profile)
Jasper sent this to me too, and I echo Dave's comments: this works perfectly. The JD Opdyke paper's refreshingly straightforward, clear, and easy to read, and the SAS code he provides is very advanced, efficient, right, and amazingly idiot-proof. Raptis' time was well spent.
Dave (view profile)
Jasper's exactly right - this works out perfectly. I used and tested extensively the JD Opdyke SAS Code provided in his paper, which works flawlessly. My two RA's also have used the MATLAB code and we've gotten identical answers. The paper is clear, cogent, and extremely useful and useable: kudos to Raptis for identifying and making even more widely available to both researchers and industry.
Jasper Baxxter (view profile)
This works perfectly for me -- the JD Opdyke paper says no parts = 0, and it works as advertised (and that's such an obscure, unused case anyway). So I don't know what all the kvetching is about below; as everyone has confirmed, the two who are sniping are simply wrong (the enumeration algorithm(s) are the point of the paper - the counting formulae are ancillary, but are also right). I've forwarded this to a couple of my colleagues for their review and confirmation. This original work already has been extremely useful to me.
Pierro (view profile)
Works all fine for me. No parts = 0 are allowed.
SP (view profile)
Hi, thanks Roloph, for supporting my view. Yes, I also think the formulas might be plagiarism but, after all, this whole paper is sometimes quite inaccurate and I'm sure the authors did not understand the difference between weak compositions and compositions (otherwise, why state results on weak compositions?) or even might not have noticed that their formulas are explicitly stated in referenced work (incredible as it seems) ...
In any case, I have notified the journal editor.
Rolph (view profile)
Well, interesting discussion :). I didn't check the code as well, unfortunately, but I think I need to defend Steffen regarding the Opdyke paper. (Sry, this post is not on the matlab implementation as well)
First, I agree, this paper should not state results about weak compositions if it excludes them. Otherwise, it should tell the reader that these objects differ from what the paper is analysing. But, still, this appears as a minor flaw to me. It does not merit extended discussions, I feel :)
Much more severe is the issue about the "novel" combinatorial results the paper lists (as quoted by Steffen). Clearly, these results are elementary and can be found in any undergraduate textbook on combinatorics. One might argue that the author was just sloppy or is not an expert but he apparently takes these results directly from the papers he cites. This is clearly a form of plagiarism (unethical appropriation of another's work, to my understanding). This is a really severe issue, particularly in the US.
SP (view profile)
@J.D.
I'm sorry, but I need to make another statement, to again illustrate the deficiency of the Opdyke paper. On p. 13, it verbally says:
"Although the recursive nature of (1)-(4) makes these formulae less convenient than, say, a simple combinatoric equation or sum, they still provide closed form solutions to problems which had none before [...]"
Now, equation (1) is just the Fibonacci identity for integer compositions, which is well-known since ages. Kimberling has it; every elementary text-book has it. Eq. (2) is nothing. Eq. (3) is just the Vandermonde identity for restricted integer compositions; this is well-known for ages as well (besides being trivial): The Heubach and Mansour (2004) you quote paper has it on p. 2 without even giving a proof. And equation (4) is just the summation of (3) --- this is entirely trivial.
And you want to sell these results as new? And the journal editor and the referee did accept this? And you find my comment regarding the quality of that journal subjective? Excuse me, sir, something has gone wrong with parts of your publication ...
And your Kimberling citation: You cannot simply reference "a related result" using the same notation as you use. This is just bad style.
Finally, regarding your notation and your abuse of "explicitness": Compare your cumbersome notation with the elegant notation in Heubach and Mansour (2004) to see the differences. You would just have needed to state in a single place (usually called "Notation" or so): Henceforth, let always 1 \le a \le b \le n. [All other cases are trivial anyway]
SP (view profile)
Btw. the only things I was saying originally are.
1) This implementation is false. Period. It is. There are verifiable bugs in the code. They may not matter much and maybe they don't matter at all if a is greater than zero. But this implementation is not the same as the pseudocode given in that very precise and exact Opdyke paper.
This has only referred to the implementation (but J.D. has apparently something to say on this well although he never checked the implementation, as he says).
And what I'm saying now additionally is (well, I always had that opinion): The Opdyke paper is confused and confusing, particularly in its notation (we're through this ...) and also in some statements; e.g., where it says there is an open problem of finding a closed-form solution to the restricted integer composition problem and where it claims that it provides such a solution (where is this solution? And don't say it's formulas (1),(2),(3), and (4) --- they've been around for ages ...)
SP (view profile)
@J.D.: Discussions are very personal now. I just want to add the following.
(1) Your paper is not clear and it is not consistent. You apparently explicitly base your paper on Kimberling, refering to his notation and to his results in several instances. Throughout, you use the notation c(n,k,a,b) to refer to integer compositions of n with k parts, each between a and b. A very superficial check reveals that you use the notation c(n,k,0,b) at least twice, primarily, when referring to to theoretical results. You should not mention weak compositions, as you do, if you feel that such compositions are "so rarely used not to merit a mention". And you should also not mention them if you want to explicitly exclude them ...
You see, here is the sentence: "Following Kimberling's (2001) notation, if we restrict the values of those compositions by a minimum value of a and a maximum value of b for c(n,k,a,b), then with b=infinity [...] we have c(n,k,a,infinity)=c(n-ka,0,infinity)"
That settles the point, I feel ...
(2) I know research groups that like weak compositions. But maybe they do not merit a mention, as you say.
(3) You should be happy to hear that your algorithm works for a=0 as well. The reason is, it seems to me, that you base your work on Kimberling and he seems to like the a=0 case as well ;)
(4) If your algorithm works even if some of its aspects are not considered (as it is claimed to hold true in this implementation), then maybe you should recheck if your algorithm cannot be simplified or generalized.
(5) To me, it seems that you mix up "objectivity" with "applause".
J.D. (view profile)
Point of Clarification on comments below: All references to "partitions" are made equally to partitions and compositions. For example, the first and third sentences of the paper are: “A list of integers greater than zero that sum to the positive integer n is an integer partition of n. … When the ordering of the summands (“parts”) matters, these become the integer compositions of n…” The paper presents three algorithms: one for partitions, one for compositions, and one for both. Theophanes coded the one for compositions, but the comments below apply to all three.
J.D. (view profile)
@Steffen:
1) THE VERY FIRST SENTENCE of the Opdyke paper reads, “A list of integers greater than zero that sum to the positive integer n is an integer partition of n.” Not sure how much more clear I can be. This is consistent throughout the paper, both via part values implied by the values of the algo’s parameters, and explicit mention in the error conditions in the code in the Appendix.
2) Re: definitions in cited papers (i.e. Kimberling): no paper can (or should) list all the conditions/caveats of all the papers it cites. That would be impossible to do, and not desireable.
3) You’re weak
partitions are so rarely used as to (arguably) not merit a mention. But just in case, the code in the paper explicitly checks if the user enters any parameter value that would require partition parts with values other than an integer greater than zero, and rejects the run with an error message. The Matlab code does not appear to do this, but it also appears to run flawlessly under the conditions explicitly listed in the paper.
4) I’m not a Matlab user but have run the Matlab code contained herein and consistent with the other comments, it has worked flawlessly on all runs tried to date.
5) A more accurate and professional comment on the code and algorithm would have been: “Users please note: as mentioned explicitly in the paper (and the code contained in the paper), the algorithm does not include weak partitions, i.e. partitions with any parts with a value of zero.” This is a clear, cogent, true statement, without gratuitous, ad hominem attacks.
6) Please spell my name correctly (‘Opdyke’ not ‘Updyke’). Everyone else who wrote it out on the page managed to; taking such care is just professional courtesy.
@Oscar, Harry, Theophanes:
Thanks for your objective (and positive) comments about your use of the code and algorithm. It is a privilege to have my paper used widely and translated into multiple languages for multiple uses (to date, in published papers on pricing covered bonds; stock trading algos; other combinatorics papers; and presentations on complexity algorithms). It has been downloaded over 7,000 times and is one of the highest rated research papers on Scrib, so I’m very pleased it appears to have been so well received. It was a lot of fun to work on and write, and I hope it is of great use to you. My sincere thanks for your interest and professionalism.
SP (view profile)
@Oscar: Yes, some people make a distinction between weak compositions and compositions ...
But, the paper you cite, the Updyke algorithm, deals with restricted integer compositions, which usually restrict parts to lie within an arbitrary subset of the nonnegative integers or within an interval. For instance, read the background section of the cited Updyke paper. Or read the introduction of Kimberling, which is cited by Updyke: http://www.fq.math.ca/Scanned/39-5/kimberling.pdf.
(In particular, note how Updyke is unclear on which definition to use: the Introduction says something different than the background section and than Kimberling)
What's more problematic is that the Updyke algorithm works for any a>=0, but this implementation does not, because there are a few minor programming mistakes (but with major consequences): I listed them; in particular, in two lines a condition that has the form
if A \le B
in the Updyke paper reads as
if A
in the implementation, which is clearly wrong.
Defining compositions for nonnegative integers is standard, allowing negative numbers is not.
Óscar (view profile)
In my opinion, it should just work for those cases the original paper says. Furthermore, we should not forget compositions and partitions are usually defined for strictly positive integer numbers:
http://en.wikipedia.org/wiki/Composition_(number_theory)
http://en.wikipedia.org/wiki/Partition_(number_theory)
If you personally prefer any other definition (including zeroes, for example) they can be obtained from the one given by this function. See my first post.
Anyway, what Steffen is trying to compute are "weak compositions" and the goal of this function is to compute "compositions". It would be interesting to have a general implementation for "weak compositions" allowing also independent upper and lower bounds for each term.
It is true that this function should return all the compositions/partitions in a cell array (or array when all of them have the same length). Also, it should check the input arguments to avoid prohibited values.
An additional question: what happens if argument "a" takes negative values? Personally, I think it is just as weird as setting a=0.
SP (view profile)
The program should work for any a>=0 and a \le b but is faulty for a=0, unlike the pseudocode on which the program is based.
This has to do with the following wrong lines of code (a nicer code would make the program easier to debug ...), which errors can be found by comparing the given code with the original pseudocode:
if a \le ntmp && ntmp \le b && cell(level+1)
should be replaced by (this bug occurs twice, on l.41 and l.48)
if a \le ntmp && ntmp \le b && cell(level+1) \le b
After, l.50,
N2N(n,i,a,b,row-a-q2+1,col-1,level+1,tmp,cell);
should be a "break" statement.
if level>0 && row>1
l.57, should possibly be replaced by
if level>0 && row>=1
Making these corrections makes the program apparently work fine in my case, but this program should really be tested more thoroughly, to assure its quality.
Harry Commin (view profile)
This works correctly on every set of inputs I've given it so far (some thousands).
One slight issue is that the function prints compositions to the console, rather than returning them in an array.
A quick and ugly fix is to pass some variable "comps" into and out of the function N2N(). Then, replace the two disp() calls with "comps = [comps;cell]" and "comps = [comps;n]", respectively. I'm sure there's a much neater way of doing this, though.