How can i use nchoosek to output both the k combinations and the remaining combinations?

JT (view profile)

on 29 Mar 2016
Latest activity Commented on by Paul Smits

Paul Smits (view profile)

on 29 Apr 2019
I would like to run the nchoosek function and return all the k combinations as a matrix (which I can do already) but also the relavent remaining values in a separate matrix. A splitting process of all the values based on the k value.
For example if I nchoosek (where n=10 and k=4) it would output all combinations of 4 values in one matrix and all the remaining 6 values in another matrix.
Is this possible?

Jos (10584)

Jos (10584) (view profile)

on 28 Apr 2019
yes, rather easy: use nchoosek(1:n, n-k) and flip the output

Products

Answer by Paul Smits

Paul Smits (view profile)

on 26 Apr 2019

Fastest solution so far is my rahter clumsy attempt at using logical indexing to solve the problem:
m = 10;
n = 4;
b = nchoosek(m,n);
ii = true(m,b);
ii(nchoosek(1:m,n)'+(0:m:m*b-1))=false;
xr = repmat([1:m]',1,b);
v = reshape(xr(~ii),n,b)';
w = reshape(xr(ii),m-n,b)';
Readability can be improved upon. But definitely fast.
Note that this method can also take input other than 1:m:
x = [8 8 8 6 1 2 3 4 10 4]; %other input
n = 4;
m = size(x,2);
b = nchoosek(m,n);
ii = true(m,b);
ii(nchoosek(1:m,n)'+(0:m:m*b-1))=false;
xr = repmat(x',1,b);
v = reshape(xr(~ii),n,b)';
w = reshape(xr(ii),m-n,b)';
Sidenote: A possibility for nchoosek to output remaining values directly would be quite a nice feature.

Jan

Jan (view profile)

on 28 Apr 2019
I've posted a method, which avoid the 3 transpositions. I've tried to test it in Matlab Online without success for 15 minutes: Either the editing does not work, pressing the backspace key deletes the window, hitting Ctrl-Return executes one line only and not the complete code, etc. I gave up. Matlab Online is not working reliably. What a pity.

Answer by Jos (10584)

Jos (10584) (view profile)

on 28 Apr 2019
Edited by Jos (10584)

Jos (10584) (view profile)

on 28 Apr 2019

The remaining values can simply be obtained using nchoosek(1:n, n-k), you just have to flip the order of the output :-)
n = 7
k = 2
A = nchoosek(1:n, k)
remainingValues = flipud(nchoosek(1:n, n-k))
% a check that the combination contains all numbers
tf = sort([A remainingValues],2) == 1:n ;
all(tf(:)) % TRUE!

Jan

Jan (view profile)

on 29 Apr 2019
+1. Nice and neat. The order of outputs of nchoosek is not documented, so it is worth to include this test in the unit-tests of the function, such that it can be checked for each Matlab release the code is running on.
Jos (10584)

Jos (10584) (view profile)

on 29 Apr 2019
Thanks Jan :-)
You're right that the order is not documentd, but yet I think that for any decent algorithm the order of the output should not depend on K, so that nchoosek(1:N, K) and nchoosek(1:N, N-K) have to be symmetrical, in the sense that, for instance, [1 2 ... K] and [1 2 ... N-K] occupy the same row in both outputs (in my version row 1).
Paul Smits

on 29 Apr 2019
Excellent!

Jan (view profile)

on 26 Apr 2019

% The loop version:
m = 10;
n = 4;
v = nchoosek(1:m, n);
nv = size(v, 1);
w = zeros(nv, m - n);
for k = 1:nv
t = 1:m;
t(v(k, :)) = [];
w(k, :) = t;
end
There must be a non-loop version also, but it will need larger temporary arrays.

Paul Smits

Paul Smits (view profile)

on 26 Apr 2019
Uses v values directly as an indices in t(v(k, :)). Only works for input 1:m.
Jan

Jan (view profile)

on 28 Apr 2019
@Paul Smits: Yes, but it is trivial to expand this:
data = rand(1, 10);
... % my code
vdata = data(v);
wdata = data(w);

Answer by Paul Smits

Paul Smits (view profile)

on 26 Apr 2019

I tried a way without loops, using perms instead of nchoosek:
m = 10;
n = 4;
per = perms(1:m);
[v,iper,~]= unique(sort(per(:,1:n),2),'rows');
w = sort(per(iper,n+1:end),2);
Albeit shorter, this is significantly slower than Jan's solution. Gets real slow for m of 9 or higher.

Jan

Jan (view profile)

on 29 Apr 2019
Of course creating all permutations at first requires a lot of RAM.