nchoose2: save output in chunks?

phlie on 19 Sep 2016
Commented: phlie on 22 Sep 2016
Hi everyone, I have a cell array N(m,n) with mixed numeric and string data (the first row is a header). I would like to create all combinations without repetition of every row i with every other row j ≠ i. I am doing this with the user-written function nchoose2:
ind = nchoose2(1:size(N, 1)-1);
Unfortunately, my cell array is so large that generating ind causes an out-of-memory error. Can I save the output of nchoose2 (I wouldn't mind using nchoosek instead) in chunks? For example: save the first 50k rows of ind, process them, delete them, and then move on to the next 50k?
  2 Comments
José-Luis on 19 Sep 2016
Do you have any idea how large your total output would actually be?
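For reference, the size can be estimated before generating anything. The lines below are only a rough sketch: the row count m is a hypothetical value (the thread never states the real one), and it assumes the index pairs are stored as doubles, 8 bytes each.

```matlab
m = 100000;                   % hypothetical number of data rows, not from the thread
npairs = m * (m - 1) / 2;     % number of unordered row pairs, same value as nchoosek(m, 2)
bytes = npairs * 2 * 8;       % two indices per pair, assuming 8-byte doubles
fprintf('%d rows give %.4g pairs, about %.1f GiB of indices\n', m, npairs, bytes / 2^30);
```

The pair count grows quadratically with m, so doubling the number of rows roughly quadruples the memory needed for the full index matrix.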


Accepted Answer

Guillaume on 19 Sep 2016
Neither nchoosek nor nchoose2 lets you return only a portion of the output.
You can always generate the output using a loop and break out whenever you want:
function [rowcombination, nextfirstrow, nextsecondrow] = choose2row(in, maxrows, startfirstrow, startsecondrow)
%CHOOSE2ROW create every combination of 2 rows of a matrix/cell array
%The function can return a portion of the output and be called again to return the next portion.
%The function uses double loops to compute all combinations.
%Outputs:
% rowcombination: matrix/cell array where each row is the concatenation of two distinct rows of the original matrix/cell array.
% nextfirstrow:
% nextsecondrow: parameters to pass back to a subsequent call to CHOOSE2ROW to return the next portion of row combination.
%Inputs:
% in: input matrix/cell array of size [m, n].
% maxrows: maximum number of rows of output rowcombination. Inf for no limit. Scalar, optional. default Inf.
% startfirstrow: outer loop start index. Scalar, optional. default 1.
% startsecondrow: inner loop start index. Scalar, optional. default startfirstrow + 1.
if nargin < 2 || maxrows == Inf
    maxrows = Inf;
else
    validateattributes(maxrows, {'numeric'}, {'scalar', 'positive', 'integer'}, 2);
end
if nargin < 3
    startfirstrow = 1;
else
    validateattributes(startfirstrow, {'numeric'}, {'scalar', 'positive', 'integer', '<', size(in, 1)}, 3);
end
if nargin < 4
    startsecondrow = startfirstrow + 1;
else
    validateattributes(startsecondrow, {'numeric'}, {'scalar', 'positive', 'integer', '<=', size(in, 1), '>', startfirstrow}, 4);
end
nrows = (size(in, 1) - startfirstrow + 1) * (size(in, 1) - startfirstrow) / 2 - (startsecondrow - startfirstrow - 1); %total size of output still to generate
rowcombination = repmat(in(1, :), min(nrows, maxrows), 2); %initialise output to required size
rowout = 1;
for nextfirstrow = startfirstrow : size(in, 1)-1
    for nextsecondrow = startsecondrow : size(in, 1)
        rowcombination(rowout, :) = [in(nextfirstrow, :), in(nextsecondrow, :)];
        rowout = rowout + 1;
        if rowout > maxrows
            nextsecondrow = nextsecondrow + 1; %#ok<FXSET> exiting the loop
            if nextsecondrow > size(in, 1)
                nextfirstrow = nextfirstrow + 1; %#ok<FXSET>
                nextsecondrow = nextfirstrow + 1; %#ok<FXSET>
                if nextfirstrow == size(in, 1)
                    nextfirstrow = Inf; %#ok<FXSET>
                    nextsecondrow = Inf; %#ok<FXSET>
                end
            end
            return
        end
    end
    startsecondrow = nextfirstrow + 2;
end
nextfirstrow = Inf;
nextsecondrow = Inf;
end
Of course, you're trading performance for memory.
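A minimal sketch of how choose2row could be called repeatedly to walk through all pairs one chunk at a time. It assumes the function above is saved as choose2row.m on the MATLAB path. The sample cell array, the chunk size of 4, and the variable names data, chunksize, firstrow, secondrow and npairs are made up for illustration; the question suggests chunks of about 50k rows.

```matlab
% Hypothetical example data: a header row plus four data rows.
N = {'id', 'name'; 1, 'a'; 2, 'b'; 3, 'c'; 4, 'd'};
data = N(2:end, :);      % drop the header row so it is not paired with data rows
chunksize = 4;           % assumed chunk size for this small example
firstrow = 1;            % start indices for the first call
secondrow = 2;
npairs = 0;              % running count of row pairs seen so far
while isfinite(firstrow) % choose2row returns Inf once every pair has been produced
    [chunk, firstrow, secondrow] = choose2row(data, chunksize, firstrow, secondrow);
    % process the chunk here; this sketch only counts its rows
    npairs = npairs + size(chunk, 1);
end
```

Each pass overwrites chunk, so only one chunk of combinations is held in memory at a time. The loop stops on the Inf markers rather than on a precomputed count, which avoids passing Inf back into the function, where the input validation would reject it.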
  1 Comment
phlie on 22 Sep 2016
Thank you, Guillaume. This works very well!
