Remove overlapping index ranges within a single array

10 views (last 30 days)
I have set of indices from a larger array in a row-wise manner:
totalMat = ones(15,1); a = [1,2; 3,5; 4,12; 6,11; 8,15];
These are meant to represent indices in a single column array (totalMat) that will be set to some value.
My goal is to determine from a top-down approach which which indices are not overlapping. Eg, the first set of indices (row), which is '1-2', does not overlap with anything, but the second set of indices ('3-5' overlaps with the next one, so the following index would be removed. Having removed the third row, the fourth does not overlap. The final row overlaps with the third, and is removed. The output would be:
b =
1 2
3 5
6 11
Then, ideally, I could set the indices in b to a value - i.e.
totalMat(1:2,3:5,6:11) = 20;
I've done this with while loops and subsequent eval() statements, but I need it to be vectorized to get it done in a reasonable amount of time - my totalMat can be as long as 32 million elements.
I'd like to do away with the eval() if possible. Lastly, this is not a homework question, but from a newbie programmer...
Thanks!

Answers (3)

Azzi Abdelmalek
Azzi Abdelmalek on 24 Jul 2013
Edited: Azzi Abdelmalek on 24 Jul 2013
Edit
a = [1,2; 3,5; 3,6; 4,5; 8,10]
b=arrayfun(@(x) a(x,1):a(x,2),1:size(a,1),'un',0);
k=1;
while k<numel(b)
c=b(k+1:end);
idx=cellfun(@(x) any(ismember(x,b{k})),c);
if any(idx)
b(logical([ zeros(1,numel(b)-numel(idx)-1) 1 idx]))=[];
else
b(logical([ zeros(1,numel(b)-numel(idx)) idx]))=[];
end
k=k+1;
end
out=cell2mat(cellfun(@(x) [x(1) x(end)],b','un',0))
  2 Comments
GradStudent
GradStudent on 24 Jul 2013
The reply is much appreciated. I apologize - I edited the input indices to better describe the problem, making this solution (and all of mine too) not work.
The location and duration of the indices is random, such that a variety of situations come about that make vectorizing difficult. One of these is where one set of indices should be rejected (4,12) but the following one is ok (6,11).
For every row, the value second column will always be greater than the first, and the first column will always be increasing from top to bottom, but aside from that, pretty much anything else goes.
Another test case could be for the following indices:
a2 = [1,2;3,5;4,12;5,15;6,23;10,12;13,14;19,22;24,27;26,28];
This outcome should be:
b2 =
1 2
3 5
6 23
24 27

Sign in to comment.


Azzi Abdelmalek
Azzi Abdelmalek on 24 Jul 2013
a= [1,2;3,5;4,12;5,15;6,23;10,12;13,14;19,22;24,27;26,28];
b=arrayfun(@(x) a(x,1):a(x,2),1:size(a,1),'un',0);
k=1;
while k<numel(b)
c=b(k+1:end);
idx=cellfun(@(x) any(ismember(x,b{k})),c);
b(logical([ zeros(1,numel(b(1:k))) idx]))=[];
k=k+1;
end
out=cell2mat(cellfun(@(x) [x(1) x(end)],b','un',0))
  4 Comments
GradStudent
GradStudent on 25 Jul 2013
Edited: GradStudent on 25 Jul 2013
The only real reason to avoid a while/for loop is for speed reasons. Because the length of indices in totalMat is long (3.2 million elements or longer), compared to a simple for loop:
numInd = 3200000;
a =sort(round((rand(233,1))*numInd));
a = [a a+round((rand(233,1))*50000)];
%Method 1
tic
b=arrayfun(@(x) a(x,1):a(x,2),1:size(a,1),'un',0);
k=1;
while k<numel(b)
c=b(k+1:end);
idx=cellfun(@(x) any(ismember(x,b{k})),c);
b(logical([ zeros(1,numel(b(1:k))) idx]))=[];
k=k+1;
end
out1=cell2mat(cellfun(@(x) [x(1) x(end)],b','un',0));
totalmat1=ones(numInd,1);
totalmat1(cell2mat(b))=20;
toc
%Method 2 - simple for loop
tic
totalMat2= ones(numInd,1);
keeps = true(size(a));
currindex = 0;
for i = 1:length(a(:,1))
if a(i,1)<=currindex
keeps(i,:)=0;
else
currindex = a(i,2);
if a(i,2)<=length(totalMat2)
totalMat2(a(i,1):a(i,2)) = 20;
else
totalMat2(a(i,1):length(a(:,1))) = 20;
end
end
end
out2 = reshape(a(keeps),sum(keeps(:,1)),2);
toc
%Check that the two are the same
isequal(out1,out2)
isequal(totalmat1,totalMat2)
>>Elapsed time is 4.617151 seconds.
>>Elapsed time is 0.022024 seconds.
I originally had find() statements and it was extremely slow (hundereds of seconds) thinking a simple for loop would be too slow. I just wrote one up for comparison purposes, and in this case it seems much faster. Not really sure why that would be...
Jan
Jan on 25 Jul 2013
@Azzi: This can be simplified:
b(logical([ zeros(1,numel(b(1:k))), idx]))=[];
numel(b(1:k)) equals k. And converting ZEROS to LOGICAL is slower than using FALSE directly:
b([false(1, k), idx]) = [];

Sign in to comment.


Jan
Jan on 26 Jul 2013
Edited: Jan on 26 Jul 2013
Some improvements of the loop you've posted as a comment:
totalMat2 = ones(numInd,1);
lenA = size(a, 1);
keeps = true(1, lenA); % as a vector instead of a matrix
currindex = 0;
exceed = a(:, 2) > length(totalMat2);
a(exceed, 2) = length(totalMat2);
for k = 1:lenA % length(a(:,1)) creates a(:,1) explicitly
if a(k,1) <= currindex
keeps(k) = false;
else
currindex = a(k, 2);
totalMat2(a(k,1):a(k,2)) = 20;
end
end
out2 = reshape(a(keeps, :), [], 2); % Automatic size matching
"i" should be avoided as variable to reduce confusions with the imaginary unit.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!