count rectangles on big matrices

4 views (last 30 days)
freebil on 27 Sep 2013
Edited: Cedric Wannaz on 5 Nov 2013
Hello. I have very large sparse matrices with zero and one only and i have to count all the rectangles that there are in it. For example:
1 0 0 0 1
H=[1 1 0 1 1]
0 1 1 1 0
0 0 0 0 1
There are 2 rectangles inside.
1st--> H(1,1),H(2,1),H(1,5),H(2,5)
2nd--> H(2,2),H(3,2),H(2,4),H(3,4)
I wrote a code but it is very slow with large matrices..
for i=1:m
for j=1:n
if H(i,j)==1
for ii=i+1:m
if H(ii,j)==1
for jj=j+1:n
if H(i,jj)==1
if H(ii,jj)==1
cycles=cycles+1;
end
end
end
end
end
end
end
end
Cedric Wannaz on 28 Sep 2013

Cedric Wannaz on 28 Sep 2013
Edited: Cedric Wannaz on 5 Nov 2013
The only easy improvement that I could think of is to eliminate elements which are alone on their row and/or column. It could be significant if the density is low.
[r,c] = find( H ) ;
n = numel( r ) ;
% - Preprocessor: eliminates entries which are "alone" row or column-wise.
v1s = ones(n, 1) ;
rCnt = accumarray( r, v1s ) ; cCnt = accumarray( c, v1s ) ;
isAlone = rCnt(r) == 1 | cCnt(c) == 1 ;
r = r(~isAlone) ; c = c(~isAlone) ; n = numel( r ) ;
% - Main loop.
nCycles = 0 ;
% Loop through potential upper-left corners.
for k_ul = 1 : n-3
if c(k_ul) == n, break ; end
if r(k_ul) == n, continue ; end
% Find and loop through potential lower left corners.
ix_ll = find( c == c(k_ul) & r > r(k_ul) ) ;
if isempty( ix_ll ), continue ; end
for k_ll = ix_ll.'
% Find all potential columns matching upper right corners.
c_ur = c( r == r(k_ul) & c > c(k_ul) ) ;
% Find all potential columns matching row of lower left corner.
c_r_ll = c( r == r(k_ll) & c > c(k_ll) ) ;
% Count cycles as # elements of intersect.
nCycles = nCycles + nnz( bsxfun(@eq, c_ur.', c_r_ll )) ;
end
end
EDIT Sat@7:30pm : replaced call to INTERSECT (slow) with a solution based on BSXFUN.
I am in a rush and I have to leave, so I don't have time to write explanations or to perform tests, but here is what I guess could be a solution. Have a look and I'll come back later today to discuss it if needed.
[r,c] = find( H ) ;
n = numel( r ) ;
nCycles = 0 ;
% Loop through potential upper-left corners.
for k_ul = 1 : n-3
if c(k_ul) == n, break ; end
if r(k_ul) == n, continue ; end
% Find and loop through potential lower left corners.
ix_ll = find( c == c(k_ul) & r > r(k_ul) ) ;
if isempty( ix_ll ), continue ; end
for k_ll = ix_ll.'
% Find all potential columns matching upper right corners.
c_ur = c( r == r(k_ul) & c > c(k_ul) ) ;
% Find all potential columns matching row of lower left corner.
c_r_ll = c( r == r(k_ll) & c > c(k_ll) ) ;
% Count cycles as # elements of intersect.
nCycles = nCycles + nnz( bsxfun(@eq, c_ur.', c_r_ll )) ;
end
end
On my laptop, counting cycles takes ~1s for a 1e3 x 1e5 sparse matrix filled with a density of 1e-4. There might be a more efficient solution, but if you needed much more efficiency, you should consider a C/MEX based solution.
I will adapt this answer according to what you'll reply to my last comment above, but I'd personally go for a solution based uniquely on non-zero elements. The code below, for example
H = [1 0 0 0 1; ...
1 1 0 1 1; ...
0 1 1 1 0; ...
0 0 0 0 1] ;
[r,c] = find( H ) ;
n = numel( r ) ;
% 1
% - Angles in 1st quadrant: 1 1
%
angles_q1 = zeros( n, 2 ) ;
ia = 0 ;
for k = 1 : n
if r(k) > 1 && c(k) < n
% Check if there are non-zero values above and on the right.
if any( r == r(k) & c == c(k)+1 ) && any( c == c(k) & r == r(k)-1 )
ia = ia + 1 ;
angles_q1(ia,:) = [r(k), c(k)] ;
end
end
end
angles_q1 = angles_q1(1:ia,:) ;
builds an array of angles opened in quadrant I. Running that on the H that you provided outputs
angles_q1 =
2 1
3 2
which gives you the quadrant 1 angles in (2,1) and (3,2).
It should be quite fast even on your large array. The same could be repeated for the other 3 cases, which would generate all the information that you need to "easily" spot rectangles. However, this works only if rectangles can be cut by lines of 1's or interlaced, and it fails if rectangles must be "empty".
Cedric Wannaz on 5 Nov 2013
Hi, I am traveling right now and I won't have time to answer I guess. Post this as a new question and make a reference to this thread.

Image Analyst on 28 Sep 2013
Rather than spend time looping over all the indices, many of which don't have 1's and are thus a waste of time, why not just get the rows and columns where the 1's live and then just loop over them?
[rows, columns] = find(H);
numberOfPoints = length(rows);
for k1 = 1 : numberOfPoints
row1 = rows(k1);
col1 = columns(k1);
for k2 = k1+1 : numberOfPoints
row2 = rows(k2);
col2 = columns(k2);
for k3 = k2+1 : numberOfPoints
row3 = rows(k3);
col3 = columns(k3);
for k4 = k3+1 : numberOfPoints
row4 = rows(k4);
col4 = columns(k4);
etc.
or something like that...