count rectangles on big matrices

Hello. I have very large sparse matrices with zero and one only and i have to count all the rectangles that there are in it. For example:
1 0 0 0 1
H=[1 1 0 1 1]
0 1 1 1 0
0 0 0 0 1
There are 2 rectangles inside.
1st--> H(1,1),H(2,1),H(1,5),H(2,5)
2nd--> H(2,2),H(3,2),H(2,4),H(3,4)
I wrote a code but it is very slow with large matrices..
for i=1:m
for j=1:n
if H(i,j)==1
for ii=i+1:m
if H(ii,j)==1
for jj=j+1:n
if H(i,jj)==1
if H(ii,jj)==1
cycles=cycles+1;
end
end
end
end
end
end
end
end
Any ideas please?

10 Comments

What is the size of your matrix?
dpb
dpb on 27 Sep 2013
Edited: dpb on 28 Sep 2013
One think you can do that may/may not help much depending on the data structure -- you can abort the loop on any row column that has
sum(i,:) < 2
for rows or
sum(:,j) <2
for columns as there can be no second corner on that row/column
ADDENDUM:
Also if you keep track of where the first/last are you can run the loops over only the actual range instead of entire row/column dimension
Cedric
Cedric on 28 Sep 2013
Edited: Cedric on 28 Sep 2013
How large is the sparse matrix? And do you need only the count or locations/dimensions as well?
Also, in your example, if H(4,4) were 1, would it define a rectangle {H(2,4:5),H(3,4),H(4,4:5)} ?
Can you explain why you need this?
freebil
freebil on 28 Sep 2013
Edited: freebil on 2 Oct 2013
@azzi Abdelmalek The size of sparse matrices will be (10^3,10^5). The matrices are sparse only with one and zero.
@dpb Thanks, but every row or column have 3 or more ones..
@Cedric Wannaz I want only to count.I am interesting only in all 4 ones that create a rectangular.If H(4,4)=1 , i will have the rectangular H(2,4),H(2,5),H(4,4),H(4,5).
Cedric
Cedric on 28 Sep 2013
Edited: Cedric on 28 Sep 2013
And what about the following situations:
0 0 1 0 0
1 0 1 0 1 Is it 1 or 2 ?
1 1 1 1 1
0 0 1 0 0
0 0 1 0 0
1 0 1 0 1 Is it 1 or 0 (because cut) ?
1 1 0 1 1
0 0 1 0 0
1 1 1 1 1
1 0 0 0 1 Is it 1,2,4 ?
1 0 0 0 1
1 1 1 1 1
@Cedric Wannaz thanks for the answer.
1st matrix: 2 cycles 2nd matrix: 1 cycle 3nd matrix : 15 cycles!!
for every one I am searching for cycles, but there are not same cycles..
Ok, I am starting to understand, but could you explain how you counted 15? Depending the logic that I follow accounting for various shapes, I count 9 or 33 but not 15.. ?
Thank you for your answer! The cycles are:
(1,1)(2,1)(1,5)(2,5)
(1,1)(3,1)(1,5)(3,5)
(1,1)(4,1)(1,2)(4,2)
(1,1)(4,1)(1,3)(4,3)
(1,1)(4,1)(1,4)(4,4)
(1,1)(4,1)(1,5)(4,5)
(2,1)(3,1)(2,5)(3,5)
(2,1)(4,1)(2,5)(4,5)
(3,1)(4,1)(3,5)(4,5)
(1,2)(4,2)(1,3)(4,3)
(1,2)(4,2)(1,4)(4,4)
(1,2)(4,2)(1,5)(4,5)
(1,3)(4,3)(1,4)(4,4)
(1,3)(4,3)(1,5)(4,5)
(1,4)(4,4)(1,5)(4,5)
I really appreciate your help!
Please, see the edit in my answer.

Sign in to comment.

 Accepted Answer

Cedric
Cedric on 28 Sep 2013
Edited: Cedric on 5 Nov 2013
== Final answer in comments.
== ANSWER Sun@10:10am
The only easy improvement that I could think of is to eliminate elements which are alone on their row and/or column. It could be significant if the density is low.
[r,c] = find( H ) ;
n = numel( r ) ;
% - Preprocessor: eliminates entries which are "alone" row or column-wise.
v1s = ones(n, 1) ;
rCnt = accumarray( r, v1s ) ; cCnt = accumarray( c, v1s ) ;
isAlone = rCnt(r) == 1 | cCnt(c) == 1 ;
r = r(~isAlone) ; c = c(~isAlone) ; n = numel( r ) ;
% - Main loop.
nCycles = 0 ;
% Loop through potential upper-left corners.
for k_ul = 1 : n-3
if c(k_ul) == n, break ; end
if r(k_ul) == n, continue ; end
% Find and loop through potential lower left corners.
ix_ll = find( c == c(k_ul) & r > r(k_ul) ) ;
if isempty( ix_ll ), continue ; end
for k_ll = ix_ll.'
% Find all potential columns matching upper right corners.
c_ur = c( r == r(k_ul) & c > c(k_ul) ) ;
% Find all potential columns matching row of lower left corner.
c_r_ll = c( r == r(k_ll) & c > c(k_ll) ) ;
% Count cycles as # elements of intersect.
nCycles = nCycles + nnz( bsxfun(@eq, c_ur.', c_r_ll )) ;
end
end
== ANSWER Sat@2:04pm
EDIT Sat@7:30pm : replaced call to INTERSECT (slow) with a solution based on BSXFUN.
I am in a rush and I have to leave, so I don't have time to write explanations or to perform tests, but here is what I guess could be a solution. Have a look and I'll come back later today to discuss it if needed.
[r,c] = find( H ) ;
n = numel( r ) ;
nCycles = 0 ;
% Loop through potential upper-left corners.
for k_ul = 1 : n-3
if c(k_ul) == n, break ; end
if r(k_ul) == n, continue ; end
% Find and loop through potential lower left corners.
ix_ll = find( c == c(k_ul) & r > r(k_ul) ) ;
if isempty( ix_ll ), continue ; end
for k_ll = ix_ll.'
% Find all potential columns matching upper right corners.
c_ur = c( r == r(k_ul) & c > c(k_ul) ) ;
% Find all potential columns matching row of lower left corner.
c_r_ll = c( r == r(k_ll) & c > c(k_ll) ) ;
% Count cycles as # elements of intersect.
nCycles = nCycles + nnz( bsxfun(@eq, c_ur.', c_r_ll )) ;
end
end
On my laptop, counting cycles takes ~1s for a 1e3 x 1e5 sparse matrix filled with a density of 1e-4. There might be a more efficient solution, but if you needed much more efficiency, you should consider a C/MEX based solution.
== INITIAL answer
I will adapt this answer according to what you'll reply to my last comment above, but I'd personally go for a solution based uniquely on non-zero elements. The code below, for example
H = [1 0 0 0 1; ...
1 1 0 1 1; ...
0 1 1 1 0; ...
0 0 0 0 1] ;
[r,c] = find( H ) ;
n = numel( r ) ;
% 1
% - Angles in 1st quadrant: 1 1
%
angles_q1 = zeros( n, 2 ) ;
ia = 0 ;
for k = 1 : n
if r(k) > 1 && c(k) < n
% Check if there are non-zero values above and on the right.
if any( r == r(k) & c == c(k)+1 ) && any( c == c(k) & r == r(k)-1 )
ia = ia + 1 ;
angles_q1(ia,:) = [r(k), c(k)] ;
end
end
end
angles_q1 = angles_q1(1:ia,:) ;
builds an array of angles opened in quadrant I. Running that on the H that you provided outputs
angles_q1 =
2 1
3 2
which gives you the quadrant 1 angles in (2,1) and (3,2).
It should be quite fast even on your large array. The same could be repeated for the other 3 cases, which would generate all the information that you need to "easily" spot rectangles. However, this works only if rectangles can be cut by lines of 1's or interlaced, and it fails if rectangles must be "empty".

11 Comments

Really thanks. Your solution gives correct number of cycles. But i was wrong to the sparse matrices dimensions. The dimensions will be (5*10^4,10^5)! I ran your code for a sparse matrix(5*10^3,10^4) and it took 130sec. Could we do something better? Really thanks for your help!
Cedric
Cedric on 29 Sep 2013
Edited: Cedric on 29 Sep 2013
You're welcome. As mentioned, it's likely that you'll need to go for a C/MEX solution if you want it to be more efficient; I'll think a bit more today and try to improve my answer though (notes 1,2). What is a typical filling density in your setup? Or how many non zero elements do you have in this 5e4x1e5 H matrix ( nnz(H) )?
Note 1: improving the performance would require changing the approach, because the current version of the code is already pretty optimized. What takes 90% of run time is evaluating the following expression
c == c(k_ul) & r > r(k_ul)
which cannot really be optimized further.
Note 2: I'll implement a preprocessor to reduce the size of r and c .. if the density is low enough, it could help.
Please see Sun@10:10am edit.
Thank you very much. Every column has fewer than 30 ones! Moreover, I forgot to say that i have access to a grid, that has 160 proccessors and i can matlabpool 160 to this. For decoding purposes, if a matrix has column_weight=3(every column has exactly 3 ones) and row_weigth=6(every row has exactly 6 ones) the matrix is called regular. In other circumstances, it is called irregular. I thought, if i have a regular(3,6) matrix i can
[m n]=size(H);
parfor i =1:n
varCon(i,:)=find(H(:,i));
end
cycles=0;
parfor ii=1:length(varCon)-1
for jj=2:length(varCon)
if sum(bsxfun(@eq,varCon(ii,:),varCon(jj,:)))==2
cycles=cycles+1;
elseif sum(bsxfun(@eq,varCon(ii,:),varCon(jj,:)))==2
cycles=cycles+3
end
end
end
I think it works. For this case, can i do something better?
Cedric
Cedric on 29 Sep 2013
Edited: Cedric on 29 Sep 2013
LAST EDIT at 12:10pm
Well, I won't have time to fully test or think about your solution, but I think that you could use NNZ on the output of BSXFUN, and add the result to cycles without testing the way you do (same tests in the code that you provided by the way; also, one arg of BSXFUN might have to be transposed).
If you are on a grid, the first thing that comes to my mind as tests of non-zero elements are disjoint, would be to create pools of elements, e.g. (but I have no time to test):
[r,c] = find( H ) ;
n = numel( r ) ;
% - Preprocessor: eliminates entries which are "alone" row or column-wise.
v1s = ones(n, 1) ;
rCnt = accumarray( r, v1s ) ; cCnt = accumarray( c, v1s ) ;
isAlone = rCnt(r) == 1 | cCnt(c) == 1 ;
r = r(~isAlone) ; c = c(~isAlone) ; n = numel( r ) ;
% - Define blocks.
nNodes = 160 ;
blockSize = ceil( (n-3)/nNodes ) ;
blockBnds = 1 : blockSize : n-2 ;
if blockBnds(end) ~= n-2
blockBnds = [blockBnds , n-2] ;
end
nBlocks = numel( blockBnds ) - 1 ;
% - Main loop.
nCycles = zeros( nBlocks, 1 ) ;
parfor blockId = 1 : nBlocks
% Loop through potential upper-left corners.
for k_ul = blockBnds(blockId) : (blockBnds(blockId+1) - 1)
if c(k_ul) == n, break ; end
if r(k_ul) == n, continue ; end
% Find and loop through potential lower left corners.
ix_ll = find( c == c(k_ul) & r > r(k_ul) ) ;
if isempty( ix_ll ), continue ; end
for k_ll = ix_ll.'
% Find all potential columns matching upper right corners.
c_ur = c( r == r(k_ul) & c > c(k_ul) ) ;
% Find all potential columns matching row of lower left corner.
c_r_ll = c( r == r(k_ll) & c > c(k_ll) ) ;
% Count cycles as # elements of intersect.
nCycles(blockId) = nCycles(blockId) + ...
nnz( bsxfun( @eq, c_ur.', c_r_ll )) ;
end
end
end
nCycles = sum( nCycles ) ;
Cedric
Cedric on 30 Sep 2013
Edited: Cedric on 30 Sep 2013
Just an additional thought to check whether your matrix is regular (3,6).
if all( sum(H, 1) == 3 ) && all( sum(H, 2) == 6 )
% the matrix is regular(3,6)
end
Could it regular with other counts, or is it either irregular or regular (3,6)? Also, for your code, are you sure that it doesn't double count rectangles? Shouldn't it be something like
nCols = size( varCon, 1 ) ;
nCycles = zeros( nCols, 1 ) ;
parfor ii = 1 : (nCols-1)
for jj = (ii+1) : nCols
nEqual = nnz( bsxfun( @eq, varCon(ii,:), varCon(jj,:).' )) ;
if nEqual == 2
nCycles(ii) = nCycles(ii) + 1 ;
elseif nEqual == 3
nCycles(ii) = nCycles(ii) + 3 ;
end
end
end
nCycles = sum( nCycles ) ;
Cedric
Cedric on 30 Sep 2013
Edited: Cedric on 30 Sep 2013
Finally, note that if (m,n)-regularity can happen for (m,n) different from (3,6), the code above can be rather easily generalized, as the number of rectangles/cycles for an arbitrary value of nEqual is given by
nCycles = nEqual*(nEqual-1)/2
You would therefore just have to do something like
sumCols = sum( H, 1 ) ;
sumRows = sum( H, 2 ) ;
if all( sumCols == sumCols(1) ) && all( sumRows == sumRows(1) )
fprintf( 'H is (%d,%d) regular.\n', sumCols(1), sumRows(1) ) ;
% .. your code for varCon
nCols = size( varCon, 1 ) ;
nCycles = zeros( nCols, 1 ) ;
parfor ii = 1 : (nCols-1)
for jj = (ii+1) : nCols
nEqual = nnz( bsxfun( @eq, varCon(ii,:), varCon(jj,:).' )) ;
nCycles(ii) = nCycles(ii) + nEqual*(nEqual-1)/2 ;
end
end
nCycles = sum( nCycles ) ;
else
fprintf( 'H is irregular.\n' ) ;
...
end
Really thanks! You helped me a lot! I appreciate this!
You're welcome!
Hello again! You were very helpful for me. I would like to do something more now and i want your opinion. Now, I would like to count 6 length cycles. I completed the code for this. http://www.hindawi.com/journals/jece/2008/354137/fig4/ http://www.hindawi.com/journals/jece/2008/354137/fig5/
parfor blockId = 1 : nBlocks
% Loop through ones of upper-left corners
for k_ul = blockBnds(blockId) : (blockBnds(blockId+1) - 1)
% Find and loop through ones of lower left corners
ix_ll = find( c == c(k_ul) & r > r(k_ul) ) ;
if isempty( ix_ll ), continue ; end
for k_ll = ix_ll.'
% Find all columns matching upper right corners
c_ur = c( r == r(k_ul) & c > c(k_ul) ) ;
if isempty( c_ur ), continue ; end
% Find all potential columns matching row of lower left corner.
c_r_ll = c( r == r(k_ll) & c > c(k_ll) ) ;
if isempty( c_r_ll ), continue ; end
% Find after uper right
for c_urr = c_ur.'
r_aur = r( c == c_urr) ;
% Find after lower right
for c_r_lll = c_r_ll.'
r_ar_ll = r( c == c_r_lll ) ;
% Count cycles
tmp = nnz( bsxfun( @eq, r_aur.', r_ar_ll ));
% Keep positions for deleting
if tmp ~= 0
nCycles(blockId) = nCycles(blockId) + tmp ;
pos=[pos,k_ul];
end
end
end
end
end
end
Can I do something better?
Hi, I am traveling right now and I won't have time to answer I guess. Post this as a new question and make a reference to this thread.

Sign in to comment.

More Answers (1)

Rather than spend time looping over all the indices, many of which don't have 1's and are thus a waste of time, why not just get the rows and columns where the 1's live and then just loop over them?
[rows, columns] = find(H);
numberOfPoints = length(rows);
for k1 = 1 : numberOfPoints
row1 = rows(k1);
col1 = columns(k1);
for k2 = k1+1 : numberOfPoints
row2 = rows(k2);
col2 = columns(k2);
for k3 = k2+1 : numberOfPoints
row3 = rows(k3);
col3 = columns(k3);
for k4 = k3+1 : numberOfPoints
row4 = rows(k4);
col4 = columns(k4);
etc.
or something like that...

Asked:

on 27 Sep 2013

Edited:

on 5 Nov 2013

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!