File Exchange

Binary Serach of rows of M-by-N Matrix

version 1.1.0.0 (23.2 KB) by Dr. Murtaza Khan

Dr. Murtaza Khan (view profile)

Binary Search that is based on sum of column values of rows of input M-by-N Matrix

Updated 09 Jul 2009

Binary Serach of rows of M-by-N Matrix
Search is based on sum of column values of rows of input M-by-N Matrix
i.e. Rows of input Matrix must already be sorted in ascending order
based on sum of their column values. i.e.
row1sum<row2sum<...rowjsum<...rowmsum
where rowjsum=sum(mat(j,:),2) (i.e sum of column values)

If mat is not sorted use following two statments to sort rows of mat
according to sum of column values
matwithsum=sortrows([sum(mat,2),mat],1);
mat=matwithsum(:,2:end);

INPUT:
mat: Matrix contains values such as
[x1, y1,..., zn;
x2, y2,..., zn;
...
xm, ym,..., zn];

srow: row that is to be searched in mat

OUTPUT:
Using following different value of parameter 'type' following output can be return
1. 'first': returns 'indvec1' i.e index of first occured row in mat that matches to srow.

2. 'all': returns 'indvec1' i.e indices of all rows in mat that matches to srow.

3. 'last': returns 'indvec1' i.e index of last row in mat that matches to srow.

4. 'allcolsum': returns 'indvec1' i.e indices of all k rows of mat such that
sum(mat(j,:),2)=sum(srow,2),
where 1<=j<=k

5. 'colsumandval': return two index vector first is similar to return by "allcolsum" and second is similiar to return by "all' option.

Note that if we are searching e.g. [5 4 2] by option 'first' then there may be many rows whose column sum is equal to given row e.g. [2 4 5],[0 0 11],[4 2 5], [5 4 2] but only
last row i.e. [5,4,2] matches in all columns to given row to be searched.
-------------------------
bsearchmatrows.m (main program)
bsearchmatrowsTest1.m (simple test program)
bsearchmatrowsTest2.m (compare performance with MATLAB builtin metod index=find(ismember(mat,srow, 'rows'));
-------------------
For large data this search is extremely faster than MATLAB built in methods

Thomas Trötscher

Thomas Trötscher (view profile)

This does not work well when there are many rows with the same column sum. I have a matrix of size ~45000 x 350 where the rows can have a sum of 1, 2 or 3; in this case the supplied function used about 100 times as long as the built-in matlab function. Also, if the same data is transposed the function does not work at all.

Other than that it seems to work very well for uniformly distributed data.

woonsang cho

woonsang cho (view profile)

sorry I realized that it should be ordered by the sum. thank you

woonsang cho

woonsang cho (view profile)

It works for small matrices, but it does not work for some larger matrices. Please run and check the following statement:

bsearchmatrows([5 6; 2 3 ; 1 2 ; 1 1; 0 0],[5 6])

ans =
[]

Obviously it should return '1', but it doesn't. Please help us fix this bug. Thank you,

Amfy J.

Very useful for searching rows in large 2-D matrices.

M A Khan

in reply to Jos van der Geest
-----------------------------
if you are seraching small data set (e.g. matrix of few hundrend rows) then you will not feel the difference. But for large data set in thoudsands of rows then this is definitley faster then MATLAB builtin method "find(ismember(mat,srow, 'rows'))". It is easy to understand, MATLAB ismember method first sort the data, the sorting complexity of fastest algorithm merge sort/quick sort is O(nlogn) then ismember serach the data. Serach complexity of binary serach is O(logn) therefore if you use MATLAB builtin method the overall seraching complexity would be O(nlogn)+O(logn)=O(nlogn). If the data is already sorted then there is not need to sort it again, but if you use MATLAB buitin method it spend time to sort. But if you use the "bsearchmatrows" it will not sort the data , and searching complexity would be limit to O(logn). I have tested this program tens of times (see bsearchmatrowsTest2.m ),it always faster then MATLAB builtin methods.
-----------------------------
check again the program bsearchmatrowsTest2.m (as the data size grow , differece of peroframance goes in favour of bsearchmatrows.m)

Jos van der Geest

This is not faster than the proper use of ismember(M,srow,'rows').