## Find row in matrix, fast and with special conditions

### yosey (view profile)

on 25 May 2013
Latest activity Commented on by Mark

on 13 Mar 2015

### Jan Simon (view profile)

I have a matrix (S) and one row out of that matrix (s). I want to get the rownumber of (s) in (S).

(S) has unique rows. The row (s) is allways member of (S)-rows.

So far I calculate the row number with:

```I = find(all(bsxfun(@eq,S,s),2));
```

I am looking for a faster method. I call this line alot, it costs me hours of calculation time.

Here's a test example:

```% create example, N and n could change
n = 4;  % n < 10
N = 25; % range(N) ~ [20 100]
S = nchoosek(1:N,n);
```
```for i = 1:10
s = S(randi(size(S,1)),:);
```
```    tic
I = find(all(bsxfun(@eq,S,s),2));
toc
end```

[Edit2], poorly expressed "(S) has unique rows". It should express that I is allways scalar.

, dimensional details.

My example shows the general direction of the sizes.

```[d1 d2] = size(S); % its [12650x4] in my example
```

First dimension will be much larger then the second one. (d1 >> d2)

Dimension two (d2=n) will be smaller then 10. Numel(S) will be limited by Memory.

Jan Simon

### Jan Simon (view profile)

on 25 May 2013

What is the usual dimension of S? It matters if you are talking about 10x4, 10'000x20 or 20x10'000 matrices.

## Products

No products are associated with this question.

### Jan Simon (view profile)

on 25 May 2013
Edited by Jan Simon

### Jan Simon (view profile)

on 25 May 2013

If the first dimension of S is large, the BSXFUN approach checks of a lot of numbers although they are rejected in the former columns already. A small Matlab function can avoid testing already rejected rows:

```function Index = FindRow(S, s)
nCol  = size(S, 2);
match = S(:, 1) == s(1);
for col = 2:nCol
match(match) = S(match, col) == s(col);
end
Index = find(match);
```

For Image Analyst's example under Matlab 2009a/64/Win7:

```S = randi(3, 200, 3);
s = [2 3 1];
```
```tic, for k=1:1000; m = find(all(bsxfun(@eq,S,s),2)); end; toc
tic; for k=1:1000; m = find(ismember(S, s, 'rows')); end; toc
tic; for k=1:1000; m = FindRow(S, s); end; toc
```
```% 0.032688 sec
% 0.433527 sec
% 0.021643 sec
```

But for S = randi(30, 2000, 20), s=S(1000, :):

```% 0.124566 sec
% 3.175940 sec
% 0.194101 sec
```

[EDITED] It is easier in C to stop the search after the first matching row is found. This saves 50% processing time in the average. In addition no temporary arrays are required:

```#include "mex.h"
```
```void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
double *S, *s;
mwSize n1, n2, i1, i2, k;
```
```    S  = mxGetPr(prhs[0]);
n1 = mxGetM(prhs[0]);
n2 = mxGetN(prhs[0]);
s  = mxGetPr(prhs[1]);  ```
```    for (i1 = 0; i1 < n1; i1++) {
k = i1;
for (i2 = 0; i2 < n2; i2++) {
if (S[k] != s[i2]) {
break;
}
k += n1;
}```
```       if (i2 == n2) {  // Matching row found:
plhs[0] = mxCreateDoubleScalar((double) (i1 + 1));
return;
}
}```
```    // No success:
plhs[0] = mxCreateDoubleScalar(mxGetNaN());
}```

For S = randi(30, 2000, 20) I get (compiled by MSVC2008):

```BSXFUN:      0.123305 sec
FindRow.m:   0.194914 sec
ISMEMBER:    3.172425 sec
FindRow.mex: 0.011098 sec (s is last row)
0.007880 sec (s is 1000th row)
0.004753 sec (s is first row)
```

The MEX has an average speedup of factor 15. For a [20000 x 200] matrix we get a factor of 260 already.

yosey

### yosey (view profile)

on 26 May 2013

Thanks a lot for your support. This sounds great. I will need some time to play around with it. I never worked with either C or MEX.

: Ok, I get a speedup factor of 3. Thats good, saves me a lot of cpu time and speeds up my overall code by 30%.

Jan Simon

### Jan Simon (view profile)

on 27 Dec 2013

Dear Jacob L,

The request you've sent me by email would have been a perfect question for this forum. Therefore I answer here, how to modify this function to work for UINT8 arrays:

```#include "mex.h"
```
```void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
uint8_T *M, *v;
mwSize  m, n, i1, i2, k;
```
```    if (nrhs != 2 || !mxIsUint8(prhs[0]) || !mxIsUint8(prhs[1])) {
"2 UINT8 arrays required as input.");
}```
```    M = (uint8_T *) mxGetData(prhs[0]);
m = mxGetM(prhs[0]);
n = mxGetN(prhs[0]);
v = (uint8_T *) mxGetData(prhs[1]);  ```
```    if (mxGetNumberOfElements(prhs[1]) != n) {
"Sizes of inputs do not match.");
}```
```    for (i1 = 0; i1 < m; i1++) {
k = i1;
for (i2 = 0; i2 < n; i2++) {
if (M[k] != v[i2]) {
break;
}
k += m;
}```
```       if (i2 == n) {  // Matching row found:
plhs[0] = mxCreateDoubleScalar((double) (i1 + 1));
return;
}
}```
```    // No success:
plhs[0] = mxCreateDoubleMatrix(0, 0, mxREAL);
}```
Mark

### Mark (view profile)

on 13 Mar 2015

Hi,

I used your your original mex file in matlab and tried to change it. I'm also searching for fast ways to find a row matrix s in a large matrix S. However the row matrix s appears a number of times in large matrix S.

```if true
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[], mwSize, lastfound)
{
double *S, *s *lastfound;
mwSize n1, n2, i1, i2, k;
S  = mxGetPr(prhs[0]);
n1 = mxGetM(prhs[0]);
// n2 = mxGetN(prhs[0]);
n2 = 3;
s  = mxGetPr(prhs[1]);
for (i1 = lastfound; i1 < n1; i1++) {
k = i1;
for (i2 = 0; i2 < n2; i2++) {
if (S[k] != s[i2]) {
break;
}
k += n1;
}
if (i2 == n2) {  // Matching row found:
plhs[0] = mxCreateDoubleScalar((double) (i1 + 1));
return;
}
}
// No success:
plhs[0] = mxCreateDoubleScalar(mxGetNaN());
}
end
```

So i tried to use another input 'lastfound' which would enable the code to start at lastfound + 1 (the +1 is added when the function is called), so that the previous row s is found, and the code start searching the second row s. I do this until the mex file return NaN. It however doesn't work, the code doesn't start the at the lastfound + 1 step. I am a C noob, so i think it is a very small error, but i would immensely appreciate it if somebody could take a look at it.

### Image Analyst (view profile)

on 25 May 2013

Try this:

```% Generate some sample data:
m = randi(3, 200, 3)
% Specify what we're looking for:
lookingFor = [2 3 1]; % Let's say we're looking for this
% Find out what rows that pattern occurs in:
rowsItOccursIn = find(ismember(m, lookingFor, 'rows'))
```

yosey

### yosey (view profile)

on 25 May 2013

I've allready tryed ismember and wondered myself how slow it is (for this task).

Image Analyst

### Image Analyst (view profile)

on 25 May 2013

What are the actual sizes of n, N, and S? For 4 and 25, it DEFINITELY should not take "hours of calculation time" like you said.

yosey

### yosey (view profile)

on 26 May 2013

I've added dimensional details. High cpu time results due to multiple calls.

### Teja Muppirala (view profile)

on 26 May 2013

Not sure if this is faster, but might be worth a try.

```d1 = 12650;
d2 = 4;
S = randn(d1,d2);
s = S(randi(d1),:);
c = 1;
r = 1;
```
```while c <= d2
if S(r,c) == s(c)
c = c+1;
else
r = r+1;
c = 1;
end
end
isequal(S(r,:),s)
```

If the big S is the same every time, you could make a better algorithm by sorting it in advance.

yosey

### yosey (view profile)

on 26 May 2013

The while loop is pretty fast, but cant beat the mex file. Here are the times for 1000 loops.

```Elapsed time is 0.143702 seconds. % find(all(bsxfun(@eq,S,s),2))
Elapsed time is 0.059386 seconds. % findrow_mex
Elapsed time is 0.118347 seconds. % your while loop
```

Since I've got some trouble reading the C-code, isn't the mex-code the same as your while loop?

#### Join the 15-year community celebration.

Play games and win prizes!

MATLAB and Simulink resources for Arduino, LEGO, and Raspberry Pi