MATLAB and Simulink resources for Arduino, LEGO, and Raspberry Pi

# Find row in matrix, fast and with special conditions

Asked by yosey on 25 May 2013
Latest activity Commented on by Jan Simon on 27 Dec 2013

I have a matrix (S) and one row out of that matrix (s). I want to get the rownumber of (s) in (S).

(S) has unique rows. The row (s) is allways member of (S)-rows.

So far I calculate the row number with:

```I = find(all(bsxfun(@eq,S,s),2));
```

I am looking for a faster method. I call this line alot, it costs me hours of calculation time.

Here's a test example:

```% create example, N and n could change
n = 4;  % n < 10
N = 25; % range(N) ~ [20 100]
S = nchoosek(1:N,n);
```
```for i = 1:10
s = S(randi(size(S,1)),:);
```
```    tic
I = find(all(bsxfun(@eq,S,s),2));
toc
end```

[Edit2], poorly expressed "(S) has unique rows". It should express that I is allways scalar.

, dimensional details.

My example shows the general direction of the sizes.

```[d1 d2] = size(S); % its [12650x4] in my example
```

First dimension will be much larger then the second one. (d1 >> d2)

Dimension two (d2=n) will be smaller then 10. Numel(S) will be limited by Memory.

## 1 Comment

Jan Simon on 25 May 2013

What is the usual dimension of S? It matters if you are talking about 10x4, 10'000x20 or 20x10'000 matrices.

## Products

No products are associated with this question.

Answer by Jan Simon on 25 May 2013
Edited by Jan Simon on 25 May 2013

If the first dimension of S is large, the BSXFUN approach checks of a lot of numbers although they are rejected in the former columns already. A small Matlab function can avoid testing already rejected rows:

```function Index = FindRow(S, s)
nCol  = size(S, 2);
match = S(:, 1) == s(1);
for col = 2:nCol
match(match) = S(match, col) == s(col);
end
Index = find(match);
```

For Image Analyst's example under Matlab 2009a/64/Win7:

```S = randi(3, 200, 3);
s = [2 3 1];
```
```tic, for k=1:1000; m = find(all(bsxfun(@eq,S,s),2)); end; toc
tic; for k=1:1000; m = find(ismember(S, s, 'rows')); end; toc
tic; for k=1:1000; m = FindRow(S, s); end; toc
```
```% 0.032688 sec
% 0.433527 sec
% 0.021643 sec
```

But for S = randi(30, 2000, 20), s=S(1000, :):

```% 0.124566 sec
% 3.175940 sec
% 0.194101 sec
```

[EDITED] It is easier in C to stop the search after the first matching row is found. This saves 50% processing time in the average. In addition no temporary arrays are required:

```#include "mex.h"
```
```void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
double *S, *s;
mwSize n1, n2, i1, i2, k;
```
```    S  = mxGetPr(prhs[0]);
n1 = mxGetM(prhs[0]);
n2 = mxGetN(prhs[0]);
s  = mxGetPr(prhs[1]);  ```
```    for (i1 = 0; i1 < n1; i1++) {
k = i1;
for (i2 = 0; i2 < n2; i2++) {
if (S[k] != s[i2]) {
break;
}
k += n1;
}```
```       if (i2 == n2) {  // Matching row found:
plhs[0] = mxCreateDoubleScalar((double) (i1 + 1));
return;
}
}```
```    // No success:
plhs[0] = mxCreateDoubleScalar(mxGetNaN());
}```

For S = randi(30, 2000, 20) I get (compiled by MSVC2008):

```BSXFUN:      0.123305 sec
FindRow.m:   0.194914 sec
ISMEMBER:    3.172425 sec
FindRow.mex: 0.011098 sec (s is last row)
0.007880 sec (s is 1000th row)
0.004753 sec (s is first row)
```

The MEX has an average speedup of factor 15. For a [20000 x 200] matrix we get a factor of 260 already.

yosey on 26 May 2013

Thanks a lot for your support. This sounds great. I will need some time to play around with it. I never worked with either C or MEX.

: Ok, I get a speedup factor of 3. Thats good, saves me a lot of cpu time and speeds up my overall code by 30%.

Jan Simon on 27 Dec 2013

Dear Jacob L,

The request you've sent me by email would have been a perfect question for this forum. Therefore I answer here, how to modify this function to work for UINT8 arrays:

```#include "mex.h"
```
```void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
uint8_T *M, *v;
mwSize  m, n, i1, i2, k;
```
```    if (nrhs != 2 || !mxIsUint8(prhs[0]) || !mxIsUint8(prhs[1])) {
"2 UINT8 arrays required as input.");
}```
```    M = (uint8_T *) mxGetData(prhs[0]);
m = mxGetM(prhs[0]);
n = mxGetN(prhs[0]);
v = (uint8_T *) mxGetData(prhs[1]);  ```
```    if (mxGetNumberOfElements(prhs[1]) != n) {
"Sizes of inputs do not match.");
}```
```    for (i1 = 0; i1 < m; i1++) {
k = i1;
for (i2 = 0; i2 < n; i2++) {
if (M[k] != v[i2]) {
break;
}
k += m;
}```
```       if (i2 == n) {  // Matching row found:
plhs[0] = mxCreateDoubleScalar((double) (i1 + 1));
return;
}
}```
```    // No success:
plhs[0] = mxCreateDoubleMatrix(0, 0, mxREAL);
}```
Answer by Image Analyst on 25 May 2013

Try this:

```% Generate some sample data:
m = randi(3, 200, 3)
% Specify what we're looking for:
lookingFor = [2 3 1]; % Let's say we're looking for this
% Find out what rows that pattern occurs in:
rowsItOccursIn = find(ismember(m, lookingFor, 'rows'))
```

yosey on 25 May 2013

I've allready tryed ismember and wondered myself how slow it is (for this task).

Image Analyst on 25 May 2013

What are the actual sizes of n, N, and S? For 4 and 25, it DEFINITELY should not take "hours of calculation time" like you said.

yosey on 26 May 2013

I've added dimensional details. High cpu time results due to multiple calls.

Answer by Teja Muppirala on 26 May 2013

Not sure if this is faster, but might be worth a try.

```d1 = 12650;
d2 = 4;
S = randn(d1,d2);
s = S(randi(d1),:);
c = 1;
r = 1;
```
```while c <= d2
if S(r,c) == s(c)
c = c+1;
else
r = r+1;
c = 1;
end
end
isequal(S(r,:),s)
```

If the big S is the same every time, you could make a better algorithm by sorting it in advance.

## 1 Comment

yosey on 26 May 2013

The while loop is pretty fast, but cant beat the mex file. Here are the times for 1000 loops.

```Elapsed time is 0.143702 seconds. % find(all(bsxfun(@eq,S,s),2))
Elapsed time is 0.059386 seconds. % findrow_mex
Elapsed time is 0.118347 seconds. % your while loop
```

Since I've got some trouble reading the C-code, isn't the mex-code the same as your while loop?