MATLAB and Simulink resources for Arduino, LEGO, and Raspberry Pi

Speed up a "logical matrix multiply"

Asked by the cyclist on 17 Apr 2011

I am struggling to speed up the following nested loop:

```% Create two random logical arrays with a few trues
A = randn(5) > 0.8;
B = randn(5) > 0.8;
```
```% Preallocate result
C = false(5);
```
```% Do the logical "matrix multiply". Slow loops!
for ii=1:5
for jj=1:5
C(ii,jj) = any(A(ii,:)&B(:,jj)');
end
end
```
```% Equivalent, but more memory for the interim doubles
C2 = (double(A)*double(B)) > 0;
```

This is analogous to matrix multiplication, but for logical arrays. "C" is true where non-zero values would be in the matrix multiplication. I have shown the equivalent calculation of "C2" to illustrate this. This is faster, but uses more memory, which is bad when the arrays get large.

I appreciate any thoughts on how to speed up the logical version.

1 Comment

the cyclist on 18 Apr 2011

Thanks to everyone who answered. I'm not going to accept any single answer, because they actually all have merit in different situations, and wouldn't want future seekers to dismiss any of them.

Products

No products are associated with this question.

Answer by Matt Fig on 17 Apr 2011

For larger arrays, this should be faster. For even larger arrays, perhaps a transpose of A first would speed things up more. If that is done, the indexing order into A (in loop) would be reversed.

```C2 = false(N);
V = ones(1,N,'single');
for ii=1:N
TMP = A(ii,:).';
C2(ii,:) = any(TMP(:,V)&B);
end
```

For N=200, and both inside a function M-file, I get:

```Elapsed time is 1.934873 seconds. % your code
Elapsed time is 0.616070 seconds.
```

Answer by Teja Muppirala on 17 Apr 2011
```C = full(double(sparse(A))*sparse(B) > 0);
```

Answer by Jos (10584) on 18 Apr 2011

As for the creation of C2 you can do without one conversion

```C2 = (A*double(B)) > 0;
```

Answer by Robert Cumming on 17 Apr 2011

extracting out the A(ii,:) to the ii loop is slightly faster...

```tic
for ii=1:5
temp=A(ii,:);
for jj=1:5
C(ii,jj) = any(temp& B(:,jj)');
end
end
toc
```

Comparing I get:

```Elapsed time is 0.000505 seconds. %(Your code)
Elapsed time is 0.000439 seconds. %(modified loop above)
```

Answer by James Tursa on 17 Apr 2011

Here is a mex approach. I am using the brute force matrix multiply algorithm without paying much attention to memory access patterns other than linear access on the result. Could do an OpenMP implementation of this as well if there is enough interest. Could also take pieces from the engine of my MTIMESX FEX submission to allow for transposes.

```/*---------------------------------------------------------------------*/
/* File: mtimesg.c                                                     */
/* Function: C = mtimesg(A,B)                                          */
/*           A = 2D Logical matrix                                     */
/*           B = 2D Logical matrix                                     */
/*           C = Logical matrix multiply result where * is replaced    */
/*               with AND and + is replaced with OR.                   */
/* Programmer: James Tursa                                             */
/* Date: 17-Apr-2011                                                   */
/*---------------------------------------------------------------------*/
```
```#include "mex.h"
```
```void mexFunction(int nlhs, mxArray *plhs[],int nrhs, const mxArray *prhs[])
{
mxLogical *Apr, *Bpr, *Cpr, *Apr0, *Bpr0;
mwSize i, j, k, l, m, n, p;
```
```	if( nlhs > 1 ) {
mexErrMsgTxt("Too many outputs.");
}
if( nrhs != 2 ||
!mxIsLogical(prhs[0]) || mxGetNumberOfDimensions(prhs[0]) != 2 ||
!mxIsLogical(prhs[1]) || mxGetNumberOfDimensions(prhs[1]) != 2 ) {
mexErrMsgTxt("Need exactly two logical 2D input matrices.");
}
m = mxGetM(prhs[0]);
k = mxGetN(prhs[0]);
l = mxGetM(prhs[1]);
n = mxGetN(prhs[1]);
if( k != l ) {
mexErrMsgTxt("Inner matrix dimensions must agree.");
}
Apr0 = Apr = mxGetLogicals(prhs[0]);
Bpr0 = Bpr = mxGetLogicals(prhs[1]);
plhs[0] = mxCreateLogicalMatrix(m,n);
Cpr = mxGetLogicals(plhs[0]);
for( j=0; j<n; j++ ) {
for( i=0; i<m; i++ ) {
Apr = Apr0 + i;
Bpr = Bpr0;
for( p=0; p<k; p++ ) {
if( *Apr && *Bpr ) {
*Cpr = 1;
break; /* short circuit if answer known */
}
Apr += m;
Bpr++;
}
Cpr++;
}
Bpr0 += k;
}
}
```

To mex this type the following at the MATLAB command line prompt:

mex -setup

(Then press Enter)

(Then enter the number of a C compiler such as lcc)

(Then press Enter again)

mex mtimesg.c