19 views (last 30 days)

Dear all,

I was wondering if it is possible to increase the performance of a matrix-free product of a sparse matrix defined by 3 vectors (rows, columns and values) with another vector {B}. That is:

A(l,m)=v and performing {C}=[A]*{B}

So far I have the following strategy:

function C = mfree_times(l,m,v,B)

aux = B(m);

prod = v.*aux;

C = accumarray(l,prod);

end

My intentions are to increase this performance as much as possible. The bottleneck is majorly given by accumarray function. I am already using l, m and v as gpuArrays. Please share your thoughts. Thanks!

James Tursa
on 6 Sep 2018

Edited: James Tursa
on 6 Sep 2018

Assuming the vectors are 3-tuples of row, column, and value, and that all of the variables involved are full double vectors, here is a naive mex routine to do the multiplication and produce a full vector result. If you are working with R2018a and compiling with the -R2018a option then replace the mxGetPr calls with mxGetDoubles calls.

/* C = mfree_timesx(r,c,v,B) does the equivalent of:

*

* C = sparse(r,c,v,numel(B),numel(B)) * B

*

* where

*

* r,c,v are 3-tuples of values from an MxM matrix:

* r = row index (1-based)

* c = column index (1-based)

* v = values

* B = column vector of size Mx1

* C = column vector of size Mx1

*

* It is assumed that the indexing contained in the r and c vectors are

* all within an MxM square matrix (i.e., are integers between 1 and M).

* (The code does *not* check for this. A violation will bomb this routine)

*

* Duplicate indexing within the r and c vectors is allowed, and is handled

* by simply adding the results together (same as what would happen if you

* did C = sparse(r,c,v,numel(B),numel(B)) * B).

*

* All inputs must be full real double vectors.

*

* The output is a full real double column vector.

*

* Programmer: James Tursa

*

*/

/* Includes ----------------------------------------------------------- */

#include "mex.h"

/* Gateway ------------------------------------------------------------ */

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])

{

mwSize i, n, m;

double *Ar, *Ac, *Av, *B, *C;

if( nrhs != 4 ) {

mexErrMsgTxt("Need exactly four double input vectors");

}

if( nlhs > 1 ) {

mexErrMsgTxt("Too many outputs");

}

if( !(mxIsDouble(prhs[0]) && mxIsDouble(prhs[1]) && mxIsDouble(prhs[2]) && mxIsDouble(prhs[3])) ||

mxIsComplex(prhs[0]) || mxIsComplex(prhs[1]) || mxIsComplex(prhs[2]) || mxIsComplex(prhs[3]) ||

mxIsSparse(prhs[0]) || mxIsSparse(prhs[1]) || mxIsSparse(prhs[2]) || mxIsSparse(prhs[3]) ) {

mexErrMsgTxt("All inputs must be full real double vectors");

}

n = mxGetNumberOfElements(prhs[0]);

if( mxGetNumberOfElements(prhs[1]) != n || mxGetNumberOfElements(prhs[2]) != n ) {

mexErrMsgTxt("The first three inputs must be the same size");

}

m = mxGetNumberOfElements(prhs[3]);

if( mxGetM(prhs[3]) != m ) {

mexErrMsgTxt("The last input must be a column vector");

}

Ar = mxGetPr(prhs[0]);

Ac = mxGetPr(prhs[1]);

Av = mxGetPr(prhs[2]);

B = mxGetPr(prhs[3]);

plhs[0] = mxCreateDoubleMatrix(m,1,mxREAL);

C = mxGetPr(plhs[0]);

for( i=0; i<n; i++ ) {

C[(mwSize)*Ar++ - 1] += *Av++ * B[(mwSize)*Ac++ - 1];

}

}

Some sample test code:

N = 5000;

r = ceil(N*rand(N*N,1));

c = ceil(N*rand(N*N,1));

v = rand(N*N,1);

B = rand(N,1);

disp(' ')

disp('mex routine')

tic

x = mfree_timesx(r,c,v,B);

toc

disp('sparse')

tic

s = sparse(r,c,v,numel(B),numel(B))*B;

toc

disp('accumarray')

tic

aux = B(c);

prod = v.*aux;

C = accumarray(r,prod);

toc

disp(' ')

disp('mex vs sparse')

max(abs(x-s))

disp('mex vs accumarray')

max(abs(x-C))

disp('sparse vs accumarray')

max(abs(s-C))

disp(' ')

and results:

mex routine

Elapsed time is 0.150372 seconds.

sparse

Elapsed time is 5.806035 seconds.

accumarray

Elapsed time is 0.666030 seconds.

mex vs sparse

ans =

1.2960e-011

mex vs accumarray

ans =

0

sparse vs accumarray

ans =

1.2960e-011

James Tursa
on 6 Sep 2018

Yes, that small loop does all of the real work. Everything else is just overhead. If you want I can add in code to check the individual r and c indexes to make sure they are legit.

Also, if these variables are really mxGPUArray variables, then you will need to replace all of the mxETC( ) functions with their mxGPU( ) counterparts. E.g., use (double *)mxGPUGetData(etc) instead of mxGetPr(etc).

Sign in to comment.

Bruno Luong
on 5 Sep 2018

Can you give the detail about how you iterate ?

There seems no room to improve the basic block mfree_times

If you output is sparse and you add them, and internally it requires new memory allocation and merge sorted array.

Better building all collection triplets then call ACCUMARRAY once.

Bruno Luong
on 6 Sep 2018

It seems index complement of "l" has no true impact on CG algorithm. If it is a small vector with large value, you might want to squeeze as consecutive vector (using UNIQUE command) and work on the subspace of the new value.

That prevent CG work unnecessary large vector/matrix.

Then after converging, you can recover original index of the solution if that is matter to use the result.

Bruno Luong
on 6 Sep 2018

It seems that your problem is memory and not inneficient calculation.

If the matrix is sparse (a lot of zero) you should consider my suggestion if squeezing subspace.

By do try to reduce memory by using storage the indexes with INT32, matrix values with SINGLE, or using TALL ARRAY for more efficient HD cache.

Sign in to comment.

Sign in to answer this question.

Opportunities for recent engineering grads.

Apply Today
## 10 Comments

## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/417540-a-fast-way-to-perform-sparse-matrix-free-product-with-a-vector#comment_606965

⋮## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/417540-a-fast-way-to-perform-sparse-matrix-free-product-with-a-vector#comment_606965

## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/417540-a-fast-way-to-perform-sparse-matrix-free-product-with-a-vector#comment_606970

⋮## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/417540-a-fast-way-to-perform-sparse-matrix-free-product-with-a-vector#comment_606970

## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/417540-a-fast-way-to-perform-sparse-matrix-free-product-with-a-vector#comment_607001

⋮## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/417540-a-fast-way-to-perform-sparse-matrix-free-product-with-a-vector#comment_607001

## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/417540-a-fast-way-to-perform-sparse-matrix-free-product-with-a-vector#comment_607008

⋮## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/417540-a-fast-way-to-perform-sparse-matrix-free-product-with-a-vector#comment_607008

## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/417540-a-fast-way-to-perform-sparse-matrix-free-product-with-a-vector#comment_607019

⋮## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/417540-a-fast-way-to-perform-sparse-matrix-free-product-with-a-vector#comment_607019

## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/417540-a-fast-way-to-perform-sparse-matrix-free-product-with-a-vector#comment_607026

⋮## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/417540-a-fast-way-to-perform-sparse-matrix-free-product-with-a-vector#comment_607026

## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/417540-a-fast-way-to-perform-sparse-matrix-free-product-with-a-vector#comment_607039

⋮## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/417540-a-fast-way-to-perform-sparse-matrix-free-product-with-a-vector#comment_607039

## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/417540-a-fast-way-to-perform-sparse-matrix-free-product-with-a-vector#comment_607046

⋮## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/417540-a-fast-way-to-perform-sparse-matrix-free-product-with-a-vector#comment_607046

## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/417540-a-fast-way-to-perform-sparse-matrix-free-product-with-a-vector#comment_607413

⋮## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/417540-a-fast-way-to-perform-sparse-matrix-free-product-with-a-vector#comment_607413

## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/417540-a-fast-way-to-perform-sparse-matrix-free-product-with-a-vector#comment_607524

⋮## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/417540-a-fast-way-to-perform-sparse-matrix-free-product-with-a-vector#comment_607524

Sign in to comment.