How to quickly find the column index of the last non-zero element in all rows in a sparse matrix?

Question

Benson Gou on 19 Mar 2020

0
Link

Direct link to this question

https://www.mathworks.com/matlabcentral/answers/511800-how-to-quickly-find-the-column-index-of-the-last-non-zero-element-in-all-rows-in-a-sparse-matrix

Edited: Ameer Hamza on 22 Mar 2020

Hi, All,

I have a big sparse matrix A. I want to find out the column index of the Last non-zero element in all rows from the end. Here is my code:

reorderRow = [];

for jRow = 1 : length(A(:,1))

eee = find(A(jRow,:),1,'last');

reorderRow = [reorderRow eee];

end

For example, I have matrix A = [2 0 0;0 5 0;0 0 1;0 3 0;-1 0 -4]. My code gives the following result:

reorderRow = [1 2 3 2 3];

I am wondering if it is possible to obtain reorderRow without iterations. Thanks a lot.

Benson

0 Comments
Show -2 older commentsHide -2 older comments

Sign in to comment.

Sign in to answer this question.

Answer 1

BobH on 19 Mar 2020

1
Link

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/511800-how-to-quickly-find-the-column-index-of-the-last-non-zero-element-in-all-rows-in-a-sparse-matrix#answer_420980

Edited: BobH on 19 Mar 2020

Open in MATLAB Online

reorderRow = arrayfun(@(X)  find(A(X,:),1,'last'), 1:size(A,1))
reorderRow =
     1     2     3     2     3

4 Comments
Show 2 older commentsHide 2 older comments

Ameer Hamza on 19 Mar 2020

Benson, I think you accepted the wrong answer. John's answer is under this answer.

BobH on 19 Mar 2020

Yes, Benson, please re-select John's answer

Sign in to comment.

Answer 2

John D'Errico on 19 Mar 2020

4
Link

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/511800-how-to-quickly-find-the-column-index-of-the-last-non-zero-element-in-all-rows-in-a-sparse-matrix#answer_420988

Edited: John D'Errico on 19 Mar 2020

Open in MATLAB Online

It is important that if your matrix is sparse, that we use sparse matrix examples to test anything out.

Regardless, I would always do a comparison of other methods to a direct find. For example...

A = sprand(10000,10000,.0005);

So, on average, 5 elements per row. If some rows have no non-zeros, then this will return zero as the index of the last element.

tic
[ir,ic] = find(A);
maxcolind = accumarray(ir,ic,[10000,1],@max);
toc
Elapsed time is 0.026327 seconds.

This will not fail, even if some rows are empty of non-zeros.

sum(maxcolind == 0)
ans =
    59

The random example I created had 59 rows that were entirely zero.

If you want to find the array reorderRow as you asked for, then just do this:

r = find(maxcolind);
reorderRow = [r,maxcolind(r)];

1 Comment
Show -1 older commentsHide -1 older comments

Benson Gou on 19 Mar 2020

Dear John,

Thanks a lot. Your code took 0.002112 s for a sparse matrix 974 x 632. It works well.

Benson

Sign in to comment.

Answer 3

James Tursa on 19 Mar 2020

3
Link

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/511800-how-to-quickly-find-the-column-index-of-the-last-non-zero-element-in-all-rows-in-a-sparse-matrix#answer_420993

Edited: James Tursa on 19 Mar 2020

Open in MATLAB Online

You can use a mex routine for this and avoid all temporary data copies. E.g., a direct approach:

/*   File:  find_last_nz.c
 *
 *   C = find_last_nz(S)
 *
 *   Finds column number of last non-zero element in each row of sparse matrix
 *   S = sparse matrix (double or logical)
 *   C = full column vector containing the columns numbers for each row
 *
 *   Programmer: James Tursa
 */
/* Includes ----------------------------------------------------------- */
#include "mex.h"
/* Gateway ------------------------------------------------------------ */
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    double *Cpr;
    mwIndex *Ir, *Jc;
    size_t i, j, imax, m, n;
    
    if( nlhs > 1 ) {
        mexErrMsgTxt("Too many outputs.");
    }
    if( nrhs != 1 || !mxIsSparse(prhs[0]) ) {
        mexErrMsgTxt("Need exactly one sparse input.");
    }
    Ir = mxGetIr(prhs[0]);
    Jc = mxGetJc(prhs[0]);
    m = mxGetM(prhs[0]);
    n = mxGetN(prhs[0]);
    plhs[0] = mxCreateDoubleMatrix(m,1,mxREAL);
    Cpr = (double *) mxGetData(plhs[0]);
    for( j=0; j<n; j++ ) {
        imax = Jc[j+1] - Jc[j];
        for( i=0; i<imax; i++ ) {
            Cpr[*Ir++] = j+1;
        }
    }
}

And a sample run:

>> mex find_last_nz.c
Building with 'Microsoft Windows SDK 7.1 (C)'.
MEX completed successfully.
>> A = rand(10).*(rand(10)<.2)
A =
    0.6443         0    0.3111         0         0         0    0.0377         0         0         0
         0         0         0         0         0         0         0         0         0         0
    0.8116         0         0    0.6028    0.8010         0         0    0.4942         0         0
    0.5328         0         0         0         0         0         0         0         0         0
         0         0    0.9049    0.2217         0         0    0.0987         0         0         0
    0.9390         0         0    0.1174         0         0         0    0.9037         0    0.1679
         0         0         0         0         0    0.6791         0         0         0         0
         0         0         0    0.3188         0    0.3955         0         0         0         0
         0    0.2277         0         0         0         0    0.1366         0         0    0.5005
    0.5870    0.4357         0         0         0    0.9880         0         0    0.5767    0.4711
>> find_last_nz(sparse(A))
ans =
     7
     0
     8
     1
     7
    10
     6
     6
    10
    10

What is presented above is a one-pass approach that only looks at the internal sparse indexing arrays.

It is possible that it might be faster to transpose the matrix first so that I could pick off the column numbers directly from the internal arrays, but that would require a data copy and I haven't written and tested this approach. That is, picking off the last non-zero row number for each column would be an extremely fast operation in a mex routine, but for your case the matrix would have to be transposed first which would probably dominate the run-time so I haven't bothered to write and test it.

EDIT

A timing comparison with the m-code method posted by John:

>> A = sprand(10000,10000,.001);
>> 
>> tic;[ir,ic] = find(A);maxcolind = accumarray(ir,ic,[10000,1],@max);toc
Elapsed time is 0.029285 seconds.
>> tic;[ir,ic] = find(A);maxcolind = accumarray(ir,ic,[10000,1],@max);toc
Elapsed time is 0.030831 seconds.
>> tic;[ir,ic] = find(A);maxcolind = accumarray(ir,ic,[10000,1],@max);toc
Elapsed time is 0.028965 seconds.
>> 
>> tic;C=find_last_nz(A);toc
Elapsed time is 0.000260 seconds.
>> tic;C=find_last_nz(A);toc
Elapsed time is 0.000253 seconds.
>> tic;C=find_last_nz(A);toc
Elapsed time is 0.000278 seconds.
>> 
>> isequal(maxcolind,C)
ans =
  logical
   1

So, the mex routine is about 100x faster for this particular example.

6 Comments
Show 4 older commentsHide 4 older comments

Matt J on 20 Mar 2020

Benson, you just need to copy the Matt's code and paste it in a file named find_last_nz.c. Then run

James's code, I think you mean.

Benson Gou on 22 Mar 2020

Dear Matt,

Good morning!

Thanks a lot for your great help. You have a good weekend!

Benson

Sign in to comment.

Answer 4

Ameer Hamza on 19 Mar 2020

1
Link

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/511800-how-to-quickly-find-the-column-index-of-the-last-non-zero-element-in-all-rows-in-a-sparse-matrix#answer_420981

Edited: Ameer Hamza on 19 Mar 2020

Open in MATLAB Online

For starters, do preallocation, It can speed up the execution if the matrix A has many rows. The following code compares the execution time for your code, BobH's code, and the code with pre-allocation. I intentionally made the A matrix with 10000 rows to show the difference

A = randi([1 5], 10000, 3);
% your code
tic
reorderRow = [];
for jRow = 1 : length(A(:,1))
    eee = find(A(jRow,:),1,'last');
    reorderRow = [reorderRow eee];
end
toc
% arrayfun method
tic
reorderRow = arrayfun(@(X)  find(A(X,:),1,'last'), 1:length(A));
toc
% pre-allocating the array
tic
reorderRow = zeros(1, size(A,1));
for jRow = 1 : length(A(:,1))
    eee = find(A(jRow,:),1,'last');
    reorderRow(jRow) = eee;
end
toc

Result

Elapsed time is 0.068404 seconds. % your code (slowest)
Elapsed time is 0.048778 seconds. % array fun (medium)
Elapsed time is 0.015521 seconds. % pre-allocation (fastest)

Pre-allocation is about 3.5 times faster than the original code.

5 Comments
Show 3 older commentsHide 3 older comments

Ameer Hamza on 19 Mar 2020

Open in MATLAB Online

John, Thanks for pointing out. I noticed that before, but all of these routines are essentially calling the same functions find(A(jRow,:),1,'last') with similar options, albeit differently. So, I thought this should be a fair comparison. If there is some other method that does not involve find(), than sparsity might play a role in comparison.

I tried with the sparse matrix as given by OP, and results appear to be similar.

A = [2 0 0;0 5 0;0 0 1;0 3 0;-1 0 -4];
A = repmat(A, 2000, 1); % <--- simply repeated the matrix given by OP so have same sparsity
% your code
tic
reorderRow = [];
for jRow = 1 : length(A(:,1))
    eee = find(A(jRow,:),1,'last');
    reorderRow = [reorderRow eee];
end
toc
% arrayfun method
tic
reorderRow = arrayfun(@(X)  find(A(X,:),1,'last'), 1:length(A));
toc
% pre-allocating the array
tic
reorderRow = zeros(1, size(A,1));
for jRow = 1 : length(A(:,1))
    eee = find(A(jRow,:),1,'last');
    reorderRow(jRow) = eee;
end
toc

Result:

Elapsed time is 0.076666 seconds.
Elapsed time is 0.056438 seconds.
Elapsed time is 0.015933 seconds.

Ameer Hamza on 19 Mar 2020

Yes, I saw your solution. That's indeed brilliant.

Benson Gou on 19 Mar 2020

Dear Ameer,

Thanks a lot for your great help.

Benson

Sign in to comment.

Answer 5

Matt J on 19 Mar 2020

0
Link

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/511800-how-to-quickly-find-the-column-index-of-the-last-non-zero-element-in-all-rows-in-a-sparse-matrix#answer_420999

Edited: Matt J on 19 Mar 2020

Open in MATLAB Online

[~,idx]=max(fliplr(logical(A)),[],2);
result=size(A,2)+1-idx;

6 Comments
Show 4 older commentsHide 4 older comments

Matt J on 20 Mar 2020

Edited: Matt J on 20 Mar 2020

You need to use flipud() so that "first" becomes "last", similar to how fliplr() did this in my solution above for row-wise search.

Benson Gou on 22 Mar 2020

Thanks, Matt. It is very helpful. I will try your idea.

Best regards,

Benson

Sign in to comment.

How to quickly find the column index of the last non-zero element in all rows in a sparse matrix?

0 Comments
Show -2 older commentsHide -2 older comments

Accepted Answer

4 Comments
Show 2 older commentsHide 2 older comments

More Answers (4)

1 Comment
Show -1 older commentsHide -1 older comments

6 Comments
Show 4 older commentsHide 4 older comments

5 Comments
Show 3 older commentsHide 3 older comments

6 Comments
Show 4 older commentsHide 4 older comments

See Also

Categories

Tags

Community Treasure Hunt

How to quickly find the column index of the last non-zero element in all rows in a sparse matrix?

0 Comments Show -2 older commentsHide -2 older comments

Accepted Answer

4 Comments Show 2 older commentsHide 2 older comments

More Answers (4)

1 Comment Show -1 older commentsHide -1 older comments

6 Comments Show 4 older commentsHide 4 older comments

5 Comments Show 3 older commentsHide 3 older comments

6 Comments Show 4 older commentsHide 4 older comments

See Also

Categories

Tags

Community Treasure Hunt

0 Comments
Show -2 older commentsHide -2 older comments

4 Comments
Show 2 older commentsHide 2 older comments

1 Comment
Show -1 older commentsHide -1 older comments

6 Comments
Show 4 older commentsHide 4 older comments

5 Comments
Show 3 older commentsHide 3 older comments

6 Comments
Show 4 older commentsHide 4 older comments