
Thread Subject:
Sparse matrix indexing speed -- shape sensitivity

Subject: Sparse matrix indexing speed -- shape sensitivity

From: Matt J

Date: 27 Aug, 2011 23:30:31

Message: 1 of 18

I know you're not supposed to build sparse matrices using assignment operations, but I don't think it explains the drastic shape sensitivity that I'm seeing below. Does anyone understand what's happening?


n=500;
rhs=rand(n);
idx=false(n);
idx(1:2:2*n,1:2:2*n)=true; % note: this grows idx to (2n-1)-by-(2n-1)
linearIndex=find(idx);

A1=spalloc(1000,2000,500^2);
A2=reshape(A1,[],2);

tic
A1(linearIndex)=rhs;
toc % Elapsed time is 0.053629 seconds.

tic
A2(linearIndex)=rhs;
toc % Elapsed time is 22.884611 seconds.

Subject: Sparse matrix indexing speed -- shape sensitivity

From: Bruno Luong

Date: 28 Aug, 2011 07:24:11

Message: 2 of 18

The time is even worse for

A3=reshape(spalloc(1000,2000,500^2),2,[]);

tic
 A3(linearIndex)=rhs;
toc % 354.819217 seconds.

It can only be down to the way TMW programmed it, and who knows what they do.

All I know is that my SETSPARSE function on the FEX does much better than MATLAB, and its time is consistent across the three cases:

A4=A1;
A5=A2;
A6=A3;

tic
[i j] = ind2sub(size(A4),linearIndex);
A4=setsparse(A4,i,j,rhs);
toc % 0.045450 seconds.

tic
[i j] = ind2sub(size(A5),linearIndex);
A5=setsparse(A5,i,j,rhs);
toc % 0.037615 second

tic
[i j] = ind2sub(size(A6),linearIndex);
A6=setsparse(A6,i,j,rhs);
toc % 0.048041 seconds.

So there is no reason why it should perform that badly with built-in assignment. This is one of the reasons I wrote this package. I'll delete it once they catch up.

Bruno

Subject: Sparse matrix indexing speed -- shape sensitivity

From: Bruno Luong

Date: 28 Aug, 2011 08:12:10

Message: 3 of 18

For those who don't know, the SETSPARSE function is from:
http://www.mathworks.com/matlabcentral/fileexchange/23488-sparse-sub-access

Bruno

Subject: Sparse matrix indexing speed -- shape sensitivity

From: Bruno Luong

Date: 28 Aug, 2011 10:03:08

Message: 4 of 18

I did some tests, and it looks like the sparse assignment command

S(linearIndex)=rhs;

will be treated sequentially as:

for i = 1:length(linearIndex)
    S(linearIndex(i)) = rhs(i);
end

This of course moves the internal data around a lot, which partly explains why it's not efficient, but it does not explain why the shape of the matrix has such a huge effect. What I suspect is that the internal mechanism for reserving the data buffer does much more memory reallocation for tall/thin matrices than for square ones. But that is pure speculation on my part.
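
Bruno's sequential-assignment picture can be made concrete. MATLAB stores sparse matrices in compressed sparse column (CSC) form, so inserting one new nonzero must keep the packed value/row-index arrays sorted, which copies every later entry. A minimal sketch in Python/NumPy (illustrative only; the array names are made up, not MATLAB internals):

```python
import numpy as np

# Packed CSC-style storage for the nonzeros of one column:
vals = np.array([1.0, 2.0, 3.0])   # nonzero values, ordered by row
rows = np.array([0, 2, 5])         # their row indices

# Inserting a new nonzero at row 1 must keep the arrays sorted, so the
# whole tail gets copied -- O(nnz) work for a single scalar assignment:
pos = int(np.searchsorted(rows, 1))   # where the new entry belongs
vals = np.insert(vals, pos, 9.0)      # copies everything after pos
rows = np.insert(rows, pos, 1)

print(vals.tolist())   # [1.0, 9.0, 2.0, 3.0]
print(rows.tolist())   # [0, 1, 2, 5]
```

Done once per element of linearIndex, this tail-copying is what makes element-by-element sparse assignment quadratic in the worst case.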

Bruno

Subject: Sparse matrix indexing speed -- shape sensitivity

From: Matt J

Date: 28 Aug, 2011 13:18:10

Message: 5 of 18

"Bruno Luong" <b.luong@fogale.findmycountry> wrote in message <j3cqer$o66$1@newscl01ah.mathworks.com>...
>
>
> tic
> [i j] = ind2sub(size(A6),linearIndex);
> A6=setsparse(A6,i,j,rhs);
> toc % 0.048041 seconds.
===================

Bruno, is there any way you could write modified versions of SETSPARSE/GETSPARSE which instead of requiring that you specify every (i,j) pair to be indexed allows you to use MATLAB's usual subscripting.

For example,

A6=setsparseMODIFIED(A6,1:10,12:18,b);

would be equivalent to

A6(1:10,12:18)=b;

Could you do this and do you think it would be competitive with MATLAB subscripting?

If you could provide this, then users would have a fast workaround for the above problems without having to go through the time and memory overhead of ind2sub.

Subject: Sparse matrix indexing speed -- shape sensitivity

From: Bruno Luong

Date: 28 Aug, 2011 13:30:28

Message: 6 of 18

"Matt J" wrote in message <j3df6i$jnq$1@newscl01ah.mathworks.com>...
> "Bruno Luong" <b.luong@fogale.findmycountry> wrote in message <j3cqer$o66$1@newscl01ah.mathworks.com>...
> >
> >
> > tic
> > [i j] = ind2sub(size(A6),linearIndex);
> > A6=setsparse(A6,i,j,rhs);
> > toc % 0.048041 seconds.
> ===================
>
> Bruno, is there any way you could write modified versions of SETSPARSE/GETSPARSE which instead of requiring that you specify every (i,j) pair to be indexed allows you to use MATLAB's usual subscripting.
>
> For example,
>
> A6=setsparseMODIFIED(A6,1:10,12:18,b);
>
> would be equivalent to
>
> A6(1:10,12:18)=b;
>
> Could you do this and do you think it would be competitive with MATLAB subscripting?

What is the intended usage? Sparse matrices are mostly used for FEM and graph theory, and such indexing is rarely used in either case.

The only use I can think of is tensor products, but as you know better than I do, it's better not to build the final matrix at all.

Bruno

Subject: Sparse matrix indexing speed -- shape sensitivity

From: Matt J

Date: 28 Aug, 2011 14:44:11

Message: 7 of 18

"Bruno Luong" <b.luong@fogale.findmycountry> wrote in message <j3dftj$lkh$1@newscl01ah.mathworks.com>...
>
>
> What is the intention for usage of it? SPARSE matrix is mostly used for FEM and graph theory. Such indexing is rarely used in both cases.
================

I don't know about that. I've been using sparse matrices for quite some time and I've never done any work with FEM or graph theory.

My general idea was to let the subscript-indexing capabilities that MATLAB sparse matrices are already supposed to have be used at full speed.

A more specific problem for me is that this shape sensitivity is creating issues for my ndSparse class. An MxNxP ndSparse array stores its data internally as an MN-by-P ordinary sparse matrix, i.e., it can be very tall and thin, like A2 above, so indexing can be very slow.

In fact, if the modified setsparse/getsparse could accept N-dimensional indices, that would be even better. Currently I have to convert the first N-1 subscripts to linear indices, which is quite time- and memory-consuming.

Subject: Sparse matrix indexing speed -- shape sensitivity

From: Bruno Luong

Date: 28 Aug, 2011 17:20:28

Message: 8 of 18

"Matt J" wrote in message <j3dk7r$3mp$1@newscl01ah.mathworks.com>...
>
>
> In fact, if the modified setsparse/getsparse could accept N-dimensional indices that would be even better. Currently, I have to convert the first N-1 subscripts to linear indices, which is quite time/memory consuming.

I don't know, Matt.

1) SETSPARSE/GETSPARSE is designed for regular 2D arrays; it knows nothing about the N-dimensional indexing of ndSparse, and it shouldn't have any implementation along those lines.

2) As I understand it, the way ndSparse combines the first n-1 indices into one linear index is a limitation. Are you able to store an ndSparse of dimension [1e6 1e6 1e6 10] (even an empty one)?

It looks to me like a more serious implementation of an n-dimensional sparse array should have its own real index management, using n subscript vectors to avoid any index overflow. And to make it fast, a dedicated C engine is probably needed.

Undertaking such an implementation is quite a big task. I'm not sure how many people are truly interested in nd-sparse arrays and intend to do any serious work with them. Can it justify the effort for volunteers like you and me?

Bruno

Subject: Sparse matrix indexing speed -- shape sensitivity

From: Matt J

Date: 28 Aug, 2011 22:34:11

Message: 9 of 18

"Bruno Luong" <b.luong@fogale.findmycountry> wrote in message <j3dtcs$rjq$1@newscl01ah.mathworks.com>...
>
> 2) As I understand they way ndSparse combine the n-1 index to 1 linear index is a limitation. Are you able to store an ndSparse of dimension [1e6 1e6 1e6 10] (even an empty one)?
=================

No, it's true, I cannot. But MATLAB already has similar limitations on its 2D sparse matrices, which users must work around. Why should it be that this all-zero matrix stores perfectly fine

 A=sparse(1e8,1);

but not its transpose,

>> A.';
??? Error using ==> transpose
Out of memory. Type HELP MEMORY for your options.


Anyway, if extending the indexing supported by setsparse/getsparse is not easy to do, it's not worth the effort.

Subject: Sparse matrix indexing speed -- shape sensitivity

From: Loren Shure

Date: 29 Aug, 2011 14:30:07

Message: 10 of 18


"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message
news:j3efp3$h5h$1@newscl01ah.mathworks.com...
> "Bruno Luong" <b.luong@fogale.findmycountry> wrote in message
> <j3dtcs$rjq$1@newscl01ah.mathworks.com>...
>>
>> 2) As I understand they way ndSparse combine the n-1 index to 1 linear
>> index is a limitation. Are you able to store an ndSparse of dimension
>> [1e6 1e6 1e6 10] (even an empty one)?
> =================
>
> No, it's true, I cannot. But of course, MATLAB already has limitations
> similar to this presently on its 2D sparse matrices which user's must work
> around. Why should it be that this all-zero matrix stores perfectly fine
>
> A=sparse(1e8,1);
>
> but not its transpose,
>
>>> A.';
> ??? Error using ==> transpose
> Out of memory. Type HELP MEMORY for your options.
>
>
> Anyway, if extending the indexing supported by setsparse/getsparse is not
> easy to do, it's not worth the effort.


In the User Guide discussing sparse, there is this statement:

For more information on how MATLAB stores sparse matrices, see John R. Gilbert, Cleve Moler, and Robert Schreiber, "Sparse Matrices in MATLAB: Design and Implementation," SIAM Journal on Matrix Analysis and Applications, 13:1, 333-356 (1992). (link: http://www.mathworks.com/help/pdf_doc/otherdocs/simax.pdf)

--
Loren
http://blogs.mathworks.com/loren/
http://www.mathworks.com/matlabcentral/

Subject: Sparse matrix indexing speed -- shape sensitivity

From: James Tursa

Date: 29 Aug, 2011 14:56:09

Message: 11 of 18

"Matt J" wrote in message <j3efp3$h5h$1@newscl01ah.mathworks.com>...
>
> MATLAB already has limitations similar to this presently on its 2D sparse matrices which user's must work around. Why should it be that this all-zero matrix stores perfectly fine
>
> A=sparse(1e8,1);
>
> but not its transpose,
>
> >> A.';
> ??? Error using ==> transpose
> Out of memory. Type HELP MEMORY for your options.

Because for every sparse matrix, even an empty one, MATLAB allocates memory for an index array of size (number of columns + 1). For an empty matrix that array holds nothing but 0's, but the memory still gets allocated.
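
James's point about the column-pointer array is easy to demonstrate with scipy, whose csc_matrix uses the same compressed-column layout as MATLAB sparse (sizes below are scaled down from 1e8 purely for illustration):

```python
from scipy import sparse

# CSC storage keeps a column-pointer array of length (number of columns + 1),
# even when the matrix holds no nonzeros at all:
tall = sparse.csc_matrix((10**6, 1))   # like A = sparse(1e8, 1): one column
wide = sparse.csc_matrix((1, 10**6))   # like A.': a million columns

print(tall.indptr.size)   # 2       -> negligible overhead
print(wide.indptr.size)   # 1000001 -> megabytes of pointers, all for zeros
```

The overhead thus scales with the column count alone, which is exactly why `sparse(1e8,1)` stores fine while its transpose runs out of memory.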

James Tursa

Subject: Sparse matrix indexing speed -- shape sensitivity

From: Matt J

Date: 29 Aug, 2011 15:10:47

Message: 12 of 18

"James Tursa" wrote in message <j3g9a9$rf1$1@newscl01ah.mathworks.com>...
>
>
> Because for every sparse matrix, even an empty one, MATLAB allocates memory to hold an index array the size of (number of columns + 1). There will be a bunch of 0's in this index array for an empty matrix, but the memory still gets allocated.
===================

Yes, James, I know. The point of the (rhetorical) question, though, was to counter Bruno's point that ndSparse can't handle large sizes in all of its dimensions. While that's true, neither can normal MATLAB sparse matrices.

  

Subject: Sparse matrix indexing speed -- shape sensitivity

From: Steven_Lord

Date: 29 Aug, 2011 15:24:53

Message: 13 of 18



"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message
news:j3bumn$9ts$1@newscl01ah.mathworks.com...
> I know you're not supposed to build sparse matrices using assignment
> operations, but I don't think it explains the drastic shape sensitivity
> that I'm seeing below. Does anyone understand what's happening?

I've informed the developers responsible for sparse matrix assignment about
this case so that they can investigate the cause for the difference in
performance.

--
Steve Lord
slord@mathworks.com
To contact Technical Support use the Contact Us link on
http://www.mathworks.com

Subject: Sparse matrix indexing speed -- shape sensitivity

From: Bruno Luong

Date: 29 Aug, 2011 15:33:13

Message: 14 of 18

"Matt J" wrote in message <j3ga5n$194$1@newscl01ah.mathworks.com>...

> Yes, James, I know. The point of the (rhetorical) question, though, was to counter-argue Bruno's point that ndSparse can't handle large sizes in all of its dimensions. While that's true, neither can normal MATLAB sparse matrices.

But your argument does not hold up in my book. I'm talking about the limitation due to overflow when converting multiple indices to a linear index.

In MATLAB sparse, each dimension can reasonably go up to 1e7, and it can go to 1e9 if the RAM is big enough.

In the case of ndSparse, the product of the first n-1 dimensions must be reasonably small even if the computer's RAM were infinite, which is much more restrictive. That flaw is due to software design rather than a hardware limitation.
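
Bruno's overflow point checks out on the back of an envelope (sketched in Python, where integers don't overflow, so the comparison itself is exact):

```python
# The full element count of a [1e6 1e6 1e6 10] array is 1e19, which exceeds
# both intmax('int64') and flintmax (the largest integer a double-precision
# float represents exactly), so no single linear index can address it:
n_elems = 10**6 * 10**6 * 10**6 * 10

print(n_elems > 2**63 - 1)   # True: past int64
print(n_elems > 2**53)       # True: past MATLAB's exact-double range
```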

Bruno

Subject: Sparse matrix indexing speed -- shape sensitivity

From: Matt J

Date: 29 Aug, 2011 19:07:27

Message: 15 of 18

"Bruno Luong" <b.luong@fogale.findmycountry> wrote in message <j3gbfp$63s$1@newscl01ah.mathworks.com>...
>
> But your argument does not hold to my book. I'm talking about the limitation due to overflow when switching from multi-indexes to linear index.
>[snip]
> In the case of ndsparse, the product of n-1 first dimension must be reasonably small, even if the computer RAM is infinity, which is much more restrictive. This is flawed due to the software design rather than hardware limitation.
====================

That's all true, but again it's true of normal MATLAB sparse matrices as well. :-)

If you like, you can consider the purpose of ndSparse as something that allows you to start with a normal 2D MATLAB sparse matrix and then reshape it in N dimensions. All the same limitations on reshaping and indexing that are present in ndSparse are due to limitations already present in normal matrices.


> In Matlab sparse, each dimension can go up reasonably to 1e7, and it can goes to 1e9 if the RAM is big enough.
====================

So, yes, what you're effectively saying is that you would have hoped for better from ndSparse, and that's fine.

But I could equally well hope that the following normal sparse matrix, which contains nothing but zeros, wouldn't take up 400 MB of storage, and that I could reshape it to my liking.

>> A=sparse(1e9,1e8);
>> A(:);
??? Matrix is too large to convert to linear index.

I could even call these things software design limitations (the 400 MB surely is), but of course I know that there are trade-offs involved...

Subject: Sparse matrix indexing speed -- shape sensitivity

From: Paul Fackler

Date: 11 Mar, 2013 22:23:05

Message: 16 of 18

"Matt J" wrote in message <j3go1f$j31$1@newscl01ah.mathworks.com>...
> "Bruno Luong" <b.luong@fogale.findmycountry> wrote in message <j3gbfp$63s$1@newscl01ah.mathworks.com>...
> >
> > But your argument does not hold to my book. I'm talking about the limitation due to overflow when switching from multi-indexes to linear index.
> >[snip]
> > In the case of ndsparse, the product of n-1 first dimension must be reasonably small, even if the computer RAM is infinity, which is much more restrictive. This is flawed due to the software design rather than hardware limitation.
> ====================
>
> That's all true, but again it's true of normal MATLAB sparse matrices as well. :-)
>
> If you like, you can consider the purpose of ndSparse as something that allows you to start with a normal 2D MATLAB sparse matrix and then reshape it in N dimensions. All the same limitations on reshaping and indexing that are present in ndSparse are due to limitations already present in normal matrices.
>
>
> > In Matlab sparse, each dimension can go up reasonably to 1e7, and it can goes to 1e9 if the RAM is big enough.
> ====================
>
> So, yes, what you're effectively saying here is that you would have hoped for better from ndSparse and that's fine.
>
> But I could equally well hope that the following normal sparse matrix, which is full of only zeros wouldn't take up 400 MB of storage, and that I could reshape it to my liking.
>
> >> A=sparse(1e9,1e8);
> >> A(:);
> ??? Matrix is too large to convert to linear index.
>
> I could even call these things software design limitations (the 400MB surely is), but of course I know that there are trade-offs involved...


I know this is an old thread, but I have recently been trying to develop some tools for handling sparse multidimensional arrays. In particular, I am working on a MEX implementation to multiply two ND arrays and, optionally, sum over specified dimensions. This has been done in the tprod function submitted to MATLAB Central, but not for the sparse case (as far as I can tell).

The explanation for why shape matters when accessing elements of a sparse matrix involves the storage scheme MATLAB uses. It is easy to grab a column, but finding a given row within a column requires a search. The search is done in log(m) time, where m is the number of nonzero elements in the column. This means that tall, skinny matrices take much longer to access an element than short, wide ones (given the same number of elements).
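
Paul's column-search explanation can be sketched concretely. In CSC storage the column pointers delimit column j's nonzeros, and a binary search over the row-index array finds row i. Illustrated with scipy's csc_matrix, which uses the same layout as MATLAB sparse (the `csc_get` helper is a made-up name for illustration, not a real API):

```python
import numpy as np
from scipy import sparse

A = sparse.csc_matrix(np.array([[0., 5., 0.],
                                [7., 0., 0.],
                                [0., 3., 0.]]))

def csc_get(A, i, j):
    """Read A[i, j] the way CSC storage does: O(log m) per lookup,
    where m is the number of nonzeros in column j."""
    lo, hi = A.indptr[j], A.indptr[j + 1]          # column j's slice
    k = lo + np.searchsorted(A.indices[lo:hi], i)  # binary search for row i
    if k < hi and A.indices[k] == i:
        return float(A.data[k])
    return 0.0                                     # structural zero

print(csc_get(A, 1, 0))   # 7.0
```

A tall, thin shape packs all the nonzeros into a few long columns, so m is large and every lookup pays a longer search; a wide shape spreads them over many short columns, which is the shape sensitivity seen in the original timings.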

It is not clear to me that there is any way around this problem; all sparse storage schemes have their limitations. Issues of multidimensional storage and arithmetic have been around for a long time and continue to arise. It is a shame that MathWorks has not developed a better way to address such a fundamental issue in applied computational work.

Subject: Sparse matrix indexing speed -- shape sensitivity

From: Matt J

Date: 19 Mar, 2013 14:47:10

Message: 17 of 18

"Paul Fackler" <paulunderscorefackler@ncsudot.edu> wrote in message <khllg9$shn$1@newscl01ah.mathworks.com>...
>
> It is not clear to me that there is any way around this problem. All sparse storage schemes have their limitations. Issues of multidimensional storage and arithmetic have been around a long time and continue to arise. It is a shame that Mathworks has not developed a better way to address such a fundamental issue in applied computational work.
==============

Those are good insights, Paul, but why is it "a shame that Mathworks has not developed a better way", if "it is not clear that there is any way around this problem"? Aside from the fact, of course, that it leaves us without a solution? :-)

Subject: Sparse matrix indexing speed -- shape sensitivity

From: Paul Fackler

Date: 19 Mar, 2013 20:56:05

Message: 18 of 18

"Matt J" wrote in message <ki9tpe$o80$1@newscl01ah.mathworks.com>...
> "Paul Fackler" <paulunderscorefackler@ncsudot.edu> wrote in message <khllg9$shn$1@newscl01ah.mathworks.com>...
> >
> > It is not clear to me that there is any way around this problem. All sparse storage schemes have their limitations. Issues of multidimensional storage and arithmetic have been around a long time and continue to arise. It is a shame that Mathworks has not developed a better way to address such a fundamental issue in applied computational work.
> ==============
>
> Those are good insights, Paul, but why is it "a shame that Mathworks has not developed a better way", if "it is not clear that there is any way around this problem"? Aside from the fact, of course, that it leaves us without a solution? :-)

I guess I wasn't clear. There is no perfect solution to sparse storage problems. There are, however, good solutions to multidimensional arithmetic issues. Rather than forcing everyone to write their own (often buggy or limited) workarounds, it would be useful to have built-in functions that handle the basic arithmetic operators. The fact that MATLAB still uses the M-file version of kron that is basically what I wrote over 15 years ago suggests that they have not gotten serious about providing what I consider basic computational tools. The MEX-file version of kron that I wrote soon afterward has saved me many, many hours of waiting for results and allowed me to handle much larger problems. I just wish I didn't have to write my own MEX files for such things. In defense of The MathWorks, they have improved the speed of the language (I started with version 4), but there is still no substitute for carefully coded basic operators provided as intrinsic functions.
