Creates an Ndimensional sparse array object, for arbitrary N.
This submission defines a class of Ndimensional sparse arrays for N possibly greater than 2. However, it should really be thought of as a way of starting with an ordinary MATLAB sparse matrix and reshaping it to have N dimensions. In other words, the sparse data must first be able to exist as an ordinary 2D MATLAB sparse matrix before being made Ndimensional. In fact, if the intended array has dimensions MxNxP...YxZ, then the class will store it internally as an ordinary 2D sparse array of dimensions (M*N*P*...*Y)xZ. This leads to certain memory strains when using large numbers of dimensions. I find the class useful mainly for moderate dimensional things like edge detection in 3D imaging, where you often want to hold a sparse 3D edge map.
USAGE:
S=ndSparse(X) where X is an ordinary MATLAB sparse matrix converts X into an ndSparse object. S can be reshaped into an Ndimensional sparse array using its RESHAPE method, for arbitrary N.
S=ndSparse(X,[M,N,P,...]) is equivalent to reshape(ndSparse(X),[M,N,P,...]).
The class also has a variety of static methods that can be used to construct instances of the class. For example,
S=ndSparse.build(Coordinates,Values,[m,n,p,...],nzmax)
lets you generate an Ndimensional sparse array from a table of explicit entries. This is a generalization to N dimensions of S=sparse(i,j,s,m,n,nzmax).
Other such methods include:
ndSparse.accumarray
ndSparse.sprand
ndSparse.sprandn
ndSparse.spalloc
EXAMPLES:
>> A=ndSparse.build( [1 1 1; 2 1 1;2 2 2] , [50,60 70]) %Builds a 2x2x2 sparse array from table
A(:,:,1) =
(1,1) 50
(2,1) 60
A(:,:,2) =
(2,2) 70
Many of the same manipulations common to ordinary multidimensional MATLAB full arrays can be performed on the sparse 3D array A generated above. It can be permuted, summed, concatentated, and so forth e.g.,
>> B=sum( permute([A,A+10], [3,2,1]) ,2)
B(:,:,1) =
(1,1) 120
(2,1) 20
B(:,:,2) =
(1,1) 140
(2,1) 160
Other overloaded methods include BSXFUN, REPMAT, CIRCSHIFT, CONVN, FLIPDIMS, SQUEEZE, SHIFTDIM and many more. Type "methods ndSparse" for a full list and use "help ndSparse.methodname" to get details of usage.
When browsing the list of methods, note that certain common operations have different implementations, optimized for different situations. Specifically, SUM, ANY,ALL, MIN, MAX... have alternative implementations SUMML, ANYML, ALLML, MINML, MAXML which are optimized for "lowdimensional" ndSparse objects OBJ. Here, lowdimensional means that a normal Ncolumn MATLAB sparse matrix won't consume too much memory on your platform for N=MAX(NUMEL(OBJ)./SIZE(OBJ)).
Another feature of the class is that bioperand operations are allowed between ndSparse objects and MATLAB objects of any numeric type (single, uint16, etc...). This is not true of ordinary MATLAB sparse matrices, as of R2010b.
>> C=eye(2,'single')*B(:,:,2)
C =
(1,1) 140
(2,1) 160
>> whos A B C
Name Size Bytes Class Attributes
A 2x2x2 136 ndSparse
B 2x1x2 140 ndSparse
C 2x1 104 ndSparse
To convert back to an ordinary nD full array, use the class' overloaded FULL method. To convert to a normal 2D sparse matrix, use the methods SPARSE or SPARSE2D. For example, SPARSE2D will convert an MxNxPx...xQ ndSparse array to the two dimensional (M*N*P*...)xQ sparse matrix in native MATLAB form.
1.19  Indexing bug fix. Affected scalar, linear indexing (see update notes). 

1.18  Indexing bug fixes. 

1.16  Improvements to MIN/MAX when dealing with singleton dimensions. Improved memory efficiency in CAT and null assignment. 

1.15  Fixed bug involving ndSparse arrays with singleton dimensions. It affected MIN, MAX, SUBSASGN, and CAT. 

1.14  Speed up indexing operations of the form A(:,:,...,:,i) 

1.13  Fixed the output of ndSparse.build and CONVN, which sometimes had very large nzmax. 

1.12  Improved the DISPLAY method, so that now only A(:,:,i,j,k,...) with nonzero data will be displayed. 

1.8  *Forgot to add installation instructions in my submission earlier this week


1.7  *Some bug fixes, including Ohad's issue of assigning a matrix into an array. *Improved memory efficiency of various operations, but with tradeoffs. See UpdateNotes 

1.6  Fixed a bug in SUBSASGN operations that expand the size of the matrix,


1.5  Pervasive changes, some new methods, and some bug fixes. See UpdateNotes in .zip for full details. Among other things, ndSparse objects should typically consume a lot less memory now. 

1.4  Fixed a bug in the constructor, which could cause ndSparse object to have incorrect dimensions. 

1.3  See UpdateNotes in the .zip for full details. Highlights: 1. Lots of new methods!! BSXFUN, REPMAT, CIRCSHIFT,CONVN,...


1.2  See UpdateNotes in the .zip for full details. Highlights: 1. Lots of new methods!! For example BSXFUN, REPMAT, CIRCSHIFT,CONVN. 2. Improved the efficiency of PERMUTE 3. Some small backwardcompatibility issues. 

1.1  *Some improvements to CAT and SUM, including minor bug fixes.

Inspired by: sub2allind
Yaroslav (view profile)
Robert Alvarez (view profile)
Did you derive the algorithm from this principle or is there a reference somewhere that gives a more complete explanation?
Matt J (view profile)
@Robert,
It's a very basic approach. Essentially, the convolution routine treats the ndSparse array as a sum of impulse functions and the convolution sifting property is used.
Robert Alvarez (view profile)
Could you explain or provide a reference for the algorithm used in the convolution method convn?
Matt J (view profile)
Hi Mario,
Thanks for your feedback.
I'm not sure which previous comment you are referring to, but the error you're getting isn't due to memory limitations. There are limits that must be imposed on the size of a matrix A(i,j) because there are limits on the range of indices i,j that you can generate with 64bit double floats. MATLAB may have other criteria for limiting the size as well. In any case, the error you are seeing is coming from MATLAB (not ndSparse specifically) and happens for ordinary sparse matrices as well, e.g.,
>> A=sparse(5000^4,1);
Error using sparse
Sparse matrix sizes must be nonnegative integers less than MAXSIZE as defined by COMPUTER. Use HELP
COMPUTER for more details.
Mario (view profile)
I just saw that another use several years ago had a similar question which you answered that it's not possible because if too small RAM. I wonder why this is because ndSparse is actually not creating those large arrays, right? At least when i do
SmallerArray=ndSparse.build([5000, 5000, 5000, 5000], [0])
there is no excessivly large amount of memory used (even though it would have ~10^14 entries). Thank you!
Mario (view profile)
This is a really wonderful tool, thank you! I have a question: When I try to create very large arrays, such as
LargeArray=ndSparse.build([5000, 5000, 5000, 5000, 5000], [0])
MATLAB shows me the following error:
??? Error using ==> sparse
Sparse matrix sizes must be nonnegative integers less than MAXSIZE as defined by COMPUTER. Use
HELP COMPUTER for more details.
Are you aware of a simple fix for this problem?
Pavel Vostatek (view profile)
Matt J (view profile)
Hi Niko,
A little more on this. It looks like the problematic line is at the very end of padarray_algo
if islogical(a)
b = logical(b);
end
If "b" is ndSparse, then logical(b) will convert it to an ordinary logical sparse array.
Matt J (view profile)
Hi Niko,
I can't be sure, but there is no particular behavior you should expect from padarray for ndSparse. padarray is not a core MATLAB function and I did not overload it for the class, as you will see by doing
>> methods ndSparse
Pretty amazing that it didn't throw and error.
Niko K (view profile)
Hi. Are you still developing this code? Look at this:
a = logical(zeros(66,66,26));
b = padarray(a,[1,1,1]);
c = ndSparse(a(:),[66,66,26]);
d = padarray(c,[1,1,1]);
whos
Name Size Bytes Class Attributes
a 66x66x26 113256 logical
b 68x68x28 129472 logical
c 66x66x26 353 ndSparse
d 4624x28 241 logical sparse
What's going on there with matrix 'd'?
Tom (view profile)
Caglayan Dicle (view profile)
Matt J (view profile)
@Lauren,
Do you still have get that error? What version of MATLAB are you using? And what version of ndSparse?
You seem to be the only one with this problem, I'm afraid. It doesn't readily reproduce on my machine and I haven't received similar complaints.
Lauren (view profile)
Hi Matt
Not sure why my previous comment no longer appears, but I had mentioned that I get an error message when attempting to run ndstest  the error message being something about the inputs to kron being of the wrong type, coming from line 882 of ndtest.
Thanks!
Matt J (view profile)
Lauren: not clear what you meant by your "comment below". As far as I can tell, you posted no prior comments.
Lauren (view profile)
I realized that (for my comment below) the problem is that one of the inputs being sent to kron is ndSparse, which appears to not be ok. (The error comes up when checking convn in line 882 of ndstest).
Wei (view profile)
OK. I will try to find other ways to solve this issue. Thanks Matt.
Matt J (view profile)
Hi Wei. At minimum, you would probably need to upgrade to a 64bit OS with more RAM. Your array is pretty big, though. It might be beyond the scope of what ndSparse can handle :(
Wei (view profile)
Hi Matt, when I ran command of A=ndSparse.build([1e6,1e6,1e6]), I got message as following:
??? Error using ==> sparse
Sparse matrix sizes must be nonnegative integers less than MAXSIZE as defined
by COMPUTER. Use HELP COMPUTER for more details.
Error in ==> ndSparse>ndSparse.build at 2282
data=sparse(M,N);
My computer is 32 bit windowsXP OS. Matlab version is 2010b. How do I solve this issue? Thanks.
Matt J (view profile)
@Racquel, I've fixed the indexing bug you reported (and one or two others). Thanks for letting me know.
Raquel A (view profile)
Thanks for the quick response, yes, I am aware I should be getting that, I just posted it as an example that somehow, the combination (58,58,...) gave an error but other ones that were close to 58 didn't.
Thank you for looking into it.
Matt J (view profile)
Hi Raquel,
When you do
X(57,57,1) = 1
you should be getting
X(:,:,1) =
(57,57) 1
This is what I get, so I hope what you posted was some sort of typo. As for the case of X(58,58,1) = 1, I will have to investigate that. Thanks for reporting it!
Raquel A (view profile)
Hi Matt, I have an issue when I run your code (thanks for submitting it though it's great)
Ok so for example I create a sparse matrix:
X=ndSparse.build([59,59,10])
somehow, there is an issue if I try to input in X(58,58,1). Every time I do this or any other number in the third dimension, it just fills up the full two first dimensions.
for example:
X(57,57,1) = 1
just gives me
X(:,:,1) =
(58,57) 1
BUT, if I try to do the same in
X(58,58,1) =1
I get
val(:,:,1) =
(1,1) 1
(2,1) 1
(3,1) 1
(4,1) 1
(5,1) 1
(6,1) 1
(7,1) 1
(8,1) 1
(9,1) 1
(10,1) 1
(11,1) 1
(12,1) 1
(13,1) 1
(14,1) 1
(15,1) 1
(16,1) 1
(17,1) 1
(18,1) 1
(19,1) 1
(20,1) 1
(21,1) 1
(22,1) 1
etc...
How can i fix this?
Matt J (view profile)
Hi Josh,
Please give me a code sample (a short one, ideally) that let's me reproduce this.
Josh (view profile)
Bug report: I have all zero ndSparse matrices that return a column vector for their size(). This causes all sorts of problems (cant' cat() them with otherwise samedimensioned matrices, can't use full() on them...).
Matt J (view profile)
Hi Yoni,
Make sure that the 1st column of your PCoords data must range between 1 and 5
Similarly, the 2nd and 3rd columns of your PCoords data must range between 1 and 50.
The 4th and 5th column of your PCoords data must range between 1 and 100.
Yoni Browning (view profile)
Thanks so much for writing this!
I do just have one (hopefully) quick question: I would like to use your class to process a 5dimensional dataset, lets call it 'P', such that size(P) = [5,50,50,100,100]. As per the recommendations for other commenters, I have been storing the coordinates of each nonzero point in P as the rows of one matrix (PCoords) and the values that should appear at each point in a sperate vector (PVals). Only at the end of my computation do I attempt to use these in a ndSparse object, which I have tried building both of these ways way:
P = ndSparse.build(PCoords,Pvals).
and
P = ndSparse.build(PCoords,Pvals,size(P),length(Pvals)).
Both of these causes the error:
Error using sub2ind (line 56)
Out of range subscript.
Error in ndSparse>ndCoords2IJ (line 3070)
II=sub2ind(ndShape(1:end1),ndCoords{1:end1});
Error in ndSparse.build (line 2337)
[I,J,dims]=ndCoords2IJ(explicitCoords,ndShape);
Yes, I have checked, and PCoords and Pvals are the same length, dimensions, etc.
I am really sorry if I am missing something obvious, but I am having a hard time figuring out where I am going wrong from looking at your code, and I would really appreciate any direction you could give.
Thanks so much!
Matt J (view profile)
Hi Mehdi,
The problem is that your posts here are being received not just by me, but also by anyone who might be watchlisting this FEX file. This is SPAM to them. They expect to be receiving news from here only about ndSparse. I've moved your question to this Newsgroup thread:
http://www.mathworks.com/matlabcentral/newsreader/view_thread/321265
Mehdi (view profile)
Hi Matt,
Thanks a lot for your quick reply and sorry about my lack of knowledge in this area. Since your area of expertise is image processing I thought you are the right person to ask. I need to create a 3D image dataset in which certain voxels (whose x,y,z values are my indices) will have a nonzero value.
I don't even mind if they are being assigned any value from 1 to 656535.
Because I can then segment them to two phases to get a 2phase system.
Could you advise on what I can do to achieve this and get a 3D binary dataset made this way.
Matt J (view profile)
Hi Mehdi,
Since this no longer concerns ndSparse, you should probably post your issue in the Newsgroup. However, matrices and arrays are not functions. They can only be indexed by integers>=1. So, it is unclear what you mean when you say you want 1s assigned to noninteger indices.
Mehdi (view profile)
Hi Matt,
So many thanks for your constructive comments. It all worked fine and it indeed had the issues as you mentioned.
However, I am now facing a new issue. In a similar problem, I have to assign 1's to a 3D matrix with the x,y,z values of the coordinates of the points being singleprecision floating numbers rather than intergers. Then span the same range of 0<=x<=1000 0<=y<=1000 0<=z<=1800 though.
When I tried the suggestions above it simply gives an error like:
??? Attempted to access b(39.372,36.681,44.517); index must be a positive
integer or logical.
Error in ==> matrix_3D at 24
b(a(i,c2),a(i,c1),a(i,c)) = 1;
Could you please advise any solution to this problem? I much appreciate your time.
Matt J (view profile)
Hi Mehdi,
I can tell you how to do this with ndSparse, but it sounds like your problem has simpler solutions.
First, you cannot use x,y,z coordinates that have value 0 since MATLAB indexing is 1based. Doing so should have given you an error in your code. I'll assume henceforth that your mesh_data are really 1based coordinates.
Second, since you say you have a powerful enough machine to hold a full double array of dimensions 1000x1000x1800 in MATLAB memory, it's not clear why simply changing the values from 0 to 1 pushes your memory limits over the edge. Changing the values of a nonsparse array doesn't even consume any extra memory. What's probably happenings is that your mesh_data are not really bounded to 1000x1000x1800 as you think, in which case MATLAB will try to grow the matrix to accomodate the new data. Growing the matrix would require extra memory and is also very speed inefficient. The solution, of course, would be to remove outofbounds coordinates from mesh_data or else preallocate a larger b to hold the coordinates you have.
Third, since your data consists only of 0s and 1s, you should probably store b as uint8, which would reduce the memory your code above is consuming by a factor of 8. You might want to see what happens if you make the following simple change to your code
b=zeros(1000,1000,1800,'uint8');
In any case, to create the array you are talking about using ndSparse, you could do the following
b=full(ndSparse.build(mesh_data,1,[1000,1000,1800]));
If you have outofbounds mesh_data however this will not work
Mehdi (view profile)
Hi Matt, I am trying to create a 3D array of size 1000x1000x1800 with all the elements being zero and then assign 1 instead of 0 to some specific elements in the array and finally store it as a binary file.
Although MATLAB creates and saves the 3D array quickly, while trying to change zero's with one's , it runs out of memory.
((The 4,470,000x3 array named "mesh_data" is the coordinates (x,y,z) of the points that I need them to be assigned the value "1" in the 3D array.))
I should mention that all the x,y,z values of the coordinate matrix are nonnegative integer numbers with: 0<=x<=1000 0<=y<=1000 0<=z<=1800
My ultimate aim is to export the 3D matrix (after all th esubstitutions), in a binary format (other than MATLAB's default binary format) so that I can use it with other programs.
Could you guide how I can use your program for this purpose?
Any help is much appreciated.
%================================
load mesh_data;
b=zeros(1000,1000,1800);
save('3d_matrix', 'b','v7.3');
a=mesh_data;
[r c]=size(a);
for i = 1:r,
b(a(i,c2),a(i,c1),a(i,c)) = 1;
end
save('mesh_matrix', 'b','v7.3');
%================================
Matt J (view profile)
Josh  you mean you were doing an operation involving an ndSparse array and a full array of uint8s? And you were hoping to get a full array of uint8s as output? If so, I see what you mean. I'm considering redesigning the class so that it stores the nonzero values in tabular form, instead of as a MATLAB 2D sparse matrix. If I do, it will remove the need for mkCompat and you would get uint8 output or anything else you wanted. Until then, though, defaulting to doubles is the only way. A warning might not be a bad idea, though.
Josh (view profile)
In mkCompat, perhaps it would be worth doing some extra checks before defaulting to doubles, or at least a warning? Memory usage can go crazy at that spot. For various reasons, I had a huge array of uint8s. But it was only 1s and 0s. Converting that to a double made my memory usage spike to 8+GB in just a second or two. Let the user know that it was because they gave suboptimal inputs, not because your code is terribly inefficient.
Wei (view profile)
@Matt:
Thanks for the quick fix. Looking forward to your new solution.
Matt J (view profile)
@Wei: I've fixed the problem, but now I think I have a better solution. I'll post it over the next few days.
Wei (view profile)
found bug:
cat(2, ndSparse(rand(10,1,3)), ndSparse(rand(10,2,3)));
should be sth to do with the untrail in the permute ...
Akram Bin Sediq (view profile)
Thank you so much. This class should be added to MATLAB soon!
Achim (view profile)
Qiao Liyong (view profile)
Thank you for your program.
Matt J (view profile)
@Siavash:
Just as an FYI, the update a just submitted should make the kind of loops over slices G(:,:,i) go significantly faster. Here's a simple test I did,
>> G=ndSparse.build([882,882,82]); Z=sprand(882,882,.001);
>> tic; for ii=1:82, G(:,:,ii)=Z; end; toc,
Elapsed time is 0.149033 seconds.
J.D. (view profile)
This works great. I love that so many functions are overloaded.
Matt J (view profile)
Hi Siavash,
If I've understood you, it sounds a lot like the issue in the comment posted above by Mircea (Aug. 27, 2011). If the application seems to hang, it probably hasn't stopped or crashed, but is probably taking a long time because of a sparse indexing bug in MATLAB (see my response Aug. 28, 2011).
The one thing that sounds strange in what you describe is that you say your loop runs fine up until the final iteration, kk_y. Is that correct? And if so, how do you know the earlier iterations run fine? I'll probably need to see your code to have the best picture of what's going on. You can send it through my author email link, if you like.
Beyond that, though, I want to repeat to you, what I told Mircea. Assigning into a sparse matrix (whether it be a normal MATLAB sparse matrix or an ndSparse object) is a bad practice and best kept to a minimum. Instead of loading data into the array slicebyslice, you should generate a complete table of values in advance and then use it to build the whole 882x882x82 array in one single call to ndSparse.build. This is the approach I showed you in my comment posted yesterday.
Siavash (view profile)
Thanks for your prompt reply.
Actually i have a loop and in that loop I construct a 882by882 sparse matrix kk_y(82) times. and I want to store them in a 882*882*82 sparse matrix. but when it reaches to G(:,:,kk_y) = J_g_hat_kk;
it crashes! (no error, nothing, but only stop everything!).
Matt J (view profile)
@Siavash  your comment is a bit hard to interpret. You haven't said what kk_y is. Also, you say you're trying to make a "882 x 882 three dimensional sparse matrix", which seems selfcontradictory since 882x882 would suggest 2D, not 3D.
However, if I had to guess, I would say that you're looking for this:
[ii,jj,ss]=find(J_g_hat_kk);
kk=repmat(kk_y,size(jj));
G = ndSparse.build([ii,jj,kk],ss,[n_st n_st,kk_y]);
Siavash (view profile)
Hi,
I tried to use your code to make a 882 by 882 three dimensional sparse matrix but it did not work and the computer was crashed. Simply this is what I do:
G = ndSparse.build([n_st n_st]);
G(:,:,kk_y) = J_g_hat_kk;
J_g_hat_kk is a 882by882 sparse matrix. and n_st is 882. Could you please help me in this respect,
Thanks a lot
Siavash
Siavash (view profile)
Hi,
I tried to use your could to make a 882 by 882 three dimensional sparse matrix but it did not work and the computer was crashed. Simply this is what I do:
G = ndSparse.build([n_st n_st]);
G(:,:,kk_y) = J_g_hat_kk;
J_g_hat_kk is a 882by882 sparse matrix. and n_st is 882. Could you please help me in this respect,
Thanks a lot
Siavash
Matt J (view profile)
@Niklaus  sorry I didn't notice until now that this is happening for you in ndstest. Please post the entire error message so that I can know where in ndstest it occurs.
Matt J (view profile)
Hi Niklaus  the error you describe is unrelated to how new your MATLAB version is. The code is complaining that the reshape operation you're attempting is invalid because it doesn't preserve the number of elements in the array. Like trying to reshape a 2x5 array into a 3x3 array, for example.
Niklaus (view profile)
Hey Matt!
Sounds like a really neat code! Unfortunately I have an old version of MATLAB (R2009a). When I run ndstest, i get the following error message:
??? Error using ==> ndSparse>ndSparse.ndSparse at 126
The number of DATA elements is not consistent with the given ndShape.
Do you know why this happens? Is there any way for me to fix this? Thank you very much!
Nik
Matt J (view profile)
@Jens:
Incidentally, if you are now happy with the reimplementation I've made of the DISPLAY method, I would appreciate it if you would change your rating. One star out of five is, in any case, a harsh rating just because you don't like 1 out of the 92 methods offered by the class.
Matt J (view profile)
@Jens:
You talked me into it. I've now updated the distribution so that DISPLAY will print only those A(:,:,i,j,k,...) containing nonzero data.
For displaying purposes, at least, this should solve your problem. Mimicing the data you described, I'm now able to generate the following 37x37x37x37x37x37x37 data set on a 64bit platform and get a reasonably instantaneous display:
A=ndSparse.sprand(37*ones(1,7),400/37^7)
However, I am vaguely concerned that you are pushing the limits of the class by having such a large array volume, even on a 64bit platform. The class wasn't designed to handle arbitrarily large numel(A), just as ordinary MATLAB 2D sparse matrices were not.
I now clarify this a bit more in the description section of the FEX page.
Matt J (view profile)
@Jens: The reason I don't store my own tables is because then I couldn't piggy back on the fast sparse arithmetic routines of MATLAB's underlying 2D sparse class. I would have to implement my own.
In any case, the problem with displaying that you mention is probably not strictly related to data storage. It is related to the fact that the ndSparse DISPLAY method mimics the display method for full arrays and tries to print all "sheets" of the form A(:,:,i,j,k,...) to the screen. Since you have 37^5 sheets, naturally this is going to take some time to disaply to the screen.
Upon reflection, I admit that a sparser display would be more appropriate for an ndSparse class. For example, I could modify the DISPLAY method so that it only displays nonzero sheets.
On the other hand, as a designer, I can't really anticipate how every enduser would like their data displayed. Would a display of 2D sheets
A(:,:,i,j,k,...) offer meaningful information to someone working in 7dimensional space, like yourself, even if I trimmed it down to the nonzero ones? And would every enduser prefer this?
Also, if you only want to display the nonzero data, why not use FIND? Or create your own wrapper for FIND which displays the data the way you want?
Jens Munk Hansen (view profile)
Hey Matt
Nice idea, but bad implementation. I am working with a parameter study and data is extremely sparse, dimensions are 37x37x37x37x37x37x37, but I have less than 400 nonzero elements. The display function takes more than 10 hours to execute.
Why not simple store tubles and use an intelligent search?
Alexander Stepanov (view profile)
Illustration of my issue with sensitivity:
profile on
N=1e5; % data points
data=rand(N,1); % random data
dims=[100,100,100,200000];
ind=NaN(N,4); % random indices
ind(:,1)=randi(dims(1),N,1);
ind(:,2)=randi(dims(2),N,1);
ind(:,3)=randi(dims(3),N,1);
ind(:,4)=randi(dims(4),N,1);
% (100x100x100) x 200000
A1=ndSparse.build( ind , data, dims);
% (100x200000x100) x 100
A2=ndSparse.build( circshift(ind,[0 2]) , data, circshift(dims,[0 2]));
size(A1)
size(A2)
tic;
ans1=A1(1,:,:,:); % takes about 0.09 secs
toc;
ans2=A2(:,:,1,:); % takes about 115 secs !!!!
toc;
% results are equal w.r.t. dimension shift
isequal(ans1,shiftdim(ans2,2))
profile viewer
% 99% of time is spent on a line data=builtin('subsref',obj.data,E);
Matt J (view profile)
This sounds different than Mircea's issue. If you can post a short example illustrating the problem, then I can possibly analyze it further
Alexander Stepanov (view profile)
Ok, hope they'll fix this issue. For now I'll find a workaround.
Btw, for me the problem was not in assigning. I only create this matrix once (to store a very huge dataset), and then slice different data from it when I need. Sensitivity issue cames up on a line 478:
data=builtin('subsref',obj.data,E);
Anyway, thanks again, one of the best submissions I've seen on the exchange)
Matt J (view profile)
Hi Alexander, thanks for your feedback. I admit the shape sensitivity thing is a drag. However, the feature you describe would be very difficult to implement and require very pervasive changes throughout the code. I would hope that TMW will fix the shape sensitivity soon (it's already been reported) so that the issue will become moot.
I want to reiterate to you, though, what I said to Mircea. Assigning into a sparse matrix (or ndSparse object) is always best kept to a minimum, even without the shape sensitivity problem.
Alexander Stepanov (view profile)
Hi, this class is just wonderful. However, if you consider making another release, there's one more feature it lacks.
I think you should give user ability to decide how to group matrices to store as 2D sparse. E.g. I need a 4D matrix MxNxPxQ, but it turns out that due to shape sensitivity discussed above, the best best way for me to store it is (MxN)x(PxQ), otherwise performance decrease dramatically. Unfortunately, this currently can't be done in your class.
Thanks for such a great submission and looking forward to your reply
Mircea (view profile)
Thank you again, Matt J. It is clear I was going the wrong way working with sparse matrices (this is the first time I need it to use them). Thank you for your help.
Matt J (view profile)
@Mircea, the problem you're seeing is rooted in normal 2D MATLAB sparse matrices. It is related to the shape sensitivity issue being discussed in this thread:
http://www.mathworks.com/matlabcentral/newsreader/view_thread/311992
There's not much I can do about it for now, but regardless, building sparse arrays the way you are doing is a bad idea, even for normal MATLAB sparse matrices. The efficient way is to create a complete table of nonzero entries first, and then construct the matrix using SPARSE or using ndSparse.build.
It is also a bad idea to use matrix assignment to alter the pattern of nonzeros in a sparse matrix, as illustrated by Tim Davis here:
http://www.mathworks.com/matlabcentral/newsreader/view_thread/241852#620175
Mircea (view profile)
Thank you, Matt J, this class is very usefull.
I have the following question: why does the following code
n = 500;
A = ndSparse.build([2*n 2*n]);
A(1:2:2*n,1:2:2*n) = rand(n);
run in 0.04 seconds, while the following similar code
n = 500;
A = ndSparse.build([2*n 2*n 2]);
A(1:2:2*n,1:2:2*n,1) = rand(n);
runs in 43 seconds (thats 1000 times slower!).
Is there a way to speed up the process of assigning values to truely Ndimensional sparse arrays (N>2)? Thank you.
Matt J (view profile)
@Ohad, I think that to fix it, you just need need to change the following line
data=builtin('subsasgn',data,E,rhs);
to
data=builtin('subsasgn',data,E,rhs(:));
I'll do another release once I've had time to examine this a little further.
Ohad (view profile)
I do not seem to be able to assign matrices into arrays. e.g.:
A=ndSparse.spalloc([3,3,2],9)
A(:,:,1) = sparse(ones(3))
any ideas?
pink (view profile)
how do I enter "ndSparse" into the program matlab
I use matlab2010a
Matt J (view profile)
@Mike,
My latest release includes the change of representation that we were talking about on Feb 10, 2011. That is, the data is now stored internally as a sparse matrix of dimensions (L_1*L_2*L_3...)xL_N. The memory savings should be quite noticeable.
Since the recoding was substantial, please backup your old version and give me feedback on any bugs that I may have introduced. Hopefully, ndstest was a pretty good filter, but you never know...
Maigo (view profile)
Thanks Matt :)
Matt J (view profile)
Maigo, thanks for your feedback. Your problem is indeed versionrelated. See my similar comment above to Farhad on 31 Jan 2011.
However, in the .zip file you should have found a version compatible with R2009b (or later). It contains isrow and iscolumn implementations and should have been fine for you. See also my comment on 04 Feb 2011.
Maigo (view profile)
Thank you very much. This class helped me a lot!
But I've found a bug with the "subsref" method, which calls two functions "isrow" and "iscolumn" that are not defined.
I am using Matlab 7.10.0 (R2010a), so I don't know if this is a versionspecific problem. Maybe these two function are defined in 7.11.
Anyway, I think they're pretty easy to implement, and here's the code. Just add them to end of the class file:
function bool = isrow(A)
% isrow  returns true if A is a row vector
dims = size(A);
bool = (length(dims) == 2) && (dims(1) == 1);
end
function bool = iscolumn(A)
%iscolumn  returns true if A is a column vector
dims = size(A);
bool = (length(dims) == 2) && (dims(2) == 1);
end
Bruno Luong (view profile)
Check out this discussion about memory used by sparse matrix:
http://www.mathkb.com/Uwe/Forum.aspx/matlab/147951/SparseMatrixSizegetsreduceswhenwetaketransposeofthis
Matt J (view profile)
Mike, the explanation that I gave you was imprecise. See Tim Davis' explanation here,
http://www.mathworks.com/matlabcentral/newsreader/view_thread/235374#598253
He also gives links to expanded discussions on the subject.
Mike (view profile)
Matt: no problem. out of curiosity, how were you able to find out that matlab stores the # of elmnts for each column? on wiki it claims that matlab sparse uses COO format, which includes only the coordinates of each nonzero entry (so this is clearly a lie!). spinoffs of the Yale format seem very popular and do depend on the length of one of the dimensions and so seem a likely candidate, but they store something different from number of entries/column or row. when i go to sparse.mat i see only the help file, and the help file says nothing about what's going on behind the scenes, so just curious how you got to the bottom of it.
Matt J (view profile)
Mike: "But given that the size of matlab sparse matrices is proportional to the length of the 2nd dimension, I would vote strongly in favor of going to the (L_1*L_2*L_3...)xL_N sparse matrix representation..."
Yes, I've been thinking in that direction myself for a few weeks now. Unfortunately, it will require pervasive changes to the code, and will probably take me a few weeks at least :(
Mike (view profile)
Matt, thanks, that clarifies things. i still have a hard time thinking through how number of elements in each column helps on things like matrix multiplication, but perhaps i'm not being imaginative enough.
But given that the size of matlab sparse matrices is proportional to the length of the 2nd dimension, I would vote strongly in favor of going to the (L_1*L_2*L_3...)xL_N sparse matrix representation you mention unless there's a substantial slowdown in basic linear algebra. In the case of very large, very sparse, and very high dimensional matrices (of interest to me), the difference in storage space will be enormous.
Matt J (view profile)
"but you're saying the amount of memory of sparse matrices in both cases in linearly proportional to the total size of the matrix being represented??"
The main thing to understand is that every normal MxN sparse matrix internally stores a table of the number of nonzero elements in each column. In other words, it will always contain at least the same amount of data as in rand(1,N).
That's why the following two sparse vectors, even though they both contain all zeros, have very different memory consumptions.
>> A=sparse(zeros(1,1e6)); At=A.';
>> whos A At
Name Size Bytes Class Attributes
A 1x1000000 4000016 double sparse
At 1000000x1 20 double sparse
Pathological? Well, keep in mind that MATLAB sparse matrices were intended mainly for doing fast linear algebra. Apparently, these operations are faster if you know a priori the number of nonzeros in each column, instead of having to recompute them all the time onthefly.
Mike (view profile)
I see.. but then I'm confused why the memory consumed by normal sparse matrices doesn't depend entirely on the number of nonzero elements, as a sparse matrix is completely characterized by the coordinates and values of each nonzero element (plus,e.g., a pair of integers specifying the overall size in the 2D case). but you're saying the amount of memory of sparse matrices in both cases in linearly proportional to the total size of the matrix being represented?? if so, i think this is the far more pathological problem with matlab sparse matrices than them being limited to 2 dimensions!
Matt J (view profile)
Mike, just as a followup point, one thing that would reduce memory consumption is if I store the data internally as an (M*N)xP sparse matrix as opposed to a Mx(N*P) matrix.
Since MATLAB sparse matrices with fewer columns consume less memory, this would indeed help with memory consumption. I've been considering this for future releases, but I need to see how pervasive the effect would be.
Matt J (view profile)
Mike, the amount of memory consumed by an MxNxP ndSparse array A, will essenitally be the same as the ordinary 2D MATLAB sparse matrix of dimensions
Mx(N*P) generated by doing
sparse(A).
You can see this better in a larger version of your example,
A=ndSparse([]);
A(20,40,60)=4;
As=sparse(A);
>>whos A As
Name Size Bytes Class Attributes
A 20x40x60 9696 ndSparse
As 20x2400 9616 double sparse
In fact, the way it works is that the object A is storing its data internally as the 2D matrix As. There is a difference of about 80 bytes due to other minor ndSparse class data.
So the answer to your question is, there is no memory inefficiency in ndSparse that isn't in normal MATLAB matrices. Since mine is built on top of theirs, there's not much I can do about it.
One thing you should be aware of, though, is that the memory consumption of sparse matrices, both mine and MATLAB's, is shapesensitive. Permuting the dimensions can significantly reduce memory consumption, e.g.,
>> Ap=permute(A,[3 2 1]); whos Ap
Name Size Bytes Class Attributes
Ap 60x40x20 3296 ndSparse
Mike (view profile)
if i do "a = ndSparse([])" and then "a(1,2,3)=4", i get back:
"a(:,:,1) =
All zero sparse: 1by2
a(:,:,2) =
All zero sparse: 1by2
a(:,:,3) =
(1,2) 5"
each of these "all zero sparse" matrices take up memory and don't seem worth keeping track of. is it possible to eliminate this behavior?
thanks.
Matt J (view profile)
@Farhad, I've now included a version of ndSparse that is compatible with R2009b. See the UpdateNotes.txt file for details.
Matt J (view profile)
@Farhad  You don't have a recent enough version of MATLAB, and so are missing some of its newer functions. You should upgrade to R2010b. Alternatively, you could write your own iscolumn.m
function bool=iscolumn(v)
bool=isequal(size(v), size(v:));
I can't be sure that that will be the only missing function, but I'm pretty optimistic...
Farhad M (view profile)
Hi, thanks! it must be very much useful for my purpose. However, I couldn't run ndstest since it gives the following error:
??? Undefined function or method 'iscolumn' for input arguments of type 'ndSparse'.
Could you help me with that?
Matt J (view profile)
@vikranth, I'm not sure I understand your question. What details are you looking for beyond the description above, the GettingStarted file in the .zip, or in the help doc?
vikranth akkidas (view profile)
can u give some details about the topic
Michael Völker (view profile)