Accessing Sparse Matrices

Nonzero Elements

There are several commands that provide high-level information about the nonzero elements of a sparse matrix:

  • nnz returns the number of nonzero elements in a sparse matrix.

  • nonzeros returns a column vector containing all the nonzero elements of a sparse matrix.

  • nzmax returns the amount of storage space allocated for the nonzero entries of a sparse matrix.

To try some of these, load the supplied sparse matrix west0479, one of the Harwell-Boeing collection.

load west0479

  Name            Size             Bytes  Class     Attributes

  M_full       1100x1100         9680000  double              
  M_sparse     1100x1100            5004  double    sparse    
  west0479      479x479            24564  double    sparse    

This matrix models an eight-stage chemical distillation column.

Try these commands.


ans =

format short e
west0479 =
  (25,1)      1.0000e+00
  (31,1)     -3.7648e-02
  (87,1)     -3.4424e-01
  (26,2)      1.0000e+00
  (31,2)     -2.4523e-02
  (88,2)     -3.7371e-01
  (27,3)      1.0000e+00
  (31,3)     -3.6613e-02
  (89,3)     -8.3694e-01
  (28,4)      1.3000e+02

    ans =

    Note:   Use Ctrl+C to stop the nonzeros listing at any time.

Note that initially nnz has the same value as nzmax by default. That is, the number of nonzero elements is equivalent to the number of storage locations allocated for nonzeros. However, MATLAB® does not dynamically release memory if you zero out additional array elements. Changing the value of some matrix elements to zero changes the value of nnz, but not that of nzmax.

However, you can add as many nonzero elements to the matrix as desired. You are not constrained by the original value of nzmax.

Indices and Values

For any matrix, full or sparse, the find function returns the indices and values of nonzero elements. Its syntax is

[i,j,s] = find(S)

find returns the row indices of nonzero values in vector i, the column indices in vector j, and the nonzero values themselves in the vector s. The example below uses find to locate the indices and values of the nonzeros in a sparse matrix. The sparse function uses the find output, together with the size of the matrix, to recreate the matrix.

[i,j,s] = find(S)
[m,n] = size(S)
S = sparse(i,j,s,m,n)

Indexing in Sparse Matrix Operations

Because sparse matrices are stored in compressed sparse column format, there are different costs associated with indexing into a sparse matrix than there are with indexing into a full matrix. Such costs are negligible when you need to change only a few elements in a sparse matrix, so in those cases it's normal to use regular array indexing to reassign values:

B = speye(4);
[i,j,s] = find(B);

ans =

     1     1     1
     2     2     1
     3     3     1
     4     4     1

B(3,1) = 42;
[i,j,s] = find(B);

ans =

     1     1     1
     3     1    42
     2     2     1
     3     3     1
     4     4     1
In order to store the new matrix with 42 at (3,1), MATLAB inserts an additional row into the nonzero values vector and subscript vectors, then shifts all matrix values after (3,1).

Using linear indexing to access or assign an element in a large sparse matrix will fail if the linear index exceeds intmax.

S = spalloc(2^30,2^30,2)
S(end) = 1
Maximum variable size allowed by the program is exceeded.

To access an element whose linear index is greater than intmax, use array indexing:

S(2^30,2^30) = 1

S =

         (1073741824,1073741824)              1

While the cost of indexing into a sparse matrix to change a single element is negligible, it is compounded in the context of a loop and can become quite slow for large matrices. For that reason, in cases where many sparse matrix elements need to be changed, it is best to vectorize the operation instead of using a loop. For example, consider a sparse identity matrix:

n = 10000;
A = 4*speye(n);
Changing the elements of A within a loop takes considerably more time than a similar vectorized operation:
tic; A(1:n-1,n) = -1; A(n,1:n-1) = -1; toc
Elapsed time is 0.001899 seconds.

tic; for k = 1:n-1, C(k,n) = -1; C(n,k) = -1; end, toc
Elapsed time is 0.286563 seconds.
The difference in execution time in this example is over two orders of magnitude. Since MATLAB stores sparse matrices in compressed sparse column format, it needs to shift multiple entries in A during each pass through the loop.

Preallocating the memory for a sparse matrix and then filling it in an element-wise manner similarly causes a significant amount of overhead in indexing into the sparse array:

S1 = spalloc(1000,1000,100000);
for n = 1:100000
    i = ceil(1000*rand(1,1));
    j = ceil(1000*rand(1,1));
    S1(i,j) = rand(1,1);

Elapsed time is 26.281000 seconds.

Constructing the vectors of indices and values eliminates the need to index into the sparse array, and thus is significantly faster:

i = ceil(1000*rand(100000,1));
j = ceil(1000*rand(100000,1));
v = zeros(size(i));
for n = 1:100000
    v(n) = rand(1,1);

S2 = sparse(i,j,v,1000,1000);

Elapsed time is 0.078000 seconds.

For that reason, it's best to construct sparse matrices all at once using a construction function, like the sparse or spdiags functions.

For example, suppose you wanted the sparse form of the coordinate matrix C:


Construct the five-column matrix directly with the sparse function using the triplet pairs for the row subscripts, column subscripts, and values:

i = [1 5 2 5 3 5 4 5 1 2 3 4 5]';
j = [1 1 2 2 3 3 4 4 5 5 5 5 5]';
s = [4 1 4 1 4 1 4 1 -1 -1 -1 -1 4]';
C = sparse(i,j,s)
C =

   (1,1)        4
   (5,1)        1
   (2,2)        4
   (5,2)        1
   (3,3)        4
   (5,3)        1
   (4,4)        4
   (5,4)        1
   (1,5)       -1
   (2,5)       -1
   (3,5)       -1
   (4,5)       -1
   (5,5)        4
The ordering of the values in the output reflects the underlying storage by columns. For more information on how MATLAB stores sparse matrices, see John R. Gilbert, Cleve Moler, and Robert Schreiber's Sparse Matrices In Matlab: Design and Implementation, (SIAM Journal on Matrix Analysis and Applications, 13:1, 333–356 (1992)).

Visualizing Sparse Matrices

It is often useful to use a graphical format to view the distribution of the nonzero elements within a sparse matrix. The MATLAB spy function produces a template view of the sparsity structure, where each point on the graph represents the location of a nonzero array element.

For example:

Load the supplied sparse matrix west0479, one of the Harwell-Boeing collection.

load west0479

View the sparsity structure.


Was this topic helpful?