Path: news.mathworks.com!not-for-mail
From: "Tim Davis" <davis@cise.ufl.edu>
Newsgroups: comp.soft-sys.matlab
Subject: Re: curious problem with sparse matrices that have explicit "0" entries.....
Date: Tue, 6 Jan 2009 21:40:20 +0000 (UTC)
Organization: University of Florida
Lines: 37
Message-ID: <gk0j44$46r$1@fred.mathworks.com>
References: <Xns9B885ED987C87ejhericholtmamcom@216.168.3.30> <gjou46$45r$1@fred.mathworks.com> <gjqnjj$jjd$1@fred.mathworks.com>
Reply-To: "Tim Davis" <davis@cise.ufl.edu>
NNTP-Posting-Host: webapp-02-blr.mathworks.com
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 8bit
X-Trace: fred.mathworks.com 1231278020 4315 172.30.248.37 (6 Jan 2009 21:40:20 GMT)
X-Complaints-To: news@mathworks.com
NNTP-Posting-Date: Tue, 6 Jan 2009 21:40:20 +0000 (UTC)
X-Newsreader: MATLAB Central Newsreader 45902
Xref: news.mathworks.com comp.soft-sys.matlab:510121


"Matt " <mjacobson.removethis@xorantech.com> wrote in message <gjqnjj$jjd$1@fred.mathworks.com>...
> "Tim Davis" <davis@cise.ufl.edu> wrote in message <gjou46$45r$1@fred.mathworks.com>...
> 
> > Now, I would argue that it's a pity MATLAB doesn't allow explicit
> > zeros (they happen quite naturally, when solving a nonlinear
> > system for example).  
> 
> But surely the arguments for disallowing them have some substance. It's obviously more storage-efficient. It also allows a more optimal implementation of things like find(). As we've seen, if explicit zeros were allowed, the find() function would need to do additional work to weed out the  indices of explicit zero entries from the final output.
> 
>  

There are some arguments for excluding them, but it would be better if MATLAB could include them, and allow you to exclude them if needed.

Exact explicit zeros are rare, at least when doing x=A\b.  The problem arises when a few entries are dropped.  The pattern changes, which takes time to reconstruct.  What's worse, it means that MATLAB could not be modified to analyze a system once (doing the ordering, for example) for a sequence of related matrices (with "same" nonzero pattern but different numerical values).  An entry that becomes zero will most likely become nonzero the very next iteration of a nonlinear solver.

Try this code:

A = sprand (1000,1000,0.1) ;
for k = 1:10000
A(1,1) = 1 ;
A(1,1) = 0 ;
end

and you will be shocked to find out how much time it takes.  Each iteration takes O(nnz(A)) time, just to squeeze out a single explicit zero entry.


There are other issues.  For example, suppose you want to "precompute" the submatrix assigment A(i,j)=x.  That statement requires a search of where A(i,j) is.  If it could be guaranteed that the pattern wouldn't change, then the statement could be done in O(nnz(x)) time.  As MATLAB stands now, that is not feasible.

Finally, there are some algorithms that cannot be exploited because of explicit zero dropping.  The pattern of L from a Cholesky factorization has a very specific property ("chordal").  Exploiting that property in x=L\b or cholupdate can can cut the time in half.  Problem is, though, that if even a single entry is dropped (or worse, if you don't know if an entry is dropped or not), then the chordal property is no longer guaranteed.  As a result, x=L\b is slower than it could be (by a factor of 2) when L comes from a sparse Cholesky factorization.  Likewise, a sparse cholupdate is made nearly infeasible by the policy of always dropping explicit zeros.

It's also likely that x=A*b could be faster if A had some special property to its nonzero pattern that could be exploited ... but can't if explicit zeros get dropped, thereby removing that special property.

But all this discussion is moot, anyway, since MATLAB probably can't be easily changed to allow explicit zeros to be kept.

The "find" command would be trivial to modify to allow its input matrix to include explicit zeros, in any case, with no real increase in the amount of work it would need to do.

The reason explicit zeros are always dropped is that sometimes it is the right thing to do.  For example, if you want to really delete 1/2 the entries of a matrix, or more, then it makes sense to drop explicit zeros automatically, to save space.  But there are lots of reasons why that design decision wasn't a good one (as given above).