Path: news.mathworks.com!not-for-mail
From: <HIDDEN>
Newsgroups: comp.soft-sys.matlab
Subject: Re: curious problem with sparse matrices that have explicit "0" entries.....
Date: Wed, 7 Jan 2009 22:21:02 +0000 (UTC)
Organization: Xoran Technologies
Lines: 59
Message-ID: <gk39se$2kp$1@fred.mathworks.com>
References: <Xns9B885ED987C87ejhericholtmamcom@216.168.3.30> <gjou46$45r$1@fred.mathworks.com> <gjqnjj$jjd$1@fred.mathworks.com> <gk0j44$46r$1@fred.mathworks.com>
Reply-To: <HIDDEN>
NNTP-Posting-Host: webapp-03-blr.mathworks.com
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 8bit
X-Trace: fred.mathworks.com 1231366862 2713 172.30.248.38 (7 Jan 2009 22:21:02 GMT)
X-Complaints-To: news@mathworks.com
NNTP-Posting-Date: Wed, 7 Jan 2009 22:21:02 +0000 (UTC)
X-Newsreader: MATLAB Central Newsreader 1440443
Xref: news.mathworks.com comp.soft-sys.matlab:510315


"Tim Davis" <davis@cise.ufl.edu> wrote in message <gk0j44$46r$1@fred.mathworks.com>...
> "Matt " <mjacobson.removethis@xorantech.com> wrote in message <gjqnjj$jjd$1@fred.mathworks.com>...
> > "Tim Davis" <davis@cise.ufl.edu> wrote in message <gjou46$45r$1@fred.mathworks.com>...
> > 
> > > Now, I would argue that it's a pity MATLAB doesn't allow explicit
> > > zeros (they happen quite naturally, when solving a nonlinear
> > > system for example).  
> > 
> > But surely the arguments for disallowing them have some substance. It's obviously more storage-efficient. It also allows a more optimal implementation of things like find(). As we've seen, if explicit zeros were allowed, the find() function would need to do additional work to weed out the  indices of explicit zero entries from the final output.
> > 
> >  
> 
> There are some arguments for excluding them, but it would be better if MATLAB could include them, and allow you to exclude them if needed.
> 
> Exact explicit zeros are rare, at least when doing x=A\b.  The problem arises when a few entries are dropped.  The pattern changes, which takes time to reconstruct.  What's worse, it means that MATLAB could not be modified to analyze a system once (doing the ordering, for example) for a sequence of related matrices (with "same" nonzero pattern but different numerical values).  An entry that becomes zero will most likely become nonzero the very next iteration of a nonlinear solver.
> 
> Try this code:
> 
> A = sprand (1000,1000,0.1) ;
> for k = 1:10000
> A(1,1) = 1 ;
> A(1,1) = 0 ;
> end
> 
> and you will be shocked to find out how much time it takes.  Each iteration takes O(nnz(A)) time, just to squeeze out a single explicit zero entry.
------------------------------------------------------------------------------------------

It is a long time, but I guess I shouldn't be shocked. Since the table of explicit values has changed in size,  MATLAB must re-allocate memory for it every time.


> 
> 
> There are other issues.  For example, suppose you want to "precompute" the submatrix assigment A(i,j)=x.  That statement requires a search of where A(i,j) is.  If it could be guaranteed that the pattern wouldn't change, then the statement could be done in O(nnz(x)) time.  As MATLAB stands now, that is not feasible.
> 
> Finally, there are some algorithms that cannot be exploited because of explicit zero dropping.  The pattern of L from a Cholesky factorization has a very specific property ("chordal").  Exploiting that property in x=L\b or cholupdate can can cut the time in half.  Problem is, though, that if even a single entry is dropped (or worse, if you don't know if an entry is dropped or not), then the chordal property is no longer guaranteed.  As a result, x=L\b is slower than it could be (by a factor of 2) when L comes from a sparse Cholesky factorization.  Likewise, a sparse cholupdate is made nearly infeasible by the policy of always dropping explicit zeros.
> 
> It's also likely that x=A*b could be faster if A had some special property to its nonzero pattern that could be exploited ... but can't if explicit zeros get dropped, thereby removing that special property.
> 
----------------------

OK.


> The "find" command would be trivial to modify to allow its input matrix to include explicit zeros, in any case, with no real increase in the amount of work it would need to do.
> 
-----------------------

I guess I don't know what you consider a "real increase". In a world where explicit zeros were allowed, the find() function would need to probe the explicit values list and test to see which of them are zero. Currently, it doesn't need to do this extra work.

Granted though, the same probing for zeros is already done in other operations to enforce the no-explicit-zeros rule, so one could say the burden has simply been shifted elsewhere.

> But all this discussion is moot, anyway, since MATLAB probably can't be easily changed to allow explicit zeros to be kept.
> 
-----------------------------------------------

Can't easily be changed, but a new sparse matrix data class could be created alongside the old one, one with new class methods exploiting the capabilities of explicit zeros that you describe.

I would guess this to be the most principled approach because, as you've said, sometimes methods and operations that keep explicit zeros are optimal and sometimes the converse. To have alternative method implementation you to have represent the input as a different class. That's assuming you want to keep the same interface...