Path: news.mathworks.com!newsfeed-00.mathworks.com!newsfeed2.dallas1.level3.net!news.level3.com!postnews.google.com!h30g2000vbr.googlegroups.com!not-for-mail
From: arun <aragorn168b@gmail.com>
Newsgroups: comp.soft-sys.matlab
Subject: Re: size(sparse matrix) > size(full matrix)
Date: Thu, 17 Sep 2009 08:25:39 -0700 (PDT)
Organization: http://groups.google.com
Lines: 96
Message-ID: <c6a3a485-e8b4-459b-8a67-5ed0e5b16a22@h30g2000vbr.googlegroups.com>
References: <8e83f86c-df5e-413d-9943-da36c83a66d2@g1g2000vbr.googlegroups.com> 
	<a56978d9-99f2-4ff6-8073-a77ab61bb6d6@o21g2000vbl.googlegroups.com> 
	<h8t5eb$anh$1@fred.mathworks.com> <3f28c1c1-af9b-458a-a802-25e66f1bc1bb@p15g2000vbl.googlegroups.com> 
	<h8td5b$bld$1@fred.mathworks.com>
NNTP-Posting-Host: 192.124.26.250
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-Trace: posting.google.com 1253201139 22404 127.0.0.1 (17 Sep 2009 15:25:39 GMT)
X-Complaints-To: groups-abuse@google.com
NNTP-Posting-Date: Thu, 17 Sep 2009 15:25:39 +0000 (UTC)
Complaints-To: groups-abuse@google.com
Injection-Info: h30g2000vbr.googlegroups.com; posting-host=192.124.26.250; 
	posting-account=fyqXpgoAAABqt-0BifyaNxmZhzggFACu
User-Agent: G2/1.0
X-HTTP-UserAgent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; 
	rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3,gzip(gfe),gzip(gfe)
Xref: news.mathworks.com comp.soft-sys.matlab:571111


On Sep 17, 3:22 pm, "Steven Lord" <sl...@mathworks.com> wrote:
> "arun" <aragorn1...@gmail.com> wrote in message
>
> news:3f28c1c1-af9b-458a-a802-25e66f1bc1bb@p15g2000vbl.googlegroups.com...
>
>
>
> > On Sep 17, 1:10 pm, "Bruno Luong" <b.lu...@fogale.findmycountry>
> wrote:
> > > Rune Allnor <all...@tele.ntnu.no> wrote in message
> > > <a56978d9-99f2-4ff6-8073-a77ab61bb...@o21g2000vbl.googlegroups.com>...
>
> > > > Sparse matrices are not used to gain speed, but to save
> > > > space.
>
> > > Wrong, many linear algebra algorithms are designed to gain speed with
> > > sparse matrix since Matrix-vector product is much faster than for full
> > > matrix. That lead to the class of iterative method (gradient conjugate,
> > > GMRES, Lanczos, etc...) that is suitable when working with sparse
> > > matrix. The full matrix is not even competitive, even for speed.
>
> > > Bruno
>
> > Bruno,
> > Yes, its true, however I see that when operations involving sparse-
> > matrices result in a more-or-less full matrix (as in my case at some
> > intermediate point), it becomes slow. But as Rune pointed out, my data
> > set is quite huge 1200 * 312000. So I had to split them and I thought
> > I could also save more space by storing sparse as my calculations are
> > faster already and shouldn't take much *even if* sparse makes it
> > slow.
>
> What types of operations are you doing on this sparse matrix?  If you're
> working along the rows, I recommend you store your matrix transposed and
> work down the columns.  The size in memory of a sparse matrix depends on the
> number of columns in the matrix, not the number of rows, and due to the way
> it's stored in memory retrieving a column of a sparse matrix is likely to be
> faster and more efficient than retrieving a row.
>
> --
> Steve Lord
> sl...@mathworks.com
> comp.soft-sys.matlab (CSSM) FAQ:http://matlabwiki.mathworks.com/MATLAB_FAQ

Hi Steve,

1) My calculations:
Initially, from each of the column n*1 , I calculate a matrix A (n*n)
and then take the trace(A*B) where B is a constant pre-computed n*n
matrix.

However, with help from CSSM user Matt, I managed to vectorize the
code so that I calculate the trace of all columns without computing
for each separately using for-loops.

2) I already am working on columns. That is why I had to have 1200*x
as the size. I understand if I transpose the size of the matrix will
be very less when stored sparse, but I will have to transpose it again
to perform my trace computations!!

Hi Sebastiaan,

Thank you for your reply. It was extremely informative. I understand
the disadvantages of storing with huge number of columns. However, as
I explained above in (2) my operations are along the columns. So, the
best I could do is to store as a full-matrix... I guess.
Regarding the confusion between 1200*39000 and 1200*311200, My whole
data set is 1222*311398 to be precise. And I split them into 8 parts,
7 * 1222*38924) and the last as 1222*38930.
I load each of these from mat-files and do my computations. Just to
try, I transposed my 1222*38924 matrix and then created sparse array,
but it din't help much. Which suggests that it is not wise to store it
as a sparse array as it is *not really sparse*.

  Name          Size                   Bytes  Class     Attributes

  gt         1222x38924            380521024  double
  gt1        1222x38924            761343080  double    sparse
  gt2       38924x1222             761041464  double    sparse

And this seems to be true. The problem is that, my data set has 0, 0.5
and 1. There are only 648 0's but 38423871 0.5's and 9140609 1's. I
guess sparse doesn't look for most-frequently occurring element and
stores the remaining... but just stores the non-zero values.. yes, it
is by definition.

thank you very much.
best, arun.