Path: news.mathworks.com!not-for-mail
From: "John D'Errico" <woodchips@rochester.rr.com>
Newsgroups: comp.soft-sys.matlab
Subject: Re: inverse matrix inv pinv linfactor ginv
Date: Sun, 4 Oct 2009 09:59:03 +0000 (UTC)
Organization: John D'Errico (1-3LEW5R)
Lines: 122
Message-ID: <ha9rl7$gej$1@fred.mathworks.com>
References: <ha8r6f$pf7$1@fred.mathworks.com> <ha8s85$4qh$1@fred.mathworks.com>
Reply-To: "John D'Errico" <woodchips@rochester.rr.com>
NNTP-Posting-Host: webapp-05-blr.mathworks.com
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 8bit
X-Trace: fred.mathworks.com 1254650343 16851 172.30.248.35 (4 Oct 2009 09:59:03 GMT)
X-Complaints-To: news@mathworks.com
NNTP-Posting-Date: Sun, 4 Oct 2009 09:59:03 +0000 (UTC)
X-Newsreader: MATLAB Central Newsreader 869215
Xref: news.mathworks.com comp.soft-sys.matlab:574762


"Petros Mpogiatzis" <painter@geo.auth.gr> wrote in message <ha8s85$4qh$1@fred.mathworks.com>...
> "leo nidas" <bleonidas25@yahoo.gr> wrote in message <ha8r6f$pf7$1@fred.mathworks.com>...
> >   
> > Hi there,
> > 
> > I want to solve or b the : (X'X)*b=X'*y. I have some issues with the inverse of X'*X.
> > (vector b should estimate the vector [5  0.1048]. (regression coefficients))
> > MATLAB(R2007b).
> > 
> > In the beginning I used "inv" and got a warning:
> > 
> > Warning: Matrix is close to singular or badly scaled.
> >          Results may be inaccurate. RCOND = 1.000436e-148.
> > 
> > b =
> > 
> >     4.9346
> >     0.0000    (Inspitethe warning the solution seems to approach the real values)
> > 
> > I also used pinv but I am not sure about the result
> > b =
> > 
> >   1.0e-073 *
> > 
> >     0.0000
> >     0.8577  (Here the result looks weird, don't get a warning though)
> > 
> > I also used linfactor (Timothy A. Davis)
> > 
> > bnaive=linfactor (linfactor(Ximp'*Ximp),Ximp'*y')
> > Warning: Matrix is close to singular or badly scaled.
> >          Results may be inaccurate. RCOND = 3.483766e-074.
> > > In linfactor at 175
> > Warning: Matrix is close to singular or badly scaled.
> >          Results may be inaccurate. RCOND = 3.483766e-074.
> > > In linfactor at 175
> > 
> > b =
> > 
> >     4.9346
> >     0.0000 (Again a warning but a good looking solution)
> > 
> > And finally ginv (Luis Frank, jan 2009.)
> > 
> > and got either [5.4417  0]  or [1.0e-072 * 0        0.1171]  without warnings.
> > 
> > So I am a little confused..Which is correct?...Below I give the data. Note that X=[ones(n,1) x]. where n=300.
> > 
> > 
> > x=
> > 1.0e+073 *
> > 
> >     6.0217

(snip)

> >     6.2652
> >     7.1042
> 
> 
> use the backslash operator:
> b=(X'X)\X'y;

NO!!!!!

Don't just blindly use backslash on the normal equations!!!!!
Once you form the normal equations, you have already
screwed yourself.

Sigh.

Don't use the normal equations to solve a linear regression
problem in the first place. Solve it as

  b = X\y;

Or, solve it as

  b = pinv(X)*y;

This is the correct way to solve the linear regression problem.
When you use the normal equations, as people seem to teach
each other, INCORRECTLY, to solve a linear regression problem,
you worsen the conditioning of any regression that you will
solve.

But BEFORE you do that, scale this data properly, as I show
how below.

So next, we must deal with that little, inconsequential factor
of 1e73 on your data! Look at the variable x! See the 1e73
that comes out front. It says 

> > x=
> > 1.0e+073 *
> > 
> >     6.0217

This means that this regression will generally return garbage,
unless you are careful.

Don't just throw such unscaled data into a linear algebra
problem, and worse, then throw the normal equations at it!
It is this that is causing the other part of your problem. 

b = [ones(300,1),x/1e73]\y;
b =
       4.9346
     0.079934

Only now should you rescale b to compensate for the
prior scaling of x.

format long g
b(2) = b(2) * 1e73
b =
          4.93457076016252
      7.99340988034286e+71

See that no warning message was generated now.

John