Path: news.mathworks.com!not-for-mail
From: <HIDDEN>
Newsgroups: comp.soft-sys.matlab
Subject: Re: slope of eigenvector
Date: Tue, 11 Aug 2009 20:32:19 +0000 (UTC)
Organization: The MathWorks, Inc.
Lines: 23
Message-ID: <h5skgj$g3q$1@fred.mathworks.com>
References: <h5si1s$2u6$1@fred.mathworks.com>
Reply-To: <HIDDEN>
NNTP-Posting-Host: webapp-03-blr.mathworks.com
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 8bit
X-Trace: fred.mathworks.com 1250022739 16506 172.30.248.38 (11 Aug 2009 20:32:19 GMT)
X-Complaints-To: news@mathworks.com
NNTP-Posting-Date: Tue, 11 Aug 2009 20:32:19 +0000 (UTC)
X-Newsreader: MATLAB Central Newsreader 1187260
Xref: news.mathworks.com comp.soft-sys.matlab:562519

"shinchan " <shinchan75034@gmail.com> wrote in message <h5si1s$2u6$1@fred.mathworks.com>...
> x=[.69 -1.31 .39 .09 1.29 .49 .19 -0.81 -.31 -.71]';
> y=[.49 -1.21 .99 .29 1.09 .79 -.31 -.81 -.31 -1.01]';
> 
> Given the data here, I understand that the first PCA axis (eigenvector) is a line that minimizes the square of the distance of each point to that line. So I use fminsearch to find the slope of this line:
> 
> slope  = fminsearch(@(param) (y - param*x)'*(y - param*x) + (x - y/param)'*(x - y/param), 1);  % slope = 1.0725.
> 
> 
> However, I can also use eig() to get my eigenvectors, and the slope I get is not the same as above.
> 
> [vec val]=eig(cov(x,y));
> 
> %the slope of 1st PCA is
>  slope = vec(2,2)/vec(1,2); % slope = 1.0845.
> 
> Did I miss something here? I can't seem to figure out why these two methods give me different slopes. I hope someone can help me out. Thanks.

  I can think of a couple of things amiss with your fminsearch formula.  First, you are minimizing, in effect, sum((y-p*x).^2)*(1+p^2)/p^2, where p is param, when you should be minimizing sum((y-p*x).^2)/(1+p^2).  The latter is the sum of squared perpendicular distances to the line through the origin with slope p, and is therefore proportional to the true mean squared distance to that line.
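
  For reference, the algebra behind those two expressions can be written out per data point as below. This is only a sketch of the reasoning; the symbol d_i (the perpendicular distance from the point (x_i, y_i) to the line y = p*x) is notation introduced here and does not appear in the original post.

```latex
\begin{align*}
(y_i - p x_i)^2 + \left(x_i - \frac{y_i}{p}\right)^2
  &= (y_i - p x_i)^2 + \frac{(p x_i - y_i)^2}{p^2}
   = (y_i - p x_i)^2 \, \frac{1 + p^2}{p^2}, \\
d_i^2 &= \frac{(y_i - p x_i)^2}{1 + p^2}.
\end{align*}
```

  The first line is what the posted objective sums over the points; the second is the squared perpendicular distance. The two differ by the factor (1+p^2)^2/p^2, which depends on p, so they are minimized at different slopes.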

  Secondly, the line you want does not necessarily run through the origin.  You should subtract the mean values of x and y from x and y, respectively, before using them in the above formula.  It is, after all, a two-parameter problem: the line's slope and its distance from the origin.
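
  Putting both corrections together, a minimal sketch might look like the following. The names xc, yc, perpCost, slopeFmin and slopeEig are illustrative only, and the data are taken as column vectors. For this particular data set the means of x and y already work out to zero, so the centering step changes little here, but it matters in general.

```matlab
% Data from the original post, as column vectors.
x = [.69 -1.31 .39 .09 1.29 .49 .19 -0.81 -.31 -.71]';
y = [.49 -1.21 .99 .29 1.09 .79 -.31 -.81 -.31 -1.01]';

% Center the data so that the fitted line passes through the centroid.
xc = x - mean(x);
yc = y - mean(y);

% Sum of squared perpendicular distances to the line yc = p*xc.
perpCost = @(p) sum((yc - p*xc).^2) / (1 + p^2);
slopeFmin = fminsearch(perpCost, 1);

% Slope of the eigenvector belonging to the largest eigenvalue.
[vec, val] = eig(cov(x, y));
[~, k] = max(diag(val));
slopeEig = vec(2,k) / vec(1,k);

fprintf('fminsearch slope: %.4f, eig slope: %.4f\n', slopeFmin, slopeEig);
```

  With this objective the two slopes should agree to within the tolerance of fminsearch. Picking the eigenvector by the largest eigenvalue, rather than hard-coding column 2, avoids relying on the order in which eig returns its results.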

Roger Stafford