Path: news.mathworks.com!not-for-mail
From: "Tom Lane" <tlane@mathworks.com>
Newsgroups: comp.soft-sys.matlab
Subject: Re: p-values
Date: Tue, 3 Nov 2009 13:04:41 -0500
Organization: The MathWorks, Inc
Message-ID: <hcprbp$sbg$1@fred.mathworks.com>
References: <f5889743-7afb-4534-ae77-66c0026c0ce2@c3g2000yqd.googlegroups.com> <hcnlmp$o19$1@fred.mathworks.com> <68f39d38-3dd5-4bce-a514-3950873fbccd@v36g2000yqv.googlegroups.com>
Reply-To: "Tom Lane" <tlane@mathworks.com>


> I understand. I have another question that goes a little deeper
> than this. Suppose I have two vectors x1 and x2 and another vector y,
> now if x1 and x2 are independent of each other (meaning corr(x1,x2) =
> 0, say), then I could find the correlation between each of my so-called
> "features" x1 and x2 and the "label" y separately in a straightforward
> fashion. However, my question is how to find the correlation if x1 and
> x2 are in fact dependent on each other. Wouldn't the correlations
> calculated as corr(x1,y) and corr(x2,y) be biased or incorrect in
> that case?

Arun, I don't think I understand your concern.

Suppose you are interested in corr(x1,y). I could always generate another x2 
that is either correlated with x1 or not. How would my doing that cause your 
correlation to become biased?

There is a notion of multiple correlation. Its squared value is the R^2 
statistic for a regression. It measures the correlation between y and the 
linear combination of the x's obtained by regressing y on the x's.
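
As a rough illustration of that relationship, here is a minimal sketch in plain Python (standard library only, not MATLAB). The simulated data, the seed, and the helper names mean, pearson and fitted_values are all made up for the example. It regresses y on two correlated predictors with an intercept and compares R^2 with the squared correlation between y and the fitted values.

```python
import math
import random


def mean(v):
    return sum(v) / len(v)


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / math.sqrt(saa * sbb)


def fitted_values(x1, x2, y):
    """Least-squares fit of y on x1, x2 and an intercept (hypothetical helper).

    Works on centered data and solves the 2x2 normal equations directly.
    """
    m1, m2, my = mean(x1), mean(x2), mean(y)
    c1 = [p - m1 for p in x1]
    c2 = [p - m2 for p in x2]
    cy = [p - my for p in y]
    s11 = sum(p * p for p in c1)
    s22 = sum(p * p for p in c2)
    s12 = sum(p * q for p, q in zip(c1, c2))
    s1y = sum(p * q for p, q in zip(c1, cy))
    s2y = sum(p * q for p, q in zip(c2, cy))
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return [my + b1 * p + b2 * q for p, q in zip(c1, c2)]


# Simulated data (an assumption for illustration): x2 is correlated with x1.
rng = random.Random(0)
n = 200
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [0.8 * a + 0.6 * rng.gauss(0, 1) for a in x1]
y = [a + b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

y_hat = fitted_values(x1, x2, y)
y_bar = mean(y)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
sst = sum((yi - y_bar) ** 2 for yi in y)

r_squared = 1 - sse / sst          # R^2 of the regression
multiple_r = pearson(y, y_hat)     # multiple correlation

print("R^2:", r_squared)
print("corr(y, fitted)^2:", multiple_r ** 2)
```

The two printed values should agree up to rounding error, and this holds whether or not x1 and x2 are correlated with each other.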

There's also the notion of the partial correlation, where you measure the 
correlation between two variables after "removing" the effect of another 
variable.
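
One common way to make "removing" concrete is to regress each of the two variables on the third and correlate the residuals. The sketch below does that in plain Python (standard library only, not MATLAB). The simulated data and the helper names mean, pearson and residuals are assumptions for the example. Here y is built to depend on x2 only, and x2 is correlated with x1.

```python
import math
import random


def mean(v):
    return sum(v) / len(v)


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / math.sqrt(saa * sbb)


def residuals(v, z):
    """Residuals from a simple least-squares regression of v on z with intercept."""
    mv, mz = mean(v), mean(z)
    slope = (sum((zi - mz) * (vi - mv) for zi, vi in zip(z, v))
             / sum((zi - mz) ** 2 for zi in z))
    return [(vi - mv) - slope * (zi - mz) for vi, zi in zip(v, z)]


# Simulated data (an assumption for illustration).
rng = random.Random(1)
n = 500
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [0.8 * a + 0.6 * rng.gauss(0, 1) for a in x1]
y = [b + rng.gauss(0, 1) for b in x2]   # y depends on x2 only

raw = pearson(x1, y)

# Partial correlation of x1 and y given x2, via residuals.
partial_resid = pearson(residuals(x1, x2), residuals(y, x2))

# The same quantity from the three pairwise correlations.
r12 = pearson(x1, x2)
r2y = pearson(x2, y)
partial_formula = (raw - r12 * r2y) / math.sqrt((1 - r12 ** 2) * (1 - r2y ** 2))

print("corr(x1, y):", raw)
print("partial corr(x1, y | x2), residuals:", partial_resid)
print("partial corr(x1, y | x2), formula:", partial_formula)
```

In this setup corr(x1,y) is a perfectly valid estimate of the marginal association between x1 and y; the partial correlation simply answers a different question, namely how much association is left once x2 is accounted for.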

I'm not sure if these two things are related to your concern, though.

-- Tom