Thread Subject:
Interpreting PCA results: correlation coeff close to zero. but both

Subject: Interpreting PCA results: correlation coeff close to zero. but both

From: HL

Date: 9 Jul, 2011 00:12:40

Message: 1 of 2

Hi All,

I am having trouble interpreting a PCA result. Since many people use
MATLAB for PCA, I hope I can get some advice here.

Here is the correlation coefficient matrix of 6 time series:

         VAR 1     VAR 2     VAR 3     VAR 4     VAR 5     VAR 6
VAR 1    1         0.86924   0.72059   0.93435   0.94902   0.60563
VAR 2    0.86924   1         0.56722   0.82527   0.86984   0.74556
VAR 3    0.72059   0.56722   1         0.81599   0.71481   0.04097
VAR 4    0.93435   0.82527   0.81599   1         0.90632   0.54128
VAR 5    0.94902   0.86984   0.71481   0.90632   1         0.68061
VAR 6    0.60563   0.74556   0.04097   0.54128   0.68061   1

Note that the correlation coefficient between VAR3 and VAR6 is small,
almost zero (0.04).

I can compute the eigenvalues and eigenvectors:

Eigenvectors
         F1         F2          F3         F4         F5         F6
VAR 1    -0.44868   0.044975    0.22607    0.66658    0.39593    0.38009
VAR 2    -0.42826   -0.19755    -0.86091   0.0869     0.040701   -0.16492
VAR 3    -0.34321   0.64797     -0.11832   -0.30814   -0.34259   0.48583
VAR 4    -0.44316   0.16762     0.23468    -0.49622   0.56209    -0.39782
VAR 5    -0.45069   -0.019026   0.31936    0.24333    -0.62144   -0.49912
VAR 6    -0.31301   -0.71458    0.19148    -0.38436   -0.14859   0.43002

Eigenvalues
         4.6795     1.022       0.13623    0.081882   0.075441   0.0049339


Squaring each element of the eigenvector matrix and weighting it by the
respective eigenvalue gives the squared factor loading matrix:
         F1        F2           F3          F4           F5           F6
VAR 1    0.94205   0.0020673    0.0069622   0.036382     0.011826     0.0007128
VAR 2    0.85827   0.039884     0.10097     0.00061834   0.00012497   0.00013419
VAR 3    0.5512    0.4291       0.0019073   0.0077745    0.0088546    0.0011645
VAR 4    0.919     0.028715     0.0075029   0.020162     0.023835     0.00078086
VAR 5    0.95052   0.00036995   0.013894    0.0048483    0.029134     0.0012291
VAR 6    0.45847   0.52186      0.0049948   0.012097     0.0016658    0.00091238
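
The squaring-and-weighting step can be sketched in a few lines of plain Python. This is only an illustration, not the poster's MATLAB code: the names `V`, `eigvals`, `sq_loadings`, and `row_sums` are made up here, and the numbers are copied from the tables in this post. Because the input is a correlation matrix (unit diagonal), each row of squared loadings should sum to roughly 1, up to rounding in the quoted values.

```python
# Eigenvectors (rows = VAR 1..6, columns = F1..F6), copied from the post.
V = [
    [-0.44868,  0.044975,  0.22607,  0.66658,  0.39593,   0.38009],
    [-0.42826, -0.19755,  -0.86091,  0.0869,   0.040701, -0.16492],
    [-0.34321,  0.64797,  -0.11832, -0.30814, -0.34259,   0.48583],
    [-0.44316,  0.16762,   0.23468, -0.49622,  0.56209,  -0.39782],
    [-0.45069, -0.019026,  0.31936,  0.24333, -0.62144,  -0.49912],
    [-0.31301, -0.71458,   0.19148, -0.38436, -0.14859,   0.43002],
]
# Eigenvalues for F1..F6, copied from the post.
eigvals = [4.6795, 1.022, 0.13623, 0.081882, 0.075441, 0.0049339]

# Squared loading of variable i on factor k: eigenvalue_k * V[i][k]**2.
sq_loadings = [[lam * v * v for v, lam in zip(row, eigvals)] for row in V]

# Share of each variable's (unit) variance explained by all six factors.
row_sums = [sum(row) for row in sq_loadings]

for i, (row, total) in enumerate(zip(sq_loadings, row_sums), start=1):
    cells = "  ".join(f"{x:10.6f}" for x in row)
    print(f"VAR {i}  {cells}  | sum = {total:.4f}")
```

The F2 column of the output reproduces the roughly 43% and 52% figures for VAR3 and VAR6.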

Note that VAR3 and VAR6 each project almost 50% of their respective
variances onto the F2 direction. I cannot make sense of this, as their
correlation coefficient is almost zero.

Thank you very much for your input and advice.

Chong

Subject: Interpreting PCA results: correlation coeff close to zero. but both

From: Roger Stafford

Date: 9 Jul, 2011 05:57:09

Message: 2 of 2

HL <highlanderda@gmail.com> wrote in message <86daff2a-9ad7-4ad3-96e4-655373fcb871@g16g2000yqg.googlegroups.com>...
> I have trouble in interpreting PCA result. As I know a lot of people
> do PCA use matlab, I hope I can get some advice here.
> .......... SNIP .........
> note that the correlation coeff for VAR3 and VAR6 is small, almost
> zero (0.04)
> .......... SNIP ........
> Note that VAR3 and VAR6 each has project almost 50% of its respective
> variances
> on the F2 direction. I cannot make sense of this as there correlation
> coefficient is almost zero.
- - - - - - - - - - -
  This is not necessarily remarkable. Here's an artificial situation I've cooked up to illustrate the point. Suppose we have two pennies and four nickels which we flip simultaneously. However, these are very special coins that have the following strange, highly correlated statistics.

prob.   pennies   nickels    temperature (not entered into correlation matrix)
 1/4    H H       H H H H    warm
 1/4    T T       T T T T    warm
 1/8    H T       H H H H    cool
 1/8    T H       H H H H    cool
 1/8    H T       T T T T    cool
 1/8    T H       T T T T    cool

No other combinations can occur. In other words, half the time it is warm and the coins are all alike, and half the time it is cool and the pennies are opposite one another with the nickels still all alike. If we score heads as +1 and tails as -1, it can be seen that the correlation matrix would be this:

A = 1.0000        0   0.5000   0.5000   0.5000   0.5000
         0   1.0000   0.5000   0.5000   0.5000   0.5000
    0.5000   0.5000   1.0000   1.0000   1.0000   1.0000
    0.5000   0.5000   1.0000   1.0000   1.0000   1.0000
    0.5000   0.5000   1.0000   1.0000   1.0000   1.0000
    0.5000   0.5000   1.0000   1.0000   1.0000   1.0000

with the two pennies having a correlation of zero. The eigenvalues and corresponding eigenvectors of A would be

    4.5616 1.0000 0.4384 0.0000 -0.0000 0.0000

    0.2610 0.7071 0.6572 -0.0000 -0.0000 0.0000
    0.2610 -0.7071 0.6572 -0.0000 -0.0000 0.0000
    0.4647 0.0000 -0.1845 0.5888 -0.6280 -0.0943
    0.4647 0.0000 -0.1845 -0.7949 -0.3416 -0.0373
    0.4647 -0.0000 -0.1845 0.0943 0.5817 -0.6346
    0.4647 0.0000 -0.1845 0.1118 0.3879 0.7662
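
For anyone who wants to check these numbers without MATLAB, here is a rough sketch in plain Python (standard library only). It rebuilds the correlation matrix from the probability table and then verifies two of the eigenpairs by direct multiplication instead of calling an eigen-solver. The names `outcomes`, `expect`, `matvec`, `v1`, and `v2` are made up for this illustration. The closed form for the largest eigenvalue rests on an assumption spelled out in the comments: for vectors of the form (a, a, b, b, b, b), A*v = lambda*v reduces to lambda*a = a + 2b and lambda*b = a + 4b, so lambda^2 - 5*lambda + 2 = 0.

```python
from fractions import Fraction
from math import sqrt

# (probability, outcome) rows from the table: 2 pennies then 4 nickels.
outcomes = [
    (Fraction(1, 4), "HHHHHH"),
    (Fraction(1, 4), "TTTTTT"),
    (Fraction(1, 8), "HTHHHH"),
    (Fraction(1, 8), "THHHHH"),
    (Fraction(1, 8), "HTTTTT"),
    (Fraction(1, 8), "THTTTT"),
]
score = {"H": 1, "T": -1}  # heads = +1, tails = -1
n = 6

def expect(f):
    """Expected value of f(scores) under the table's probabilities."""
    return sum(p * f([score[c] for c in s]) for p, s in outcomes)

mean = [expect(lambda x, i=i: x[i]) for i in range(n)]
cov = [[expect(lambda x, i=i, j=j: x[i] * x[j]) - mean[i] * mean[j]
        for j in range(n)] for i in range(n)]
A = [[float(cov[i][j]) / sqrt(float(cov[i][i] * cov[j][j]))
      for j in range(n)] for i in range(n)]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Second eigenvector quoted above: only the pennies, with opposite signs.
v2 = [1 / sqrt(2), -1 / sqrt(2), 0.0, 0.0, 0.0, 0.0]
Av2 = matvec(A, v2)  # should equal 1.0 * v2

# Largest eigenvalue from lambda^2 - 5*lambda + 2 = 0, with its eigenvector.
lam1 = (5 + sqrt(17)) / 2
v1 = [2.0, 2.0] + [lam1 - 1] * 4  # unnormalized (a, a, b, b, b, b)
residual1 = max(abs(y - lam1 * x) for y, x in zip(matvec(A, v1), v1))

for row in A:
    print("  ".join(f"{x:6.4f}" for x in row))
print("A*v2 =", [round(x, 4) for x in Av2])
print("lambda1 =", round(lam1, 4), " residual =", residual1)
```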

  Notice that even though there is zero correlation between the two pennies in the original correlation matrix, in the second eigenvector the variation is entirely ascribable to the pennies which have opposite signs. The eigenvector analysis has managed to divide up the two kinds of results above which correspond to the temperature without ever seeing the original probability statistics (or a thermometer.) In the one the pennies are both alike and in the other they are opposite.
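
A small sketch of that separation, again in plain Python and purely as an illustration: each possible outcome is projected onto the second eigenvector, taken here as (1, -1, 0, 0, 0, 0)/sqrt(2), which matches the column quoted above up to rounding. The names `outcomes`, `v2`, and `projections` are invented for this example.

```python
from math import sqrt

# Every outcome that can occur, labelled with its temperature.
outcomes = [
    ("warm", "HHHHHH"),
    ("warm", "TTTTTT"),
    ("cool", "HTHHHH"),
    ("cool", "THHHHH"),
    ("cool", "HTTTTT"),
    ("cool", "THTTTT"),
]
score = {"H": 1.0, "T": -1.0}  # heads = +1, tails = -1

# Second eigenvector: weights only on the two pennies, with opposite signs.
v2 = [1 / sqrt(2), -1 / sqrt(2), 0.0, 0.0, 0.0, 0.0]

# Score of each outcome along the second principal direction.
projections = [
    (temp, coins, sum(w * score[c] for w, c in zip(v2, coins)))
    for temp, coins in outcomes
]

for temp, coins, proj in projections:
    print(f"{temp}  {coins}  projection on F2 = {proj:+.4f}")
```

Warm outcomes land at zero on this axis and cool outcomes land away from zero, so the magnitude of the second component recovers the temperature even though temperature never entered the correlation matrix.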

  I claim this is similar to the case you describe. It is possible to be deceived by looking only at an original correlation matrix. There may be strong underlying relationships between variables that have small correlation values.
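
  To make the parallel concrete with the numbers from the original post, here is a rough sketch (plain Python, with made-up names `var3`, `var6`, and `contributions`). It uses the standard identity that a symmetric matrix equals the sum over k of eigenvalue_k times the product of the corresponding eigenvector entries, so the VAR3/VAR6 correlation splits into one term per factor. The eigenvector rows and eigenvalues are copied from the original post and are rounded, so the total only approximately matches the quoted 0.04097.

```python
# Eigenvector rows for VAR 3 and VAR 6 (columns F1..F6), from the original post.
var3 = [-0.34321,  0.64797, -0.11832, -0.30814, -0.34259, 0.48583]
var6 = [-0.31301, -0.71458,  0.19148, -0.38436, -0.14859, 0.43002]
eigvals = [4.6795, 1.022, 0.13623, 0.081882, 0.075441, 0.0049339]

# Contribution of factor k to corr(VAR3, VAR6): eigenvalue_k * v3k * v6k.
contributions = [lam * a * b for lam, a, b in zip(eigvals, var3, var6)]
total = sum(contributions)

for k, c in enumerate(contributions, start=1):
    print(f"F{k}: {c:+.5f}")
print(f"sum: {total:+.5f}  (correlation matrix entry: 0.04097)")
```

  The F1 term is about +0.50 and the F2 term about -0.47. The two variables share the common factor F1 but load on F2 with opposite signs, and the two large contributions nearly cancel, leaving a correlation near zero.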

Roger Stafford
