'pca' vs 'svd' or 'eig' functions

Hi,
I am trying to compute the principal components of a data set. However, I get entirely different results from the 'pca' function than from the 'eig' function. The 'eig' and 'svd' functions give the same results as each other for my data.
I pass the raw data directly to the 'pca' function.
For 'eig', I first calculate the correlation matrix and then pass that to the 'eig' function.
I am puzzled as to why the results differ and would be grateful for your help! Code below:
testmat = rand(20,5);
testcorrelMat = corr(testmat);
testeig = eig(testcorrelMat);
testsvd = svd(testcorrelMat);
[testcoeff, ~, testlatent] = pca(testmat);
[sort(testsvd), sort(testeig), sort(testlatent)]

Accepted Answer

the cyclist on 16 Mar 2021
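You will get the same result from pca() if you standardize the input data first. By default, pca() centers the data but does not rescale it, so on raw data its latent output holds the eigenvalues of the covariance matrix rather than the correlation matrix: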
rng default
testmat = rand(20,5);
% Standardize the data
testmat = (testmat - mean(testmat))./std(testmat);
testcorrelMat = corr(testmat);
testeig = eig(testcorrelMat);
testsvd = svd(testcorrelMat);
[testcoeff, ~, testlatent] = pca(testmat);
[sort(testsvd), sort(testeig), sort(testlatent)]
ans = 5×3
    0.2238    0.2238    0.2238
    0.6422    0.6422    0.6422
    0.8504    0.8504    0.8504
    1.4606    1.4606    1.4606
    1.8229    1.8229    1.8229
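To see where the original mismatch comes from, here is a minimal sketch (not part of the answer above) that leaves the data unstandardized. It assumes the Statistics and Machine Learning Toolbox is available, and the variable names are only illustrative. It checks two things: that pca() on raw data reproduces the eigenvalues of the covariance matrix, and that the 'VariableWeights','variance' option of pca() reproduces the eigenvalues of the correlation matrix without standardizing by hand.

```matlab
rng default                      % reproducible random data
X = rand(20,5);                  % raw, unstandardized data

% pca centers the columns but does not rescale them, so latent should
% match the eigenvalues of the covariance matrix.
[~, ~, latentRaw] = pca(X);
eigCov = sort(eig(cov(X)), 'descend');   % pca returns latent in descending order

% Weighting each variable by its inverse variance amounts to running
% PCA on the correlation matrix.
[~, ~, latentCorr] = pca(X, 'VariableWeights', 'variance');
eigCorr = sort(eig(corr(X)), 'descend');

% Largest absolute discrepancy in each comparison
maxDiffCov  = max(abs(latentRaw  - eigCov))
maxDiffCorr = max(abs(latentCorr - eigCorr))
```

Both differences should be on the order of floating-point round-off. Note that with 'VariableWeights' the returned coefficients are not orthonormal, so this shortcut is most convenient when only the variances (latent) are needed.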
  2 Comments
Steven Lord on 16 Mar 2021
To standardize the data you can use the normalize function, whose default method is 'zscore'.
rng default
testmat = rand(20,5);
% Standardize the data
testmat = normalize(testmat);
testcorrelMat = corr(testmat);
testeig = eig(testcorrelMat);
testsvd = svd(testcorrelMat);
[testcoeff, ~, testlatent] = pca(testmat);
results = [sort(testsvd), sort(testeig), sort(testlatent)]
results = 5×3
    0.2238    0.2238    0.2238
    0.6422    0.6422    0.6422
    0.8504    0.8504    0.8504
    1.4606    1.4606    1.4606
    1.8229    1.8229    1.8229
format longg
results - results(:, 1)
ans = 5×3
    0     1.11022302462516e-16    -1.94289029309402e-16
    0     4.44089209850063e-16    -9.99200722162641e-16
    0    -1.11022302462516e-16     3.33066907387547e-16
    0    -1.33226762955019e-15    -1.55431223447522e-15
    0                        0    -8.88178419700125e-16
Looks pretty good to me.
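The same kind of check can be extended from the eigenvalues to the principal component directions themselves. The sketch below is an illustration rather than anything posted in this thread: it uses zscore (Statistics and Machine Learning Toolbox) as an alternative to normalize for releases that lack the latter, and the tolerance tol is an arbitrary, hypothetical choice. Because eig makes no promise about ordering and eigenvectors are only defined up to sign, the comparison sorts explicitly and uses absolute values.

```matlab
rng default
X = zscore(rand(20,5));          % standardize each column (zero mean, unit std)

[coeff, ~, latent] = pca(X);     % latent is returned in descending order

% Sort the eigenpairs of the correlation matrix into the same order.
[V, D] = eig(corr(X));
[eigVals, order] = sort(diag(D), 'descend');
V = V(:, order);

tol = 1e-10;                     % hypothetical tolerance, well above round-off

% Eigenvalues should agree directly; eigenvectors only up to a sign flip
% per column, so compare their absolute values.
valuesMatch  = all(abs(eigVals - latent) < tol)
vectorsMatch = all(all(abs(abs(V) - abs(coeff)) < tol))
```

Both flags should come back true as long as the eigenvalues are well separated, as they are in the results shown above. If two eigenvalues were nearly equal, the corresponding eigenvectors would not be uniquely determined and the column-by-column comparison could fail even though both decompositions are valid.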
Pranav Aggarwal on 18 Mar 2021
Thanks Steven and 'the cyclist' - solved!

Release

R2017b
