score matrix is the principal components

OK, here is what I have understood about principal components in MATLAB. Suppose I use "princomp" on MxN data; it returns a score matrix which is also MxN. The first 3 columns of this score matrix give the principal components of the original data, which are also the eigenvectors corresponding to the largest eigenvalues of the covariance of the data.
Kindly correct me if my understanding is wrong.
Thanks

Accepted Answer

the cyclist on 29 Sep 2015
You are basically correct. It is not just the first 3 vectors that are principal components; all of the vectors together form the complete set of principal components. They are reported in decreasing order of variance explained (which is decreasing order of eigenvalue magnitude).
Also, you might want to use the pca command rather than princomp. According to the documentation, princomp may be removed in a future release.
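A minimal sketch of the pca call (the random data here is just for illustration; see the comments below this answer for what each output actually holds):
X = rand(100,4);                % 100 observations of 4 variables
[coeff,score,latent] = pca(X);  % coeff: component vectors (columns),
                                % score: data projected onto those components,
                                % latent: variances in decreasing order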
  3 Comments
the cyclist on 30 Sep 2015
Edited: the cyclist on 30 Sep 2015
If you read the documentation for cov, you'll see that the expected input (assuming a matrix input) has each column as a variable and each row as an observation of those variables. The ' in your second expression is the ctranspose operator, which takes the transpose of the matrix (and also takes the complex conjugate, but I am guessing you do not have complex entries). So, you would use that if your data array had rows as variables and columns as observations.
One other thing to keep in mind is that princomp(X) automatically subtracts the mean from each column before processing, so that will affect your calculation if you are trying to do some manipulations with the principal components. (This is described in the documentation.)
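A short sketch of the two orientations (the variable names are just illustrative):
X = rand(10,3);         % rows are observations, columns are variables
C = cov(X);             % 3x3 covariance of the variables, as cov expects
Y = X';                 % the same data, but with rows as variables
Cagain = cov(Y');       % transpose back so cov again sees columns as variables
isequal(C,Cagain)       % true: the two covariance matrices are identical
% Note that cov subtracts the column means itself, and pca/princomp also
% center the data before computing the components.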
the cyclist on 30 Sep 2015
I spent some time getting a little deeper into this. It is actually the "coeff" output that contains the principal component vectors. "score" is the projection of the original data onto the principal component axes.
I wrote the following code to try to solidify my understanding. I am not 100% certain that this is all correct, so be careful, but I think it is all good.
rng 'default'
M = 7; % Number of observations
N = 5; % Number of variables observed
X = rand(M,N);
% De-mean
X = bsxfun(@minus,X,mean(X));
% Do the PCA
[coeff,score,latent] = pca(X);
% Calculate eigenvalues and eigenvectors of the covariance matrix
covarianceMatrix = cov(X);
[V,D] = eig(covarianceMatrix);
% "coeff" are the principal component vectors. These are the eigenvectors of the covariance matrix. Compare ...
coeff
V
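% Note: eig does not guarantee the same ordering as pca (coeff is sorted by
% decreasing explained variance), and each eigenvector is only defined up to
% sign, so the columns of V may appear reordered and/or sign-flipped relative
% to coeff.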
% Multiply the original data by the principal component vectors to get the projections of the original data on the
% principal component vector space. This is also the output "score". Compare ...
dataInPrincipalComponentSpace = X*coeff
score
% The columns of X*coeff are orthogonal to each other. This is shown with ...
corrcoef(dataInPrincipalComponentSpace)
% The variances of these vectors are the eigenvalues of the covariance matrix, and are also the output "latent". Compare
% these three outputs
var(dataInPrincipalComponentSpace)'
latent
sort(diag(D),'descend')
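One further check that may help (the variable name "reconstructed" is just illustrative): because coeff is orthonormal, mapping the scores back with coeff' should recover the de-meaned data X from above.
reconstructed = score*coeff';
max(abs(reconstructed(:) - X(:)))  % differences are at machine precision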


