Varimax Rotation on 'COEFF' matrix output from princomp command giving strange output

I work at the Columbia University Earth Institute, and I need to troubleshoot the output I get when I conduct a varimax rotation on my PCA outputs using the 'princomp' and 'rotatefactors' commands:
My original data matrix "A" is a 59-by-34 matrix, where rows are observations and columns are variables. The variables are in different units and must be standardized. To do this I have used the following command: [COEFF, SCORE, latent] = princomp(zscore(A));
Based on the "rotatefactors" documentation, to conduct a varimax rotation I then pass my loadings matrix ("COEFF") to the command, i.e.: RotatedLoadings = rotatefactors(COEFF).
My question: the output from the rotatefactors command does not look correct at all. The matrix returned to me is a 34x34 matrix of mostly zeroes, with each column having a single "1". This is not the case when I use the same data in 'factoran', which gives me rotated loadings that (while I realize they should be slightly different) look far more reasonable. However, for this work I need to use PCA.
Can someone advise on this? Why is my rotated loadings matrix incorrect when I use the PCA/rotatefactors commands? What am I doing incorrectly?
Thank you in advance!

Accepted Answer

Peter Perkins on 7 Jun 2012
Kaitlin, I think this is an artifact of your using the maximal number of PCs. Varimax attempts to find a rotation of your PCs such that each one is strongly correlated with as few of the original variables as possible. But since you have 34 variables and 34 (orthogonal) PCs, that just means "each rotated PC is one of the original variables." Which is to say, you've found the inverse of princomp. Usually, you'd want to throw away unimportant PCs to reduce the dimensionality of the data. You may have reasons for keeping all 34, but there's not much point in doing PCA if you do.
You have 59 observations on 34 variables, and so the PC coef matrix is 34x34. Consider this simpler example with 3 variables:
Generate a data cloud
rng default
mu = [1 2 3];
T = randn(3); Sigma = T*T';
X = mvnrnd(mu,Sigma,10);
Get the PC coefs and verify that they're orthonormal
coefs = princomp(zscore(X))
coefs'*coefs
Make a biplot of the three original variables against the three PCs. Each vector represents one of the three original variables, each axis represents one of the three PCs. You can see that the vectors are perpendicular to each other (rotate the plot interactively to see it better). That's because the PCs are orthogonal.
biplot(coefs)
Rotate all three coefs, and verify that they're still orthonormal.
rotatedCoefs3 = rotatefactors(coefs(:,1:3),'Method','varimax')
rotatedCoefs3'*rotatedCoefs3
A biplot of the rotated PCs demonstrates that varimax has rotated the axes of the first biplot so that each PC lines up exactly with one of the original variables.
biplot(rotatedCoefs3)
If you do the same thing, but retain only two PCs, you'll see what you were expecting.
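As a sketch of that reduced case (assuming the coefs computed above), something like:
rotatedCoefs2 = rotatefactors(coefs(:,1:2),'Method','varimax')  % keep 2 of 3 PCs before rotating
biplot(rotatedCoefs2)
should show rotated loadings that are spread across the variables rather than collapsed onto single ones. Hope this helps.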
  8 Comments
Peter Perkins on 11 Jun 2012
There's a lot here, and it's complicated. Let me try to address at least some of what you ask.
* First of all, rotations.
A FA "solution" is unique only up to rotation -- for a fixed number of factors, any rigid rotation of one solution is another equally valid solution. The point of rotation is to come up with a solution whose factors can be explained in meaningful terms. Most often that means "each latent factor contributes heavily to one group of closely related measured variables." You seem to want to find a solution that is much more specific than that, where each latent factor contributes heavily to only one measured variable, and that doesn't seem realistic. unless you have one factor for each variable, which is pointless (and overparameterized in the FA model anyway).
A "full" PCA solution for P variables has P components, some of which typically contribute more than the others to the overall variation in the data (and if you have too few observations, some may contribute nothing). The components are ordered by the amount of variation the contribute to the data. If you rotate those components, you break that ordering. What you typically do with that full solution is to throw away most components and keep a very few that you can (for example) visualize, but that still "explain" most of the variation in the data. At that point, many people who come from the FA world want to rotate that reduced solution to be able interpret it. That's OK for interpretation, but with the caveat that you break the "variance ordering" by doing so. But once you have selected the number of components, you are free to rotate them, and indeed any rotation of the unrotated reduced solution is just as good at explaining your data (again, given the number of unrotated components).
So, in general, you would need to rotate either the PCA or the FA solution to get similar results; there is no reason to expect similar solutions without a rotation. The PROCRUSTES function can be helpful in that respect -- it can rotate/reflect your PCA coefs to a best fit of your FA loadings.
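For instance (a rough sketch, not a recipe; it assumes A is your 59-by-34 data matrix and that you keep 5 components/factors):
lambda = factoran(zscore(A),5);                                      % 34x5 FA loadings
coefs = princomp(zscore(A));                                         % 34x34 PC coefs
[~,alignedCoefs] = procrustes(lambda,coefs(:,1:5),'Scaling',false);  % rotate/reflect the PCs toward lambda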
* Different algorithms and criteria.
Consider what PCA does: first it finds the linear combination of your measured variables that has the largest variance. That's principal component 1 -- column 1 of the coefs matrix. Then it finds the linear combination that is orthogonal to the first and has the next largest variance. And so on, until you get P components. They are all orthogonal, so you can think of PCA as nothing more than a rotation of the original coordinate axes. It's easiest to think about this with an elongated cloud of points in 2-D or 3-D. The point, however, is to notice what might happen: if you have one variable that has much more variance than all the others, your first PC will essentially be that one variable.
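A quick illustration of that effect (my sketch, deliberately using unstandardized data):
rng default
Y = randn(100,3);
Y(:,1) = 10*Y(:,1);       % give variable 1 a much larger variance than the others
coefsY = princomp(Y)      % no zscore here: the first PC is essentially variable 1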
Now consider the FA model. It has the loadings matrix and the specific variances. It tries to fit the covariance matrix of your data with a matrix that looks like L*L' + Psi, where L is the (taller-than-wide) loadings matrix and Psi is the (diagonal) specific variances matrix. FA can find correlations among your variables, and then assign any left-over "individual" variation to Psi. So if you have one variable with much larger variance than the others, it may not appear in L at all, but only in Psi.
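In code, fitting that model to the small X from the earlier example looks something like this (a sketch; note that factoran fits the correlation matrix):
[L,Psi] = factoran(X,1);       % one-factor fit: 3x1 loadings L, 3x1 specific variances Psi
fittedCorr = L*L' + diag(Psi)  % should approximate corr(X)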
There's an added wrinkle in your case: because you ran PCA on standardized data, it's impossible for one variable to have a much larger variance than the others.
So yes, both PCA and FA return a "loadings matrix", and in both, each column represents a new variable constructed as a linear combination of the original variables. But how those variables contribute to the model can be quite different: the loadings in FA primarily describe the correlations among your original variables; the same is not necessarily true of the coefs from PCA.
What's going on in your data? I can't possibly answer that. Hope this helps, and best of luck.

