Principal component analysis on covariance matrix
pcacov does not standardize V to have unit variances. To perform principal component analysis on standardized variables, use the correlation matrix R = V./(SD*SD'), where SD = sqrt(diag(V)), in place of V. To perform principal component analysis directly on the data matrix, use pca.
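As a minimal sketch of the standardization described above, the following builds the correlation matrix from a covariance matrix and passes it to pcacov. The matrix V here is a hypothetical 3-by-3 example with made-up values, not data from this page.

```matlab
% Hypothetical covariance matrix (illustrative values only).
V = [4 2 0.6; 2 9 1.5; 0.6 1.5 1];

SD = sqrt(diag(V));    % column vector of standard deviations
R = V./(SD*SD');       % correlation matrix; diagonal entries equal 1

% Principal component analysis on the standardized variables.
[coeffR,latentR,explainedR] = pcacov(R);
```

Because R has ones on its diagonal, the entries of latentR should sum to the number of variables (3 in this sketch).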
Perform Principal Component Analysis on Covariance Matrix
Create a covariance matrix from the hald data set.
load hald
covx = cov(ingredients);
Perform principal component analysis on the covariance matrix covx.
[coeff,latent,explained] = pcacov(covx)
coeff = 4×4

   -0.0678   -0.6460    0.5673    0.5062
   -0.6785   -0.0200   -0.5440    0.4933
    0.0290    0.7553    0.4036    0.5156
    0.7309   -0.1085   -0.4684    0.4844

latent = 4×1

  517.7969
   67.4964
   12.4054
    0.2372

explained = 4×1

   86.5974
   11.2882
    2.0747
    0.0397
The first component explains over 85% of the total variance. The first two components explain nearly 98% of the total variance.
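The percentages quoted above can be checked from the latent values printed in the example. This sketch assumes that explained is each variance expressed as a percentage of the total, which matches the description of explained on this page; the variable names are illustrative.

```matlab
% Principal component variances copied from the example output above.
latent = [517.7969; 67.4964; 12.4054; 0.2372];

% Each variance as a percentage of the total variance.
explainedCheck = 100 * latent / sum(latent);

% Running total: percentage of variance explained by the first k components.
cumulativeExplained = cumsum(explainedCheck);
```

The second entry of cumulativeExplained is about 97.9, which corresponds to the "nearly 98%" figure for the first two components.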
V — Covariance matrix
square, symmetric, positive semidefinite matrix
Covariance matrix, specified as a square, symmetric, positive semidefinite matrix.
coeff — Principal component coefficients
Principal component coefficients, returned as a matrix the same size as V. Each column of coeff contains coefficients for one principal component. The columns are in order of decreasing component variance.
latent — Principal component variances
Principal component variances, returned as a vector with length equal to size(coeff,1). The vector latent contains the eigenvalues of V.
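One way to see the eigenvalue relationship is to compare latent with the sorted output of eig. This is a sketch for checking results, not a description of how pcacov computes them internally; V is a hypothetical covariance matrix with made-up values.

```matlab
% Hypothetical covariance matrix (illustrative values only).
V = [4 2 0.6; 2 9 1.5; 0.6 1.5 1];

[~,latent] = pcacov(V);

% Eigenvalues of V, sorted from largest to smallest to match latent.
lambda = sort(eig(V),'descend');

% Largest absolute difference; expected to be near machine precision.
maxDiff = max(abs(latent - lambda));
```

The eigenvectors returned by eig may differ from the columns of coeff in sign, so a comparison of the coefficients would need to allow for that.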
explained — Percentage of total variance explained by each principal component
Percentage of the total variance explained by each principal component, returned as a vector the same size as latent. The entries in explained range from 0 (none of the variance is explained) to 100 (all of the variance is explained).
Calculate with arrays that have more rows than fit in memory.
pcacov and factoran do not work directly on tall arrays. Instead, use C = gather(cov(X)) to compute the covariance matrix of a tall array. Then, you can use pcacov or factoran to work on the in-memory covariance matrix. Alternatively, you can use pca directly on a tall array.
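The tall-array workflow described above might look like the following sketch. The tall array X is created from random in-memory data purely for illustration; in practice it would typically come from a datastore.

```matlab
% Hypothetical data: 1000 observations of 4 variables, converted to a tall array.
X = tall(randn(1000,4));

% cov on a tall array is evaluated lazily; gather brings the
% 4-by-4 covariance matrix into memory.
C = gather(cov(X));

% C is an ordinary in-memory matrix, so pcacov can be applied to it.
[coeff,latent,explained] = pcacov(C);
```

Only the small covariance matrix is held in memory here, which is why this route works even when X itself does not fit.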
For more information, see Tall Arrays for Out-of-Memory Data.