This is machine translation

Translated by Microsoft
Mouseover text to see original. Click the button below to return to the English version of the page.

Note: This page has been translated by MathWorks. Click here to see
To view all translated materials including this page, select Country from the country navigator on the bottom of this page.


Principal component analysis (PCA) on data

princomp has been removed. Use pca instead.


[COEFF,SCORE] = princomp(X)
[COEFF,SCORE,latent] = princomp(X)
[COEFF,SCORE,latent,tsquare] = princomp(X)
[...] = princomp(X,'econ')


COEFF = princomp(X) performs principal components analysis (PCA) on the n-by-p data matrix X, and returns the principal component coefficients, also known as loadings. Rows of X correspond to observations, columns to variables. COEFF is a p-by-p matrix, each column containing coefficients for one principal component. The columns are in order of decreasing component variance.

princomp centers X by subtracting off column means, but does not rescale the columns of X. To perform principal components analysis with standardized variables, that is, based on correlations, use princomp(zscore(X)). To perform principal components analysis directly on a covariance or correlation matrix, use pcacov.

[COEFF,SCORE] = princomp(X) returns SCORE, the principal component scores; that is, the representation of X in the principal component space. Rows of SCORE correspond to observations, columns to components.

[COEFF,SCORE,latent] = princomp(X) returns latent, a vector containing the eigenvalues of the covariance matrix of X.

[COEFF,SCORE,latent,tsquare] = princomp(X) returns tsquare, which contains Hotelling's T2 statistic for each data point.

The scores are the data formed by transforming the original data into the space of the principal components. The values of the vector latent are the variance of the columns of SCORE. Hotelling's T2 is a measure of the multivariate distance of each observation from the center of the data set.

When n <= p, SCORE(:,n:p) and latent(n:p) are necessarily zero, and the columns of COEFF(:,n:p) define directions that are orthogonal to X.

[...] = princomp(X,'econ') returns only the elements of latent that are not necessarily zero, and the corresponding columns of COEFF and SCORE, that is, when n <= p, only the first n-1. This can be significantly faster when p is much larger than n.


[1] Jackson, J. E., A User's Guide to Principal Components, John Wiley and Sons, 1991, p. 592.

[2] Jolliffe, I. T., Principal Component Analysis, 2nd edition, Springer, 2002.

[3] Krzanowski, W. J. Principles of Multivariate Analysis: A User's Perspective. New York: Oxford University Press, 1988.

[4] Seber, G. A. F., Multivariate Observations, Wiley, 1984.

Introduced before R2006a

Was this topic helpful?