canoncorr

Canonical correlation

Syntax

[A,B] = canoncorr(X,Y)
[A,B,r] = canoncorr(X,Y)
[A,B,r,U,V] = canoncorr(X,Y)
[A,B,r,U,V,stats] = canoncorr(X,Y)

Description

[A,B] = canoncorr(X,Y) computes the sample canonical coefficients for the n-by-d1 and n-by-d2 data matrices X and Y. X and Y must have the same number of observations (rows) but can have different numbers of variables (columns). A and B are d1-by-d and d2-by-d matrices, where d = min(rank(X),rank(Y)). The jth columns of A and B contain the canonical coefficients, i.e., the linear combination of variables making up the jth canonical variable for X and Y, respectively. Columns of A and B are scaled to make the covariance matrices of the canonical variables the identity matrix (see U and V below). If X or Y is less than full rank, canoncorr gives a warning and returns zeros in the rows of A or B corresponding to dependent columns of X or Y.

[A,B,r] = canoncorr(X,Y) also returns a 1-by-d vector containing the sample canonical correlations. The jth element of r is the correlation between the jth columns of U and V (see below).

[A,B,r,U,V] = canoncorr(X,Y) also returns the canonical variables, scores. U and V are n-by-d matrices computed as

U = (X-repmat(mean(X),N,1))*A
V = (Y-repmat(mean(Y),N,1))*B

[A,B,r,U,V,stats] = canoncorr(X,Y) also returns a structure stats containing information relating to the sequence of d null hypotheses H0(k), that the (k+1)st through dth correlations are all zero, for k = 0:(d-1). stats contains seven fields, each a 1-by-d vector with elements corresponding to the values of k, as described in the following table:

FieldDescription
Wilks

Wilks' lambda (likelihood ratio) statistic

df1

Degrees of freedom for the chi-squared statistic, and the numerator degrees of freedom for the F statistic

df2

Denominator degrees of freedom for the F statistic

F

Rao's approximate F statistic for H0(k)

pF

Right-tail significance level for F

chisq

Bartlett's approximate chi-squared statistic for H0(k) with Lawley's modification

pChisq

Right-tail significance level for chisq

stats has two other fields (dfe and p) which are equal to df1 and pChisq, respectively, and exist for historical reasons.

Examples

expand all

Compute Sample Canonical Correlation

Load the sample data.

load carbig;
X = [Displacement Horsepower Weight Acceleration MPG];
nans = sum(isnan(X),2) > 0;

Compute the sample canonical correlation.

[A B r U V] = canoncorr(X(~nans,1:3),X(~nans,4:5));

Plot the canonical variables scores.

plot(U(:,1),V(:,1),'.')
xlabel('0.0025*Disp+0.020*HP-0.000025*Wgt')
ylabel('-0.17*Accel-0.092*MPG')

References

[1] Krzanowski, W. J. Principles of Multivariate Analysis: A User's Perspective. New York: Oxford University Press, 1988.

[2] Seber, G. A. F. Multivariate Observations. Hoboken, NJ: John Wiley & Sons, Inc., 1984.

See Also

|

Was this topic helpful?