Correlation coefficients
r = corrcoef(X)
r = corrcoef(X,Y)
X | Matrix where each row is an observation and each column is a variable. |
Y | Matrix where each row is an observation and each column is a variable. |
corrcoef for financial time series objects is based on the MATLAB® corrcoef function. See corrcoef in the MATLAB documentation.
r=corrcoef(X) calculates a matrix r of correlation coefficients for an array X, in which each row is an observation and each column is a variable.
r=corrcoef(X,Y), where X and Y are column vectors, is the same as r=corrcoef([X Y]). corrcoef converts X and Y to column vectors if they are not; that is, r = corrcoef(X,Y) is equivalent to r=corrcoef([X(:) Y(:)]) in that case.
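As a quick check of the equivalence described above, the following sketch compares the two-input form with the single-input form. The data and variable names are made up for illustration, and plain numeric arrays are assumed here in place of financial time series objects.

```matlab
% Illustrative data (not from this page): two row vectors of equal length.
X = [1 2 3 4 5];
Y = [2 4 5 4 5];

% Two-input form: X and Y are converted to column vectors.
r1 = corrcoef(X, Y);

% Single-input form with the columns built explicitly.
r2 = corrcoef([X(:) Y(:)]);

% Both results are 2-by-2 matrices and should agree to rounding error.
maxDiff = max(abs(r1(:) - r2(:)));
```

For this data the off-diagonal element works out to 6/sqrt(60), roughly 0.7746, in both cases.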
If c is the covariance matrix, c = cov(X), then corrcoef(X) is the matrix whose (i,j)th element is c(i,j)/sqrt(c(i,i)*c(j,j)).
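The relationship to the covariance matrix can be verified numerically. The sketch below rebuilds the correlation matrix from cov(X) and compares it with the output of corrcoef. The sample matrix is hypothetical and a plain numeric array is assumed.

```matlab
% Hypothetical data: 5 observations (rows) of 3 variables (columns).
X = [1 2 0.5;
     2 1 1.5;
     3 5 2.0;
     4 3 4.0;
     5 6 3.5];

c = cov(X);              % covariance matrix
d = sqrt(diag(c));       % standard deviation of each column

% Element (i,j) is c(i,j)/sqrt(c(i,i)*c(j,j)).
rManual = c ./ (d * d');

r = corrcoef(X);

% The two matrices should differ only by rounding error.
maxDiff = max(abs(r(:) - rManual(:)));
```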
[r,p]=corrcoef(...) also returns p, a matrix of p-values for testing the hypothesis of no correlation. Each p-value is the probability of getting a correlation as large as the observed value by random chance, when the true correlation is zero. If p(i,j) is less than 0.05, then the correlation r(i,j) is considered statistically significant at the 5% level.
[r,p,rlo,rup]=corrcoef(...) also returns matrices rlo and rup, of the same size as r, containing lower and upper bounds for a 95% confidence interval for each coefficient.
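A minimal sketch of the four-output form follows, using a hypothetical data matrix and a plain numeric array. It lists the variable pairs whose p-value falls below 0.05; the variable names are illustrative only.

```matlab
% Hypothetical data: 8 observations (rows) of 3 variables (columns).
X = [1 2 3;
     2 1 1;
     3 5 6;
     4 3 4;
     5 6 2;
     6 8 7;
     7 7 5;
     8 9 9];

% r: correlations, p: p-values, rlo/rup: 95% confidence bounds.
[r, p, rlo, rup] = corrcoef(X);

% Index pairs (i,j), i < j, whose correlation has a p-value below 0.05.
[i, j] = find(triu(p < 0.05, 1));
significantPairs = [i j];
```

Each row of significantPairs names two columns of X. The corresponding interval for that pair is [rlo(i,j), rup(i,j)].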
[...]=corrcoef(...,'PARAM1',VAL1,'PARAM2',VAL2,...) specifies additional parameters and their values. Valid parameters are:
'alpha' — A number between 0 and 1 to specify a confidence level of 100*(1-ALPHA)%. Default is 0.05 for 95% confidence intervals.
'rows' — Either 'all' (default) to use all rows, 'complete' to use rows with no NaN values, or 'pairwise' to compute r(i,j) using rows with no NaN values in column i or j.
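The effect of the two parameters can be seen on data that contains missing values. The sketch below uses a hypothetical matrix with two NaN entries and assumes a plain numeric array rather than a financial time series object.

```matlab
% Hypothetical data with missing values in columns 2 and 3.
X = [1 2   3;
     2 NaN 1;
     3 5   NaN;
     4 3   4;
     5 6   2;
     6 8   7;
     7 7   5;
     8 9   9];

% Default ('all'): every row is used, so entries that involve a
% column containing NaN come out as NaN.
rAll = corrcoef(X);

% 'complete': only the six rows with no NaN values are used.
rComplete = corrcoef(X, 'rows', 'complete');

% 'pairwise': r(i,j) uses every row with no NaN in column i or j.
rPairwise = corrcoef(X, 'rows', 'pairwise');

% 'alpha' = 0.01 requests 99% confidence intervals.
[r, p, rlo, rup] = corrcoef(X, 'rows', 'complete', 'alpha', 0.01);
```

Here r equals rComplete, because 'alpha' changes only the width of the confidence interval, not the coefficients themselves.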
The p-value is computed by transforming the correlation to create a t-statistic having N – 2 degrees of freedom, where N is the number of rows of X. The confidence bounds are based on an asymptotic normal distribution of 0.5*log((1 + r)/(1 – r)), with an approximate variance equal to 1/(N – 3). These bounds are accurate for large samples when X has a multivariate normal distribution. The 'pairwise' option can produce an r matrix that is not positive definite.
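The formulas above can be sketched by hand for a single pair of variables and compared with the function output. This example uses only base functions (betainc for the two-sided t-distribution tail probability and erfcinv for the normal quantile). That choice is an assumption made for illustration, not a statement about how corrcoef is implemented internally, and the data is hypothetical.

```matlab
% Hypothetical data: two column vectors with N = 10 observations.
x = (1:10)';
y = [2 1 4 3 7 5 8 6 10 9]';
N = numel(x);

[r, p, rlo, rup] = corrcoef(x, y);
r12 = r(1,2);

% t-statistic with N - 2 degrees of freedom.
df = N - 2;
t  = r12 * sqrt(df / (1 - r12^2));

% Two-sided p-value: P(|T| > |t|) for a t distribution with df degrees
% of freedom, written with the regularized incomplete beta function.
pManual = betainc(df / (df + t^2), df/2, 0.5);

% Fisher z-transform, 0.5*log((1 + r)/(1 - r)), variance about 1/(N - 3).
z     = 0.5 * log((1 + r12) / (1 - r12));
alpha = 0.05;
zCrit = sqrt(2) * erfcinv(alpha);     % standard normal quantile at 1 - alpha/2
halfWidth = zCrit / sqrt(N - 3);

% Transform the bounds back to the correlation scale.
rloManual = tanh(z - halfWidth);
rupManual = tanh(z + halfWidth);
```

With this data, pManual, rloManual, and rupManual should match p(1,2), rlo(1,2), and rup(1,2) to within rounding error.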