
Thread Subject: correlation matrix estimation

Subject: correlation matrix estimation

From: msmscarlatti@googlemail.com

Date: 16 May, 2008 20:33:42

Message: 1 of 4

I have written some code which calculates an EWMA estimate for the
correlation matrix of two time series. Suppose that X and Y are our
time series vectors of length N (say N daily observations of two
market interest rates) with X_1 the oldest observation and X_N the
most recent one.

First the code transforms X and Y by subtracting off the average, i.e.

X --> X - E(X)
Y --> Y - E(Y)

Then it concatenates X and Y forming an (N x 2) matrix, let's call it Z.
Then it multiplies the ith row by lambda^(N-i). The covariance matrix
is calculated by

Cov = (1-lambda) * Z' * Z

Finally, the correlation matrix is computed in the usual way by

Corr = S*Cov*S

where S is a diagonal matrix whose elements are given by
1/sqrt(Cov_ii).

My code seems to work fine for small N, but when I try it for larger
N (several hundred) I get nonsensical results. Have I missed
something obvious?

I include my MATLAB code below.

function [CovMat,CorrMat]=EWMACovariance(X,Y,lambda)
% COVARIANCE Computes covariances and correlations
n=size(X,1);
m=mean(X); % Compute the means
X=X-m(ones(n,1),:); % Subtract the means
m = mean(Y);
Y=Y-m(ones(n,1),:);
lambdaVector = zeros(n,1);
for i = 1:n
   lambdaVector(i) = lambda^(n-i);
end
X = [X Y];
lambdaMatrix = diag(lambdaVector);
X = lambdaMatrix*X;
CovMat=(1-lambda)*X'*X; % Compute the covariance
if nargout==2 % Compute the correlation, if requested
   s=sqrt(diag(CovMat));
   CorrMat=CovMat./(s*s');
end

Subject: Re: correlation matrix estimation

From: Roger Stafford

Date: 17 May, 2008 04:08:02

Message: 2 of 4

  There are a couple of aspects to your code that I question. First, it is my
understanding that in computing an exponentially-weighted moving average
(EWMA) covariance, the sum of the applied weights should be precisely 1.
This does not appear to be the case in your code. In every row [X(i),Y(i)] you
are first multiplying each of the two terms by lambda^(n-i) and then taking
the product of those two terms, which gives you lambda^(2*(n-i))*X(i)*Y(i), as
you calculate X'*X. Finally you multiply by just a single (1-lambda). If we
sum these weights for all n products we get

 (1-lambda)*(1 + lambda^2 + lambda^4 + ... + lambda^(2*(n-1)))

which, for large n, approaches 1/(1+lambda). That is about 1/2 when lambda is
close to 1, so it is very far from 1. It looks to me as though you should either be
initially multiplying the rows of [X,Y] by sqrt(lambda^(n-i)) or else waiting
until you have computed the individual products X(i)*Y(i) before multiplying
by lambda^(n-i). (The latter seems preferable.) However, even when this
correction is made, the sum is still not precisely 1. Instead of multiplying by
(1-lambda) at the last step, you should multiply by (1-lambda)/(1-lambda^n)
to get a sum of weights of exactly 1.
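
As a rough check of the weight sums discussed above, here is a minimal MATLAB sketch. The values of lambda and n and the variable names (k, wPosted, wFixed) are illustrative assumptions, not taken from the posted code.

```matlab
% Illustrative values only; lambda and n are assumptions, not from the thread.
lambda = 0.94;
n = 500;
k = (n-1:-1:0)';                 % exponent n-i for i = 1..n (oldest first)

% Weights as effectively applied in the posted code: rows are scaled by
% lambda^(n-i) before forming X'*X, so each product carries lambda^(2*(n-i)).
wPosted = (1 - lambda) * lambda.^(2*k);

% Weights lambda^(n-i) on each product, normalised by (1-lambda)/(1-lambda^n).
wFixed = (1 - lambda) / (1 - lambda^n) * lambda.^k;

sumPosted = sum(wPosted);        % close to 1/(1+lambda), not 1
sumFixed  = sum(wFixed);         % 1 up to rounding
fprintf('posted: %.6f  corrected: %.6f\n', sumPosted, sumFixed);
```

Only the second set of weights forms a proper weighted average of the products X(i)*Y(i).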

  My second worry is that, to be consistent with the EWMA "decay-with-time"
philosophy, the means you have computed should be done the same way,
namely with decaying weights that add up to 1. If a member of a time series
in computing covariance is to be given a low weight because it is old, then
why shouldn't it have the same low weight in determining a mean value? In
your code you used the ordinary MATLAB 'mean' function which applies equal
weights in obtaining means.
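
Both suggestions can be combined in a few lines. The following is only a sketch under stated assumptions: the toy series x and y, the value of lambda, and all variable names are made up for illustration, and the subtraction and elementwise product rely on implicit expansion (MATLAB R2016b or later).

```matlab
% Deterministic toy series; all values here are illustrative assumptions.
n = 300;
t = (1:n)';
x = sin(t/10) + 0.01*t;
y = cos(t/7)  + 0.02*t;
lambda = 0.94;

w = lambda.^(n - t);             % weight for observation i, oldest first
w = w / sum(w);                  % same as scaling by (1-lambda)/(1-lambda^n)

Z  = [x y];
mu = w' * Z;                     % 1x2 row of exponentially weighted means
Zc = Z - mu;                     % centre each column with the weighted means
CovMat = Zc' * (Zc .* w);        % sum over i of w(i) * Zc(i,:)' * Zc(i,:)
s = sqrt(diag(CovMat));
CorrMat = CovMat ./ (s * s');    % weighted correlation matrix
```

Here the weight is applied once per product rather than once per factor, and the same weights are used for the means and for the covariance.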

Roger Stafford

Subject: Re: correlation matrix estimation

From: vontressms@cs.com

Date: 19 May, 2008 14:50:38

Message: 3 of 4

Here is an R script to do it. You got the last bit wrong.
Mark

# generate some phoney data
require(mvtnorm)
mean <- c(1,2,3,4)
sigma <- matrix( data=c(3,1,1,1,
                        1,3,1,1,
                        1,1,3,1,
                        1,1,1,3),nrow=4)
n<-50
cols<-4
some.data<-rmvnorm(n,mean,sigma)

#get average
avg <- rowsum(some.data, rep(1,n))/n
#center data
centered <- some.data - kronecker(rep(1,n),avg)
#get covariance matrix
cov.mat <- t(centered)%*%centered/(n-1)
#test against R function for covariance matrix
test.cov <- cov.mat-cov(some.data)
test.cov

# get diagonals of covariance matrix, take sqrt, and invert
Id<-diag(cols)
diag(Id)<-1/sqrt(diag(cov.mat))

# multiply by Id on both sides.
corr.mat <- Id%*%cov.mat%*%Id
#test against built in function
test.cor <- corr.mat-cor(some.data)
test.cor

Subject: Re: correlation matrix estimation

From: Ray Koopman

Date: 20 May, 2008 08:51:24

Message: 4 of 4

Let Z be an N x 3 matrix in which row i = [x_i, y_i, 1]*sqrt(w_i).
For exponential weighting, take w_i = a^(N-i), with 0 < a < 1.
Let T = Z'Z. Then sum w_i = t_33,
the weighted means are
  m_1 = t_13/t_33 and m_2 = t_23/t_33,
the weighted variances are
  s_11 = t_11/t_33 - m_1^2 and s_22 = t_22/t_33 - m_2^2,
the weighted covariance is
  s_12 = t_12/t_33 - m_1*m_2,
and the weighted correlation is
  r_12 = s_12/sqrt(s_11*s_22).
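
The recipe above translates almost line for line into MATLAB. This is a sketch only: the toy series, the value a = 0.94, and the variable names are assumptions for illustration, and the scaling of Z by sqrt(w) relies on implicit expansion (R2016b or later).

```matlab
% Deterministic toy series; values are illustrative assumptions.
n = 300;
t = (1:n)';
x = sin(t/10) + 0.01*t;
y = cos(t/7)  + 0.02*t;
a = 0.94;

w = a.^(n - t);                          % w_i = a^(N-i)
Z = [x, y, ones(n,1)] .* sqrt(w);        % row i = [x_i, y_i, 1]*sqrt(w_i)
T = Z' * Z;                              % T(3,3) equals sum(w)

m1  = T(1,3) / T(3,3);                   % weighted means
m2  = T(2,3) / T(3,3);
s11 = T(1,1) / T(3,3) - m1^2;            % weighted variances
s22 = T(2,2) / T(3,3) - m2^2;
s12 = T(1,2) / T(3,3) - m1*m2;           % weighted covariance
r12 = s12 / sqrt(s11 * s22);             % weighted correlation
```

Because every quantity is divided by T(3,3), the weights never need to be normalised explicitly.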

If you want to adjust the variances and covariance for
degrees of freedom, analogous to dividing by n-1 instead of n,
multiply them by n'/(n'-1), where n' = (sum w_i)^2 / sum w_i^2.
Of course, this will not change the correlation.
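
For a feel of the size of this adjustment, here is a small MATLAB sketch of the effective sample size n'. The values of n and a and the names nEff, adj, wEq and nEffEq are illustrative assumptions.

```matlab
% Illustrative values only.
n = 300;
a = 0.94;
w = a.^(n - (1:n)');             % exponential weights, oldest first

nEff = sum(w)^2 / sum(w.^2);     % n' = (sum w_i)^2 / sum w_i^2
adj  = nEff / (nEff - 1);        % factor applied to s_11, s_22 and s_12

% With equal weights n' reduces to n, giving the familiar n/(n-1).
wEq    = ones(n,1);
nEffEq = sum(wEq)^2 / sum(wEq.^2);
```

For exponential weights and large n, n' is close to (1+a)/(1-a), so with a = 0.94 the estimate behaves roughly like one based on about 32 equally weighted observations.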
