Why is xcorr 'coef' used for correlation coefficients?

Question

Brian on 5 Dec 2011

https://www.mathworks.com/matlabcentral/answers/23118-why-xcorr-coef-is-used-by-correlation-coefficients

In reviewing questions and material on xcorr, it appears that for autocorrelation or cross-correlation coefficients, most responses suggest using the 'coef' option in xcorr. While this does give you a value between -1 and 1, I am not sure why this option is calculated as xcorr(a,b)/(norm(a)*norm(b)), where a and b are column vectors. In most cases I would think an unbiased correlation should be used?

In my limited understanding, it seems the correlation values should be unbiased, then normalized...

For example, for autocorrelation: xcorr(a,'unbiased')./var(a)

To illustrate my point, if I autocorrelate a sine function, I would expect the lagged correlation coefficient to vary between 1 and -1 every time the cycle realigns itself. But with the 'coef' option the correlation coefficient consistently decreases with lag. I realize this is because of how it is calculated, but I don't understand why it is calculated this way. Shouldn't the unbiased approach be used?

A simple example to illustrate this question:

    t=0:500;
    n=length(t);
    ts=5*sin(2*pi*(t./12));
    lags=-250:250;
    test1=xcov(ts,250,'coef');
    test2=xcov(ts,250,'unbiased')./var(ts);

    figure; plot(t,ts); xlabel('time'); ylabel('amplitude');
    figure; plot(lags,test2,'r'); hold on; plot(lags,test1);

Answer 1

Brian on 5 Dec 2011

https://www.mathworks.com/matlabcentral/answers/23118-why-xcorr-coef-is-used-by-correlation-coefficients#answer_30378

Hey Wayne,

Thanks again for your prompt response. I understand what you are saying about how MATLAB makes this calculation when the 'coef' option is used. What I am trying to better understand is whether it makes sense to say that the lagged sine function is less correlated as lags increase, as the normalized output ('coef' option) does. As the example showed, xcorr(ts,'coef') has a value of 0.76 at lag 120, even though the lagged time series are actually perfectly correlated if we compare the two time series in an unbiased way (i.e., lagging one of the two sine curves by 120 would leave both shortened time series overlaying each other). Shouldn't the process of normalizing the result be done in an unbiased way? Does this clarify what I am trying to get at?

Thanks again!

Brian Dz

1 Comment
Wayne King on 5 Dec 2011

Hi Brian, yes, it makes perfect sense. The answer is what I explained in my previous response. And to be clear, the same thing happens to the autocorrelation sequence estimate when the 'biased', 'unbiased', or 'none' options are used. The autocorrelation decays in all instances as you would expect since fewer and fewer terms enter the sum.

Answer 2

Brian on 5 Dec 2011

https://www.mathworks.com/matlabcentral/answers/23118-why-xcorr-coef-is-used-by-correlation-coefficients#answer_30362

Hi Wayne,

Thanks for the quick response. I understand the purpose of normalizing the covariance. My question is why the lagged autocorrelation of a sine function does not consistently range between 1 and -1 as the lag is increased, despite the fact that the signal would be perfectly overlapped at regular intervals.

This is illustrated by test1, which uses the 'coef' option.

    t=0:500;
    n=length(t);
    ts=5*sin(2*pi*(t./12));
    lags=-250:250;
    test1=xcorr(ts,250,'coef');
    test2=xcorr(ts,250,'unbiased')./var(ts);

    figure; plot(t,ts); xlabel('time'); ylabel('amplitude');
    figure; plot(lags,test2,'r'); hold on; plot(lags,test1);

So at a lag of 120, the autocorrelation should be 1 again, but the 'coef' option has the autocorrelation at 0.76. I understand how MATLAB calculated this, but shouldn't MATLAB be using an unbiased calculation, so that the correlation is 1 rather than 0.76? Again, thank you for your quick reply!

Brian

Answer 3

Wayne King on 5 Dec 2011

https://www.mathworks.com/matlabcentral/answers/23118-why-xcorr-coef-is-used-by-correlation-coefficients#answer_30359

Hi Brian, 'coeff' is helpful because it gives you a convenient scale to interpret the results. It's the same reason why correlation in statistics is often more useful than covariance.

If I tell you that the maximum cross-correlation between two vectors is 4500, for example, it's hard to interpret what that means. That might mean that the two vectors are nearly perfectly correlated at that lag, or it might mean that their correlation is pretty small (near zero). That's because it all depends on the units of the input vectors. The 'coeff' option, however, makes it easier to interpret. If I tell you that the maximum correlation is 0.9, then you know there is a very strong correlation at a given lag.

To keep it in the sine wave context, note:

    x = cos(pi/4*(0:99));
    y = 4*cos(pi/4*(0:99)-pi/2);
    [c,lags] = xcorr(y,x,10);
    stem(lags,c);

Note the maximum correlation, at lag 2, is about 200. Again, very hard to know exactly what that means without knowing more about the signals.

But:

    x = cos(pi/4*(0:99));
    y = 4*cos(pi/4*(0:99)-pi/2);
    [c,lags] = xcorr(y,x,10,'coeff');
    stem(lags,c);

Now you see exactly what it means. The two signals are basically perfectly correlated.
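
As a rough check of the normalization discussed in the question, the sketch below compares the 'coeff' output with the raw cross-correlation divided by norm(x)*norm(y). It assumes the Signal Processing Toolbox is available for xcorr; the names cRaw, cCoeff, cManual, and maxDiff are illustrative only.

```matlab
% Same signals as in the example above
x = cos(pi/4*(0:99));
y = 4*cos(pi/4*(0:99)-pi/2);

[cRaw,lags] = xcorr(y,x,10);           % unnormalized cross-correlation
cCoeff      = xcorr(y,x,10,'coeff');   % normalized by xcorr itself

% Manual normalization: one fixed scale factor built from the
% full-length norms of both inputs, applied to every lag
cManual = cRaw/(norm(x)*norm(y));

maxDiff    = max(abs(cManual - cCoeff));  % expected to be at round-off level
[peak,idx] = max(cCoeff);
fprintf('max difference = %g, peak = %.2f at lag %d\n', maxDiff, peak, lags(idx));
```

The point of the comparison is that the scale factor is the same for every lag, so 'coeff' only rescales the raw sequence and does not change its shape.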

Answer 4

Wayne King on 5 Dec 2011

https://www.mathworks.com/matlabcentral/answers/23118-why-xcorr-coef-is-used-by-correlation-coefficients#answer_30370

Hi Brian, that is because you have fewer and fewer terms that enter the autocorrelation sum as the lag increases. The normalization in the denominator is based on all the data in the sequence, as is the autocorrelation at zero lag. That is not the case as you increase the lag.

That's why with:

[c,lags] = xcorr(ts);
stem(lags,c);

You see the autocorrelation decay. You don't use a different normalization term at different lags, which you would have to do to get 1s or -1s at all your periods as you suggest.
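
To make the difference concrete, here is a minimal sketch of the kind of lag-dependent normalization described above, using the sine wave from the question. This is not what xcorr does: the loop and the names cOverlap, a, and b are hypothetical, and the sketch assumes a real-valued signal and the Signal Processing Toolbox for xcorr.

```matlab
t  = 0:500;
ts = 5*sin(2*pi*(t./12));
N  = numel(ts);
maxLag = 250;

% 'coeff': every lag is divided by the same full-series normalization
cCoeff = xcorr(ts,maxLag,'coeff');
cCoeff = cCoeff(maxLag+1:end);         % keep lags 0..maxLag

% Hypothetical per-lag normalization: at each lag, divide by the norms
% of only the two overlapping segments that enter the sum
cOverlap = zeros(1,maxLag+1);
for k = 0:maxLag
    a = ts(1:N-k);                     % leading segment
    b = ts(1+k:N);                     % segment shifted by k samples
    cOverlap(k+1) = sum(a.*b)/(norm(a)*norm(b));
end

% Lag 120 is ten full periods of the sine wave (index 121 = lag 120)
fprintf('lag 120: coeff = %.2f, overlap-normalized = %.2f\n', ...
        cCoeff(121), cOverlap(121));
```

With the fixed normalization the value at lag 120 shrinks because only 381 of the 501 samples overlap, while the per-lag version returns to 1 at the cost of using a different denominator at every lag.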

Answer 5

Brian on 5 Dec 2011

https://www.mathworks.com/matlabcentral/answers/23118-why-xcorr-coef-is-used-by-correlation-coefficients#answer_30381

Hi Wayne,

Thank you for your time and consideration! I appreciate your clarification on these questions!

Brian Dz
