Xcorr in a concrete situation: normalized/not (and how is it better?), biased/unbiased?

Question

Valeria on 30 Jun 2014

Edited: Star Strider on 30 Jun 2014

Hello, I have 2 data sets, one binned and the other not, with 9000 values each. Something like this:

a = [23 45 88 9 0 29 30 60 0 2 10 100 0 60 90]
b = [0 0 0 1 1 0 0 0 0 0 0 1 0 0 1]

Actually, "b" is the set of events (yes/no) that occur during a continuous change in the intensity of a "signal" (the signal has values in [0-100]). I performed the cross-correlation because I want to find out whether the occurrence of the events has something in common with the changes in the signal.

So I did:

maxlags = 300;
[c, lags] = xcorr(a-mean(a), b-mean(b), maxlags, 'coeff')

I also created a null hypothesis from 10 shuffles:

a_sh = a;
b_sh = b;
all_corr_rand_norm = zeros(10, 2*maxlags + 1);   % one row per shuffle
for k = 1:10
    % Permute both series to destroy any temporal relation between them
    a_sh = a_sh(randperm(length(a_sh)));
    b_sh = b_sh(randperm(length(b_sh)));
    [c_sh, lags] = xcorr(a_sh - mean(a_sh), b_sh - mean(b_sh), maxlags, 'coeff');
    all_corr_rand_norm(k, :) = c_sh;
end
corr_rand = mean(all_corr_rand_norm);   % average over the 10 shuffles

- and also calculated the t-score values.

So in the end I got this:

http://postimg.org/image/m5rddi5vh/

(Sorry, I can't upload this image directly: while editing, I somehow exceeded today's upload limit.)

At first I tried to do the correlation with just xcorr(a, b, maxlags, 'coeff'), without subtracting the mean, but in that case it gives me a very strange null hypothesis (it is in red; the real correlation is in blue):

or, with bigger maxlags:

- it's clearly a triangle. I did some research on the internet and even found some similar questions, but the answers were that this is because of the "zero-padding", and that's why we should subtract the mean before the correlation.
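A minimal sketch of where such a triangle can come from, using synthetic stand-ins rather than the real data (the random series, the seed and the 20% event rate are assumptions made only for illustration). Without mean removal, each lag m sums N-|m| products whose average is roughly mean(a)*mean(b), so the raw result shrinks linearly with |m|:

```matlab
% Synthetic stand-ins for the real data (assumed, not the original series)
rng(0);                          % fixed seed so the sketch is repeatable
N = 9000;
a = 100 * rand(1, N);            % "signal" with values in [0, 100]
b = double(rand(1, N) < 0.2);    % sparse 0/1 "events" (assumed rate)

% No mean removal, no normalization, full lag range
[c_raw, lags] = xcorr(a, b);

% Expected shape for two unrelated series with non-zero means:
% (number of overlapping samples) * mean(a) * mean(b), i.e. a triangle in the lag
tri = (N - abs(lags)) * mean(a) * mean(b);

% Relative deviation of the raw correlation from the triangle at lag 0
rel_err = abs(c_raw(lags == 0) - tri(lags == 0)) / tri(lags == 0);

% plot(lags, c_raw, lags, tri)   % uncomment to compare the two curves
```

With maxlags much smaller than N only the flat-looking tip of the triangle is visible, which would be consistent with the shape becoming obvious only for bigger maxlags.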

Here is my 1st question: why do people in some cases get a nice null hypothesis even with xcorr(a,b)? What does it depend on?

The second question is whether it's right in my case to subtract the mean. Basically, that already corresponds to Matlab's xcov function, and this is already the covariance... (I also have a colleague who suggested that I do the xcorr and subtract the mean afterwards: [c, lags] = xcorr(a, b, maxlags, 'coeff'); x_crr = c - mean(c); but I don't really see why I should do that and, besides, it still gives me the same strange triangle-shaped null hypothesis.)
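The correspondence with xcov mentioned here can be checked numerically. The following is only a sketch on the short sample vectors from the top of the post; maxlags = 5 is an assumed value chosen because those vectors have just 15 elements:

```matlab
a = [23 45 88 9 0 29 30 60 0 2 10 100 0 60 90];
b = [0 0 0 1 1 0 0 0 0 0 0 1 0 0 1];
maxlags = 5;   % assumed small value, only because the sample vectors are short

% Cross-correlation of the mean-removed series
c_manual = xcorr(a - mean(a), b - mean(b), maxlags, 'coeff');

% Cross-covariance of the original series
c_xcov = xcov(a, b, maxlags, 'coeff');

% Largest difference between the two results
max_diff = max(abs(c_manual - c_xcov));
```

If max_diff is at the level of rounding error, the two calls are doing the same computation on this data.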

The 3rd question is: are the cross-covariance and the cross-correlation so different in my case? Yes, I've already read everything I could find about both, but phrases like "the covariance shows how 2 functions vary together, while the cross-correlation shows their correlation" didn't help much. If something varies with a specific lag before/after the other function, isn't that correlation?

Also, while searching for an explanation of my triangular null hypothesis, I saw somebody's advice to use the "unbiased" cross-correlation, while others strongly advised against it. So the 4th question is: when do we actually use the unbiased or the biased xcorr? Again I found mostly "great" descriptions saying that the "biased cross-correlation is biased, while unbiased is not". There were some better answers, of course, but I still don't get it.
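For reference, the two options differ only in the divisor applied to the raw sums: 'biased' divides every lag by N, while 'unbiased' divides lag m by N-|m|, the number of samples that actually overlap at that lag. A small sketch on the sample vectors from the top of the post (used here only as illustration data):

```matlab
a = [23 45 88 9 0 29 30 60 0 2 10 100 0 60 90];
b = [0 0 0 1 1 0 0 0 0 0 0 1 0 0 1];
N = numel(a);

a0 = a - mean(a);
b0 = b - mean(b);

[c_raw, lags] = xcorr(a0, b0);          % raw sums, no scaling
c_biased      = xcorr(a0, b0, 'biased');
c_unbiased    = xcorr(a0, b0, 'unbiased');

% Reproduce both scalings by hand from the raw sums
err_biased   = max(abs(c_biased   - c_raw / N));
err_unbiased = max(abs(c_unbiased - c_raw ./ (N - abs(lags))));
```

At the largest lags the 'unbiased' divisor gets close to 1, so those values rest on very few overlapping samples.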

And the last question: in my case, as I have a t-significant but weak correlation between the two sets of values, would it be correct to say that the fluctuations of the two data sets are almost independent?

I perfectly understand that this is a mix of common and probably easy questions about cross-correlation, but the explanations I've found on the internet usually sound dubious, or are not really explanations but just quick statements (like "the unbiased means it's unbiased"), and I feel unsure about interpreting my findings.

Thank you in advance.