Xcorr in a concrete situation: normalized/not (and how is it better?), biased/unbiased?

Hello, I have two data sets, one binned and the other not, with 9000 values each. Something like this:
a = [23 45 88 9 0 29 30 60 0 2 10 100 0 60 90]
b = [0 0 0 1 1 0 0 0 0 0 0 1 0 0 1]
Actually, "b" is a set of events (yes/no) that occur during a continuous change in the intensity of a "signal" (the signal takes values in [0, 100]). I computed the cross-correlation because I want to find out whether the occurrence of the events is related to the changes in the signal.
So I did:
maxlags = 300;
[c, lags] = xcorr(a-mean(a), b-mean(b), maxlags, 'coeff')
And I also created a null hypothesis from 10 shuffles:
a_sh = a;
b_sh = b;
all_corr_rand_norm = [];
for k = 1:10
    % shuffle both series, destroying any temporal relation between them
    a_sh1 = a_sh(randperm(length(a_sh)));
    a_sh = a_sh1;
    b_sh1 = b_sh(randperm(length(b_sh)));
    b_sh = b_sh1;
    [c_sh, lags] = xcorr(a_sh-mean(a_sh), b_sh-mean(b_sh), maxlags, 'coeff');
    all_corr_rand_norm(k,:) = c_sh;   % one row per shuffle
end
corr_rand = mean(all_corr_rand_norm);  % average over the 10 shuffles, per lag
- and also calculated the t-score values.
So finally I got this:
(sorry, I can't upload the image directly: while editing, I somehow exceeded today's upload limit)
At first I tried to compute the correlation with just xcorr(a, b, maxlags, 'coeff'), without subtracting the mean, but in that case I get a very strange null hypothesis (shown in red; the real correlation is in blue):
or, with bigger maxlags:
- it's clearly a triangle. I did some research on the internet and even found some similar questions, but the answers said that this is caused by the zero-padding and that this is why the mean should be subtracted before the correlation.
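The zero-padding explanation can be checked with a minimal sketch. It assumes two made-up constant vectors (an offset and nothing else), which are not my real data: at lag m only N-|m| samples overlap, so the raw sum falls off linearly with |m|.

```matlab
% Hypothetical data: pure offsets, no fluctuations at all.
N = 9000;
maxlags = 300;
a = 50  * ones(1, N);   % assumed signal level (made up)
b = 0.2 * ones(1, N);   % assumed event rate (made up)

% Raw cross-correlation: samples shifted past the ends count as zeros,
% so each lag sums only the N-|m| overlapping products.
[c, lags] = xcorr(a, b, maxlags);

% Predicted shape: (mean of a) * (mean of b) * (N - |lag|), a triangle.
expected = 50 * 0.2 * (N - abs(lags));

% Largest deviation from the triangle, relative to its peak.
rel_dev = max(abs(c - expected)) / max(expected);
```

Here rel_dev should sit at floating-point level, so in this toy case the whole curve is the triangle produced by the means alone.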
Here is my 1st question: why do people in some cases get a nice null hypothesis even with plain xcorr(a,b)? What does it depend on?
The second question is whether it is right in my case to subtract the mean. Basically, this already corresponds to MATLAB's xcov function, so it is already the covariance... (I also have a colleague who suggested that I run xcorr first and subtract the mean afterwards: [c, lags] = xcorr(a, b, maxlags, 'coeff'); x_crr = c - mean(c); but I don't really see why I should do that and, besides, it still gives me the same strange triangle-shaped null hypothesis.)
The 3rd question is: are the cross-covariance and the cross-correlation really so different in my case? Yes, I've already read everything I could find about both, but phrases like "the covariance shows how 2 functions vary together, while the cross-correlation shows their correlation" didn't help much. If something varies with a specific lag before/after the other function, isn't that correlation?
Also, while searching for an explanation of my triangular null hypothesis, I saw somebody's advice to use the "unbiased" cross-correlation, while others strongly advised against it. So the 4th question is: when should we actually use the unbiased or the biased xcorr? Again, I mostly found "great" descriptions saying that the "biased cross-correlation is biased, while the unbiased one is not". There were some better answers, of course, but I still don't get it.
And the last question: in my case, since the relation between the two sets of values is t-significant but only weakly correlated, would it be correct to say that the fluctuations of the two data sets are almost independent?
I perfectly understand that this is a mix of common and probably easy questions about cross-correlation, but the explanations I've found on the internet usually sound dubious, or are not really explanations but just quick statements (like "unbiased means it's unbiased"), and I feel unsure about interpreting my findings.
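To make the two variants in this question concrete, here is a small sketch on random stand-in data (the vectors and the seed are assumptions, not my real data). It compares xcorr on mean-subtracted inputs with xcov, and shows that subtracting mean(c) afterwards only shifts the curve by a constant.

```matlab
rng(0);                              % fixed seed, only for repeatability
N = 9000;
maxlags = 300;
a = 100 * rand(1, N);                % made-up stand-in for the signal
b = double(rand(1, N) < 0.2);        % made-up stand-in for the events

% Variant 1: remove the means first, then correlate.
[c1, lags] = xcorr(a - mean(a), b - mean(b), maxlags, 'coeff');

% xcov removes the means itself before correlating.
c2 = xcov(a, b, maxlags, 'coeff');
diff_cov = max(abs(c1 - c2));        % expected to be at rounding level

% Variant 2: correlate first, subtract the mean of the result afterwards.
c_raw     = xcorr(a, b, maxlags, 'coeff');
c_shifted = c_raw - mean(c_raw);

% A constant shift leaves the lag-to-lag slope untouched,
% so any triangular trend in c_raw is still present in c_shifted.
slope_change = max(abs(diff(c_shifted) - diff(c_raw)));
```

If diff_cov comes out near zero, then variant 1 and xcov are the same computation, and the only real choice is between removing the means before the correlation or not at all.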
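For reference, the two options differ only in how the raw lag sums are scaled: 'biased' divides every lag by N, while 'unbiased' divides lag m by N-|m|, the number of overlapping samples. A minimal sketch on made-up random data (the vectors and the seed are assumptions) that checks this and shows how large the difference can get for my sizes:

```matlab
rng(1);                              % fixed seed, only for repeatability
N = 9000;
maxlags = 300;
a = 100 * rand(1, N);                % made-up stand-in for the signal
b = double(rand(1, N) < 0.2);        % made-up stand-in for the events
a0 = a - mean(a);
b0 = b - mean(b);

[c_raw, lags] = xcorr(a0, b0, maxlags, 'none');
c_biased      = xcorr(a0, b0, maxlags, 'biased');
c_unbiased    = xcorr(a0, b0, maxlags, 'unbiased');

% 'biased' uses the same divisor N at every lag.
err_biased = max(abs(c_biased - c_raw / N));

% 'unbiased' divides by the overlap length N-|m|, which shrinks with |m|.
err_unbiased = max(abs(c_unbiased - c_raw ./ (N - abs(lags))));

% Largest factor between the two estimates, reached at |m| = maxlags.
largest_ratio = N / (N - maxlags);   % 9000/8700, about 1.03
```

With 9000 samples and maxlags = 300 the two scalings differ by roughly 3% at most, so the choice should matter mainly when maxlags is a large fraction of N, where the unbiased version divides by few samples and gets noisy.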
Thank you in advance.
