Xcorr in a concrete situation: normalized/not (and how is it better?), biased/unbiased?
Hello, I have 2 data sets, one is binned and the other is not, 9000 values each. Something like this:
a = [23 45 88 9 0 29 30 60 0 2 10 100 0 60 90]
b = [0 0 0 1 1 0 0 0 0 0 0 1 0 0 1]
Actually, "b" is the set of events (yes/no) which occur during a continuous change in the intensity of a "signal" (the signal has values in [0-100]). I performed the cross-correlation because I want to find out whether the occurrence of the events has something in common with the changes of the signal.
So I did:
maxlags = 300;
[c, lags] = xcorr(a-mean(a), b-mean(b), maxlags, 'coeff')
I also created a null hypothesis from 10 shuffles:
a_sh = a;
b_sh = b;
all_corr_rand_norm = zeros(10, 2*maxlags + 1);  % one row per shuffle
for k = 1:10
    a_sh1 = a_sh(randperm(length(a_sh)));  % shuffle the signal
    a_sh = a_sh1;
    b_sh1 = b_sh(randperm(length(b_sh)));  % shuffle the events
    b_sh = b_sh1;
    [c_sh, lags] = xcorr(a_sh - mean(a_sh), b_sh - mean(b_sh), maxlags, 'coeff');
    all_corr_rand_norm(k, :) = c_sh;
end
corr_rand = mean(all_corr_rand_norm);  % average over the 10 shuffles
- and also calculated the t-score values.
So finally I've got such a thing:
(Sorry, I can't upload this image directly: while editing, I somehow exceeded the upload limit for today.)
At first I tried to do the correlation just with xcorr(a, b, maxlags, 'coeff'), without subtracting the mean, but in this case it gives me a very strange null hypothesis (it's in red; the real correlation is in blue):
or, with bigger maxlags:
- it's clearly a triangle. I did some research on the internet and even found some similar questions, but the answers were that this is because of the "zero-padding" and that this is why we should subtract the mean before the correlation.
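To make that zero-padding explanation concrete, here is a minimal sketch on made-up data (x, y and their parameters are my own assumptions, not my real data sets). With the means left in, every lag sums N-|lag| overlapping products that are each roughly mean(x)*mean(y), so the curve should simply follow the shrinking overlap. It needs xcorr from the Signal Processing Toolbox.

```matlab
% Sketch: raw xcorr of non-zero-mean data follows a triangle.
% x and y are hypothetical stand-ins for the signal and the events.
rng(0);                          % reproducible made-up data
N = 9000;
x = 50 + 10*randn(1, N);         % "signal" with a large non-zero mean
y = double(rand(1, N) < 0.2);    % sparse 0/1 "events", mean about 0.2
maxlags = 3000;

[c_raw, lags] = xcorr(x, y, maxlags, 'coeff');              % means left in
c_dm = xcorr(x - mean(x), y - mean(y), maxlags, 'coeff');   % means removed

% Triangular taper produced by the shrinking overlap of the two sequences
tri = (N - abs(lags)) / N;

% Rescale the raw curve to 1 at lag 0 so it can be compared with the taper
c_raw_scaled = c_raw / c_raw(lags == 0);

plot(lags, c_raw_scaled, lags, tri, '--', lags, c_dm);
legend('raw xcorr (rescaled)', '(N-|lag|)/N', 'means removed');
xlabel('lag');
```

In this sketch the rescaled raw curve sits on top of the dashed triangle, while the mean-removed curve stays flat around zero, so the size of the triangle seems to depend on how far the means are from zero.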
Here is my 1st question: why do people in some cases get a nice null hypothesis even with xcorr(a, b), and what does it depend on?
The second question is whether it is right in my case to subtract the mean: basically, that already corresponds to the xcov function of MATLAB, and this is already the covariance... (I also have a colleague who suggested that I do the xcorr first and subtract the mean afterwards: [c, lags] = xcorr(a, b, maxlags, 'coeff'); x_crr = c - mean(c); but I don't really see why I should do that and, besides, it still gives me the same strange triangle-shaped null hypothesis.)
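A small check of the two variants, using the short example vectors from above (maxlags = 5 is chosen only because the example is so short). As far as I understand the MATLAB documentation, xcov is the cross-correlation of the mean-removed sequences, so the first gap should be at rounding level, while subtracting mean(c) afterwards only shifts the raw curve by a constant. Requires the Signal Processing Toolbox.

```matlab
% Sketch: xcorr on mean-removed data vs xcov vs "subtract mean(c) afterwards"
a = [23 45 88 9 0 29 30 60 0 2 10 100 0 60 90];
b = [0 0 0 1 1 0 0 0 0 0 0 1 0 0 1];
maxlags = 5;   % small value, only for this short example

% Variant 1: remove the means first, then correlate
c_demeaned = xcorr(a - mean(a), b - mean(b), maxlags, 'coeff');

% Variant 2: xcov, which is documented to remove the means itself
c_xcov = xcov(a, b, maxlags, 'coeff');
gap_xcov = max(abs(c_demeaned - c_xcov));

% Variant 3: correlate the raw data, then subtract the mean of the result
c_raw = xcorr(a, b, maxlags, 'coeff');
c_after = c_raw - mean(c_raw);
gap_after = max(abs(c_demeaned - c_after));

fprintf('xcorr(demeaned) vs xcov:         %g\n', gap_xcov);
fprintf('xcorr(demeaned) vs c - mean(c):  %g\n', gap_after);
```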
The 3rd question is: are the cross-covariance and the cross-correlation so different in my case? Yes, I've already read everything I could find about both of them, but phrases like "the covariance shows how two functions vary together, while the cross-correlation shows their correlation" didn't help much. If something varies with a specific lag before/after the other function, isn't that correlation???
Also, while searching for the explanation of my triangular null hypothesis, I saw somebody's advice to use the "unbiased" cross-correlation, while others strongly advised against it. So the 4th question is: when do we actually use the unbiased/biased xcorr???! Because, again, I've mostly found "great" descriptions such as "the biased cross-correlation is biased, while the unbiased one is not". There were some better answers, of course, but I still don't get it.
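For reference, this is how I read the two options in the xcorr documentation: 'biased' divides the raw sum at every lag by N, while 'unbiased' divides it by N-|lag|. A minimal sketch on made-up noise (x, y, N and maxlags are arbitrary assumptions) that checks this reading; it needs the Signal Processing Toolbox.

```matlab
% Sketch: the scaling behind 'biased' and 'unbiased' in xcorr
rng(1);                 % reproducible made-up data
N = 200;
x = randn(1, N);        % hypothetical zero-mean test sequences
y = randn(1, N);
maxlags = 150;

[c_none, lags] = xcorr(x, y, maxlags, 'none');   % raw sums, no scaling
c_biased   = xcorr(x, y, maxlags, 'biased');
c_unbiased = xcorr(x, y, maxlags, 'unbiased');

% 'biased': same divisor N at every lag
err_biased = max(abs(c_biased - c_none / N));

% 'unbiased': divisor is the number of overlapping samples, N - |lag|
err_unbiased = max(abs(c_unbiased - c_none ./ (N - abs(lags))));

fprintf('biased check:   %g\n', err_biased);
fprintf('unbiased check: %g\n', err_unbiased);
```

Because N-|lag| gets small at large lags, the 'unbiased' estimate there is an average over few products, which may be why some people warn against it when maxlags is close to the data length.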
And the last question: in my case, where I have a t-significant but weak correlation between the two sets of values, would it be correct to say that the fluctuations of the two data sets are almost independent?
I perfectly understand that this is a mix of common and probably easy questions about cross-correlation, but the explanations that I've found on the internet usually sound dubious or are not really explanations, just quick statements (like "the unbiased means it's unbiased"), and I feel unsure about interpreting my findings.
Thank you in advance.