MATLAB Answers


Why is the p-value 0?

Asked by M03
on 11 Nov 2018
Latest activity Answered by Jeff Miller on 11 Nov 2018
My task is to generate random data that are correlated, with a p-value of about 5%. I wrote the following code, but my p-value still comes out as 0.0000. Does anyone know what the mistake is?
x1 = randn(1, 100); %first dataset of rnd numbers
rnd_gen = normrnd(.5,.5,[1,100]); %random set of numbers
noise = x1 .* rnd_gen %slight noise which should break r=1 between x1 and x2
x2 = x1 + noise %second dataset slightly changed
[R,P] = corrcoef(x1,x2) %defining p value
scatter(x1, x2)

  1 Comment

What value do you expect?



1 Answer

Answer by Jeff Miller on 11 Nov 2018

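There are a few problems. In your code, x2 = x1 + x1 .* rnd_gen is just x1 scaled by a factor that averages 1.5, so x2 is still very strongly correlated with x1 and the p-value is far too small to show at four decimal places. The biggest problem, though, is that the p-value depends on the observed correlation, and this varies randomly from sample to sample. To get a p-value of 0.05 with a sample of 100, you need an observed correlation of about r = 0.197 (the two-tailed .05 critical value from a table of critical r's).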
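If you do not have a table of critical r's handy, the value can be reproduced from the t distribution. This is only a sketch: it assumes a two-tailed test with n - 2 degrees of freedom and uses tinv from the Statistics Toolbox (the question already relies on that toolbox for normrnd). The variable names are mine.

```matlab
n     = 100;                          % sample size
alpha = 0.05;                         % two-tailed significance level
df    = n - 2;                        % degrees of freedom for the test of r
tcrit = tinv(1 - alpha/2, df);        % critical t value (Statistics Toolbox)
rcrit = tcrit / sqrt(tcrit^2 + df);   % convert critical t to critical r
fprintf('critical r for n = %d: %.3f\n', n, rcrit);
```

For n = 100 this comes out at about 0.197, matching the tabled value.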
Now it may feel like you should be able to generate random data that give you nearly this target r = 0.197 most of the time, but that is an illusion. Even if you induce a population correlation of exactly 0.197 by adding noise, the sample correlation for n = 100 still scatters around that value with a standard deviation of roughly 0.1, so most samples will land well away from the target.
One option is to generate random x1 and x2 repeatedly, check the correlation each time, and stop when it is as close to r=0.197 as you need (or the p is as close to .05 as you need, which is the same thing).
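A minimal sketch of this first option, assuming independent standard-normal samples and a tolerance of 0.005 around p = 0.05. The tolerance, the cap on the number of tries, and the variable names are my own choices for illustration, not anything from the question.

```matlab
n        = 100;      % sample size
target   = 0.05;     % desired p-value
tol      = 0.005;    % assumed tolerance: accept 0.045 < p < 0.055
maxTries = 1e5;      % assumed safety cap so the loop always terminates
found    = false;

for k = 1:maxTries
    x1 = randn(1, n);              % fresh independent samples on every try
    x2 = randn(1, n);
    [R, P] = corrcoef(x1, x2);     % R and P are 2-by-2; use the off-diagonal entry
    if abs(P(1,2) - target) < tol
        found = true;
        break
    end
end

if found
    fprintf('try %d: r = %.4f, p = %.4f\n', k, R(1,2), P(1,2));
else
    warning('No sample within tolerance after %d tries.', maxTries);
end
```

With independent data roughly one try in a hundred falls inside this tolerance, so the loop usually finishes quickly. Note that the accepted r can be negative as well as positive, since the p-value is two-tailed.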
The other option is to generate x1 and x2 randomly and then transform them to give the correlation of r=0.197 that you really want. Have a look at this thread for information about how to do that.
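One way to do such a transformation is sketched below. This is not necessarily the method from the linked thread. It removes from x2 the part that is linearly related to x1, then mixes the standardized x1 back in with weight r, which forces the sample correlation to equal r exactly. Column vectors and all variable names are my own assumptions.

```matlab
n  = 100;
r  = 0.197;                          % target sample correlation
x1 = randn(n, 1);                    % column vectors
x2 = randn(n, 1);

z1  = (x1 - mean(x1)) / std(x1);     % standardized x1
x2c = x2 - mean(x2);                 % centered x2
e   = x2c - z1 * (z1' * x2c) / (z1' * z1);   % part of x2 uncorrelated with x1
ze  = e / std(e);                    % standardized residual

x2new = r * z1 + sqrt(1 - r^2) * ze; % sample correlation with x1 is exactly r
[R, P] = corrcoef(x1, x2new);
fprintf('r = %.4f, p = %.4f\n', R(1,2), P(1,2));
```

Here x1 is left unchanged and only x2 is replaced by x2new. The resulting p-value is determined entirely by r and n, so it is the same on every run, whatever random numbers were drawn.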

