Why is the p-value 0?

M03 on 11 Nov 2018
Answered: Jeff Miller on 11 Nov 2018
Hi!
My task is to generate correlated random data whose correlation has a p-value of about 5%. I wrote the following code, but my p-value is still 0.0000. Does anyone know what the mistake is?
x1 = randn(1, 100);                 % first dataset: 100 standard normal values
rnd_gen = normrnd(.5,.5,[1,100]);   % random factors with mean 0.5 and standard deviation 0.5
noise = x1 .* rnd_gen;              % noise intended to break r = 1 between x1 and x2
x2 = x1 + noise;                    % second dataset: x1 plus the noise
[R,P] = corrcoef(x1,x2)             % R holds the correlations, P the matching p-values
scatter(x1, x2)

Answers (1)

Jeff Miller on 11 Nov 2018
There are a few problems. The biggest one is that the p value depends on the observed correlation value, and this will vary randomly from sample to sample. To get a p value of 0.05 with a sample of 100, you need to get an observed correlation of about r=0.197 (the .05 critical r value from a table of critical r's).
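For anyone who wants to check the critical value rather than look it up in a table, here is a minimal sketch. It assumes the two-tailed test that corrcoef reports and uses tinv from the Statistics and Machine Learning Toolbox (the same toolbox that provides normrnd in the question). The variable names are illustrative only.

```matlab
n     = 100;                         % sample size from the question
alpha = 0.05;                        % target two-tailed p-value
df    = n - 2;                       % degrees of freedom for the test of r

% Two-tailed critical t value, converted to the equivalent correlation.
tCrit = tinv(1 - alpha/2, df);
rCrit = tCrit / sqrt(tCrit^2 + df);

fprintf('Critical r for n = %d: %.3f\n', n, rCrit);
```

This should print a value of about 0.197, matching the tabled critical r quoted above.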
Now it may feel like you should be able to generate random data to give you nearly this target r=0.197 most of the time, but that is an illusion. Actually, you will get r=0.197 more often by generating totally independent x1 and x2 than by generating them with any correlation that you try to induce by adding some noise.
One option is to generate random x1 and x2 repeatedly, check the correlation each time, and stop when it is as close to r=0.197 as you need (or the p is as close to .05 as you need, which is the same thing).
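A rough sketch of this first option might look as follows. The tolerance tol and the cap maxTries are assumptions chosen for illustration, not values from the question. Note that this version checks only the p-value, so it accepts negative correlations too; add a test on the sign of R(1,2) if you need a positive r.

```matlab
n        = 100;      % sample size
targetP  = 0.05;     % p-value to aim for
tol      = 0.005;    % assumed tolerance around the target p-value
maxTries = 1e5;      % assumed safety cap on the number of attempts
found    = false;

for k = 1:maxTries
    x1 = randn(1, n);                % independent standard normal samples
    x2 = randn(1, n);
    [R, P] = corrcoef(x1, x2);       % P(1,2) is the p-value for R(1,2)
    if abs(P(1,2) - targetP) < tol
        found = true;                % close enough to the target: stop
        break
    end
end

if found
    fprintf('Try %d: r = %.3f, p = %.4f\n', k, R(1,2), P(1,2));
end
```

With independent data the p-value is roughly uniform on [0, 1], so with this tolerance about one try in a hundred should be accepted.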
The other option is to generate x1 and x2 randomly and then transform them to give the correlation of r=0.197 that you really want. Have a look at this thread for information about how to do that: https://au.mathworks.com/matlabcentral/answers/231480-how-to-generate-random-numbers-correlated-to-a-given-dataset-in-matlab
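As a hedged illustration of this second option, here is one standard construction. It is not necessarily the method described in the linked thread. It removes from a second random vector everything that is linearly related to x1, then mixes the two pieces so that the sample correlation equals the target exactly. The names rTarget, z, e, u1 and u2 are illustrative only.

```matlab
n       = 100;
rTarget = 0.197;                     % desired sample correlation

x1 = randn(n, 1);                    % first dataset
z  = randn(n, 1);                    % independent raw material for x2

x1c = x1 - mean(x1);                 % centre both vectors
zc  = z  - mean(z);

% Remove the part of zc that is linearly related to x1c, so that e is
% exactly uncorrelated with x1 in this sample.
e = zc - x1c * (x1c' * zc) / (x1c' * x1c);

u1 = x1c / norm(x1c);                % unit-length versions
u2 = e   / norm(e);

% Mix the two orthogonal pieces to hit the target correlation.
x2 = rTarget * u1 + sqrt(1 - rTarget^2) * u2;

[R, P] = corrcoef(x1, x2);
fprintf('r = %.4f, p = %.4f\n', R(1,2), P(1,2));
```

Because the correlation is fixed by construction, the p-value comes out just under 0.05 on every run. x2 can be rescaled or shifted afterwards without changing r.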
HTH.
