I ran a ranksum test on two vectors of 80 and 88 entries, both with a median of 0 and fairly similar in all other respects. I expected ranksum to report that the difference between the two vectors was not significant, but surprisingly it returned p < 0.05. I started experimenting to understand the output better and came across the following puzzling behavior of ranksum:
As I appended the same number of zeros (5, 10, 20, 50, ...) to the end of both vectors and reran the ranksum test, the p-value became smaller: the more zeros I added to both vectors, the smaller the p-value I got. This seemed strange to me, because adding identical entries to both vectors should make all the sample statistics converge, right? And the more similar the sample statistics, the more likely it is that the samples were drawn from the same distribution?
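To make the experiment concrete, here is a minimal sketch in Python of what I am doing. Two caveats: the vectors `x` and `y` below are made-up toy data (80 and 88 entries, both with median 0), not my real vectors, and `rank_sum_p` is a hand-written rank-sum test using the normal approximation with a tie correction and no continuity correction. It is an assumption on my part that this is close enough to what ranksum does, so the numbers will not match MATLAB exactly.

```python
import math


def average_ranks(values):
    """Return 1-based ranks (ties get the average rank) and the tie-group sizes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def rank_sum_p(x, y):
    """Two-sided rank-sum p-value (normal approximation, tie-corrected).

    Hypothetical helper written for this post; not MATLAB's ranksum.
    """
    n, m = len(x), len(y)
    total = n + m
    ranks, tie_sizes = average_ranks(list(x) + list(y))
    w = sum(ranks[:n])                      # rank sum of the first sample
    mean_w = n * (total + 1) / 2
    tie_term = sum(t**3 - t for t in tie_sizes) / (total * (total - 1))
    var_w = n * m / 12 * ((total + 1) - tie_term)
    if var_w == 0:                          # all values identical
        return 1.0
    z = (w - mean_w) / math.sqrt(var_w)
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))


# Toy data (assumption, not my real vectors): both samples have median 0.
x = [-1.0] * 25 + [0.0] * 20 + [0.5] * 35            # 80 entries
y = [-1.0] * 38 + [0.0] * 20 + [0.25] * 5 + [2.0] * 25  # 88 entries

for k in (0, 5, 10, 20, 50):
    # Append k zeros to BOTH vectors, then rerun the test.
    p = rank_sum_p(x + [0.0] * k, y + [0.0] * k)
    print(f"zeros added to each vector: {k:3d}   p = {p:.4f}")
```

On this toy data the p-value shrinks as `k` grows, which is the same trend I see with my real vectors (although the toy p-values themselves are not below 0.05).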
I have been reading quite a lot about the Wilcoxon rank-sum test but have not come across an explanation for this behavior. I'm not a statistician and I'm at my wits' end here. If anybody could tell me what I'm missing, it would be greatly appreciated!