Fisher's exact test is a statistical test used to determine if there are nonrandom associations between two categorical variables.
Fisher's exact test for 2 x 2 or 2 x 3 contingency tables is already well covered by other submissions. However, I could not find one that handles a general n x m contingency table, or the ones I found compute it poorly.
This function performs Fisher's exact test efficiently on an n x m contingency table.
Usage:
[ Sig,PValue,ContigenMatrix ] = FisherExactTest( XVector,YVector )
Input: the data of the two variables X and Y, given as XVector and YVector.
Alternatively, you can input the contingency table directly.
Output: "Sig" returns 1 if X and Y are associated, otherwise 0
"PValue" returns the computed p-value
"ContigenMatrix" returns the n x m contingency table
Please start with "Controlcenter.m", which contains two simple examples. The source file of the algorithm includes a clear description of how it works.
If you find anything wrong or have a specific need, please let me know and I will help you as soon as possible :-)
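As a rough illustration of what the "ContigenMatrix" output described above contains, the sketch below builds an n x m contingency table from two vectors with built-in MATLAB functions. It is not the code from this submission, and the sample data are hypothetical.

```matlab
% Hypothetical sample data: two categorical variables observed on 8 items.
XVector = [1 1 2 2 3 3 3 1];
YVector = [1 2 1 2 2 2 1 1];

% Map each distinct category to a row index (X) or a column index (Y).
[xLevels, ~, xIdx] = unique(XVector(:));
[yLevels, ~, yIdx] = unique(YVector(:));

% Count how many observations fall into each (row, column) cell.
ContigenMatrix = accumarray([xIdx, yIdx], 1, [numel(xLevels), numel(yLevels)]);

disp(ContigenMatrix)
```

Each row corresponds to one distinct value of X and each column to one distinct value of Y, so the row and column sums are the margins that the exact test holds fixed.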
Please address the p-value issue as questioned in earlier posts, otherwise this test is useless.
As has been addressed before, this script does not calculate the p-value. Please see Mehta and Patel (1986) for improving your code.
You have not addressed the comment from Mike above, which is crucial. If Mike is correct, which to me appears to be the case, your function does not compute the Pvalue, but the Pcutoff (equation (2) in the mathworld link). Please address this issue as this is not a minor distinction at all.
You are the best, Guangdi!
Thanks for the report, Karin.
The infinite-loop problem is solved, and a new version based on MATLAB-R is provided (for those who are familiar with the Fisher test in R).
Hello Guangdi, there seems to be a bug in this function: if you try it with, for example, these two vectors, the function enters an infinite loop:
X = [1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,1,1,2,1,2,1,2,1,1,1,1,1,1,1,2,2,1,1,1,2,1,1,2,1,1,1,2,2,2,1,1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1];
Y = [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,1,2,2,2,2,1,2,2,2,1,2,2,1,1,2,2,2,2,1,2,1,2,1,2,2,1,1,2,2,2,2,2,1,2,2,2,2,1,2,2,1,1,2,2,2,1,2,2,1];
I think you only implemented equation (2) from the link you posted. If you read what's written on that page after equation (2), you will find this:
"Now find all possible matrices of nonnegative integers consistent with the row and column sums Ri and Cj. For each one, calculate the associated conditional probability using (2), where the sum of these probabilities must be 1.
To compute the P-value of the test, the tables must then be ordered by some criterion that measures dependence, and those tables that represent equal or greater deviation from independence than the observed table are the ones whose probabilities are added together."
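The procedure quoted above can be sketched by brute force as follows. This is only a minimal illustration, not the code from this submission: the function names are hypothetical, the ordering criterion is assumed to be the table probability itself (tables no more probable than the observed one are summed, which is one common choice), and full enumeration is practical only for small tables.

```matlab
function p = exactPValueSketch(T)
% Brute-force p-value for an r x c table T of nonnegative integer counts.
rowSums = sum(T, 2);              % column vector of row totals
colSums = sum(T, 1);              % row vector of column totals
logPObs = logTableProb(T);        % log-probability of the observed table
p = sumOverTables(zeros(size(T)), 1, rowSums, colSums, logPObs);
end

function p = sumOverTables(A, k, rowSums, colSums, logPObs)
% Recursively fill the upper-left (r-1) x (c-1) block, from cell k onward.
[r, c] = size(A);
p = 0;
if k > (r - 1) * (c - 1)
    % The last column and the last row are forced by the margins.
    A(1:r-1, c) = rowSums(1:r-1) - sum(A(1:r-1, 1:c-1), 2);
    A(r, 1:c-1) = colSums(1:c-1) - sum(A(1:r-1, 1:c-1), 1);
    A(r, c) = rowSums(r) - sum(A(r, 1:c-1));
    if A(r, c) >= 0
        logP = logTableProb(A);
        % Keep tables no more probable than the observed one
        % (the small tolerance guards against floating-point ties).
        if logP <= logPObs + 1e-7
            p = exp(logP);
        end
    end
    return
end
[row, col] = ind2sub([r - 1, c - 1], k);
% A free cell cannot exceed what is left in its row or in its column.
maxVal = min(rowSums(row) - sum(A(row, 1:col-1)), ...
             colSums(col) - sum(A(1:row-1, col)));
for v = 0:maxVal
    A(row, col) = v;
    p = p + sumOverTables(A, k + 1, rowSums, colSums, logPObs);
end
end

function logP = logTableProb(A)
% Log of equation (2): prod(Ri!) * prod(Cj!) / (N! * prod(aij!)).
logFact = @(n) gammaln(n + 1);
logP = sum(logFact(sum(A, 2))) + sum(logFact(sum(A, 1))) ...
     - logFact(sum(A(:))) - sum(logFact(A(:)));
end
```

For example, exactPValueSketch([3 1; 1 3]) sums the probabilities 1/70, 16/70, 16/70 and 1/70 and returns 34/70, about 0.4857, whereas equation (2) alone gives 16/70 for the observed table.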
I only programmed this based on the MathWorld website. Indeed, there may be a faster and more efficient method for n x m tables; if so, please let me know the reference so that I can improve the code. Thanks for the comments.
This doesn't correctly compute a Fisher exact test p-value. It only computes the probability of getting exactly the observed contingency table. To compute a p-value, one must compute the probability of obtaining the observed results *OR SOMETHING MORE EXTREME*. In this case, one has to sum the probabilities of the observed table and of all possible "more extreme" contingency tables; this sum is the p-value of the Fisher exact test.
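A small worked example may make the distinction concrete. The sketch below uses a hypothetical 2 x 2 table and only built-in functions; the two-sided rule shown (summing every table no more probable than the observed one) is an assumption, since other conventions exist.

```matlab
% Hypothetical 2 x 2 table (rows: group, columns: outcome).
T  = [3 1; 1 3];
R1 = sum(T(1, :));   % first row total
C1 = sum(T(:, 1));   % first column total
N  = sum(T(:));      % grand total

% With fixed margins, the top-left count a determines the whole table.
a = max(0, R1 + C1 - N):min(R1, C1);

% Hypergeometric probability of each possible table.
prob = arrayfun(@(k) nchoosek(C1, k) * nchoosek(N - C1, R1 - k), a) / nchoosek(N, R1);

pObserved = prob(a == T(1, 1));                         % observed table only
pOneSided = sum(prob(a >= T(1, 1)));                    % observed or a larger top-left count
pTwoSided = sum(prob(prob <= pObserved * (1 + 1e-7)));  % tables no more probable than observed

fprintf('P(observed table) = %.4f\n', pObserved);   % 0.2286
fprintf('one-sided p-value = %.4f\n', pOneSided);   % 0.2429
fprintf('two-sided p-value = %.4f\n', pTwoSided);   % 0.4857
```

The first number is the probability of the single observed table; the other two are sums over several tables, which is what a p-value requires.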
Thanks to Dubuis for the scrutiny. Indeed, if Sig=1, then the variables in XVector and YVector are significantly associated. Concerning Nick's question about the help section, I will learn how to do it soon. As for the second question about the two-sided p-value, I am sorry, but I wrote the code from what I learned on the Wolfram website. If you can show me the equations for the two-sided test, I would be very happy to improve the code. Thanks for your comments.
The help section of the function code should be prepared properly to show the usage (including inputs and outputs) when you type "help FisherExactTest" at the MATLAB prompt.
It should not be necessary to inspect the code, or the code in a secondary function (i.e., ControlCentor.m [sic]).
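As a sketch of what such a help section could look like, the comment block below sits directly under the function line, which is where MATLAB's help command reads it from. The wording of the inputs and outputs is taken from the submission's description; the layout is only a suggestion, and the function body is omitted.

```matlab
function [Sig, PValue, ContigenMatrix] = FisherExactTest(XVector, YVector)
%FISHEREXACTTEST Fisher's exact test for an n x m contingency table.
%   [Sig, PValue, ContigenMatrix] = FisherExactTest(XVector, YVector)
%
%   Inputs:
%     XVector, YVector - data of the two categorical variables X and Y
%
%   Outputs:
%     Sig            - 1 if X and Y are associated, otherwise 0
%     PValue         - the computed p-value
%     ContigenMatrix - the n x m contingency table

% (function body omitted in this sketch)
end
```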
Secondly, the functionality seems correct but somewhat incomplete.
When I use the example from the Wolfram website, it gives the same result as the website and the other two Fisher test functions on FileExchange (myfisher and Fisherextest). However, it only gives a one-sided p-value, not both tails, which are provided by those other two functions.
I think there is just a mistake in the comments:
% If Sig=1, then variables in XVector and YVector are independent.
% If Sig=0, then variables in XVector and YVector are dependent.
Shouldn't it be the opposite, since a significant p-value indicates (as Wolfram MathWorld puts it) a "statistically significant association"?
The function works very well, and the example provided was really helpful!
Fixed the infinite-loop problem; a new version based on R is also provided.
Corrected the description.
Made the function simpler than the last version.