File Exchange


MyFisher22

version 1.17 (4.54 KB) by

A very compact routine for Fisher's exact test on a 2x2 matrix, with power and sample size calculation

3.33333
3 Ratings

6 Downloads


Fisher's exact test of 2x2 contingency tables permits the calculation of exact probabilities in situations where, as a consequence of small cell frequencies, the much faster normal approximation and chi-square calculations are liable to be inaccurate.
Fisher's exact test involves computing several factorials to obtain the probability of the observed table and of each of the more extreme tables. Factorials grow quickly, so it is necessary to use logarithms of factorials. This computation is very easy in MATLAB because x! = gamma(x+1) and log(x!) = gammaln(x+1).
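As a minimal sketch of the log-factorial idea (not the code of myfisher22 itself; the helper names logfact and tableProb are hypothetical), the probability of a single 2x2 table with fixed margins can be computed like this:

```matlab
% Hypothetical helpers, for illustration only
logfact = @(n) gammaln(n + 1);              % log(n!) without overflow

% Hypergeometric probability of a 2x2 table T with fixed margins:
% P = (row totals)! (column totals)! / (N! * product of cell factorials)
tableProb = @(T) exp(sum(logfact(sum(T, 1))) + sum(logfact(sum(T, 2))) ...
                     - logfact(sum(T(:))) - sum(logfact(T(:))));

X = [70 30; 29 80];                         % table used in the timing test
Pobs = tableProb(X);                        % probability of the observed table

% Plain factorials overflow double precision for a table of this size
overflowed = isinf(factorial(sum(X(:))));   % N = 209

% Small check against a value that is easy to verify by hand: 16/70
Psmall = tableProb([3 1; 1 3]);
```

Working in logarithms keeps every intermediate value small, and only the final result is exponentiated.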
I rewrote this function several times. Full vectorization, preallocation, a recursive relationship between the probabilities of successive 2x2 tables, and the use of logarithms now greatly speed up execution.
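One way such a recursive relationship can be written is sketched below. This is an assumption about the general technique, not a copy of the routine: for fixed margins, moving from top-left cell a to a+1 multiplies the table probability by (b*c)/((a+1)*(d+1)), so only the first table needs log-factorials and the rest follow from a cumulative sum of logs. All variable names are illustrative.

```matlab
X = [70 30; 29 80];
r1 = sum(X(1,:)); r2 = sum(X(2,:));        % row totals
c1 = sum(X(:,1)); N = r1 + r2;             % first column total, grand total

% All admissible values of the top-left cell for these margins
a = (max(0, c1 - r2):min(r1, c1))';
b = r1 - a; c = c1 - a; d = r2 - c;

% Log-probability of the first table, from log-factorials
logP0 = gammaln(r1+1) + gammaln(r2+1) + gammaln(c1+1) + gammaln(N-c1+1) ...
      - gammaln(N+1) - gammaln(a(1)+1) - gammaln(b(1)+1) ...
      - gammaln(c(1)+1) - gammaln(d(1)+1);

% Recurrence P(a+1) = P(a) * (b*c) / ((a+1)*(d+1)), accumulated in logs
steps = log(b(1:end-1)) + log(c(1:end-1)) ...
      - log(a(1:end-1) + 1) - log(d(1:end-1) + 1);
P = exp(logP0 + [0; cumsum(steps)]);       % probabilities of all tables

nTables = numel(P);                        % 100 tables for this matrix
```

The vector P is built without a loop, and its entries should sum to 1 up to rounding error.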
It is faster than the previously submitted Fisherextest. To show this, I compared the core of both scripts (with the input error checks, the result display code and the power computation removed) on X=[70 30; 29 80], which has 100 tables to evaluate:
times=zeros(1,1000); for I=1:1000, tic; myfisher22(X); times(I)=toc; end, median(times)

ans =
1.3000e-4

The same for Fisherextest
ans =
0.0024

So my function is about 18.5-fold faster.

The function now also computes the mid-P correction to make the test less conservative.
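A minimal sketch of a one-sided mid-P value follows. It assumes the common definition in which the observed table contributes only half of its probability; the description does not say which variant the routine implements. The table and the helper lognck are illustrative.

```matlab
X = [3 1; 1 3];                            % small illustrative table
r1 = sum(X(1,:)); r2 = sum(X(2,:));
c1 = sum(X(:,1)); N = r1 + r2;

% Enumerate all tables with these margins via log binomial coefficients
a = (max(0, c1 - r2):min(r1, c1))';
lognck = @(n, k) gammaln(n+1) - gammaln(k+1) - gammaln(n-k+1);
P = exp(lognck(r1, a) + lognck(r2, c1 - a) - lognck(N, c1));

ob = find(a == X(1,1));                    % index of the observed table
pRight = sum(P(ob:end));                   % ordinary one-sided p-value (17/70)
pRightMid = pRight - 0.5 * P(ob);          % mid-P: observed table counted half (9/70)
```

Because the observed table is counted only half, the mid-P value is always smaller than the ordinary exact p-value.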
Moreover, the routine computes the power and, if necessary, the sample sizes needed to achieve a power of 0.80, using a modified asymptotic normal method with continuity correction as described by Hardeo Sahai and Anwer Khurshid in Statistics in Medicine, 1996, Vol. 15, Issue 1: 1-21.
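For orientation, here is a sketch of a widely used continuity-corrected normal-approximation sample size for comparing two proportions with equal group sizes. It is an assumption that this resembles the method cited above; the exact variant used by the routine may differ. The proportions, alpha and all variable names are hypothetical.

```matlab
p1 = 0.6; p2 = 0.4;                        % hypothetical true proportions
alpha = 0.05; targetPower = 0.80;          % two-sided alpha, desired power

% Standard normal quantile without the Statistics Toolbox
stdnorminv = @(p) -sqrt(2) * erfcinv(2 * p);
zA = stdnorminv(1 - alpha/2);
zB = stdnorminv(targetPower);

pbar = (p1 + p2) / 2;
delta = abs(p1 - p2);

% Uncorrected per-group size from the asymptotic normal method
nUncorrected = (zA * sqrt(2*pbar*(1 - pbar)) ...
              + zB * sqrt(p1*(1 - p1) + p2*(1 - p2)))^2 / delta^2;

% Continuity correction applied on top of the uncorrected size
nCorrected = nUncorrected / 4 * (1 + sqrt(1 + 4 / (nUncorrected * delta)))^2;
nPerGroup = ceil(nCorrected);
```

The correction always increases the required size, which matches the conservative behaviour of the exact test.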

Further details on the code are available at: http://www.advancedmcode.org/myfisher22.html

You can visit my homepage http://home.tele2.it/cardillo
My profile on XING http://www.xing.com/go/invita/13675097
My profile on LinkedIn http://it.linkedin.com/in/giuseppecardillo

Comments and Ratings (6)

Andrea Libri

Giuseppe Cardillo

The result is the same but the computation is different. Using binomial coefficients makes it easy to extend the 2x2 case to a 2xC matrix (as in my function MyFisher23) or to a 3x3 matrix. I do not think it is important to know whether MyFisher is better or worse than Fisherextest; I think it is important to know whether it is good code, whether it could be improved and whether it is useful for developing other code.

J.H. McAnally

To be honest, I really do not see any new contribution over the previously created m-file Fisherextest.

Giuseppe Cardillo

At first, after reading the review, I ran MyFisher and Fisherextest by Antonio Trujillo Ortiz on the matrix given in the review and found the first bug, so I uploaded a new file. Then, running both functions on several matrices, I saw that p-both was always the same but p-left and p-right were sometimes swapped. I could not find this bug quickly, so I preferred to remove the file and study the articles again. Finally, I understood where the second bug was, fixed it and submitted the file again (the same bug was in MyFisher23, so I fixed it there too).
These were the two bugs:
1) The first one: a 2x2 matrix has only one degree of freedom. The vectors p and q represent all the possible tables for the given marginal totals. Binomial coefficients are computed from these two vectors and the column marginal totals, and the coefficients are then cross-multiplied (vector np). Previously the function did not correctly identify the observed table, so the p-value was totally incorrect.
2) The second one: the more extreme tables have a direction! Two orderings are possible. The first: more extreme tables (in the same direction), observed table, less extreme tables, more extreme tables (in the opposite direction). The second: more extreme tables (in the opposite direction), less extreme tables, observed table, more extreme tables (in the same direction). I fixed this bug using a logical array (the p-array computed after the ob variable).
That's all.
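The role of direction described in this comment can be illustrated with a small sketch. It is not the actual fix in the file: the table, the helper lognck and the tolerance are assumptions, and only the name ob is borrowed from the comment. The one-sided tails run from the observed table towards either end of the enumeration, while the two-sided value uses a logical array to pick every table that is no more probable than the observed one, wherever it lies.

```matlab
X = [8 2; 1 5];                            % illustrative table, not from the thread
r1 = sum(X(1,:)); r2 = sum(X(2,:));
c1 = sum(X(:,1)); N = r1 + r2;

% Enumerate every table with these margins, ordered by the top-left cell
a = (max(0, c1 - r2):min(r1, c1))';
lognck = @(n, k) gammaln(n+1) - gammaln(k+1) - gammaln(n-k+1);
P = exp(lognck(r1, a) + lognck(r2, c1 - a) - lognck(N, c1));

ob = find(a == X(1,1));                    % position of the observed table
pLeft = sum(P(1:ob));                      % tail towards smaller top-left cells
pRight = sum(P(ob:end));                   % tail towards larger top-left cells

% Logical array: tables at least as extreme as the observed one, in either
% direction (small relative tolerance guards against rounding ties)
extreme = P <= P(ob) * (1 + 1e-7);
pBoth = sum(P(extreme));
```

Here the observed table sits near the upper end, so pRight is the small tail, and pBoth additionally picks up the single improbable table at the opposite end.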

L. Aris

I saw that you removed this file after it received an in-depth review and a low rating, and then submitted it again. Well, what is the improvement in this one?

Updates

1.17

Small improvement suggested by Dr Lee Baker
Centre for Oncology and Molecular Medicine
University of Dundee

1.10

The MATLAB BAR function doesn't display the bars the way I want, so I set up an ad hoc subroutine using the FILL function.

1.9

I added the option to draw the Wald statistics plot.

1.8

Changes in description

1.7

The function now also computes the mid-P correction to make the test less conservative.

1.6

A bug in the np(1) computation was fixed. Another slight improvement in computation time (it is 1.0231-fold faster than my previous submission).

1.5

Some changes in commentary lines. Changes in Description

1.4

Small improvements in table enumeration

1.3

The function is now sped up by using recursion to compute all p-values

1.1

Changes in help section

NORMINV and NORMCDF were replaced by ERFCINV and ERFC, so the Statistics Toolbox is no longer needed.
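The replacement relies on standard identities between the normal distribution and the complementary error function. A minimal sketch, with hypothetical helper names stdnormcdf and stdnorminv:

```matlab
% Standard normal CDF and its inverse using only base functions
stdnormcdf = @(x) 0.5 * erfc(-x ./ sqrt(2));     % stands in for NORMCDF(x)
stdnorminv = @(p) -sqrt(2) * erfcinv(2 * p);     % stands in for NORMINV(p)

z975 = stdnorminv(0.975);                        % about 1.96
roundTrip = stdnormcdf(z975);                    % back to 0.975
half = stdnormcdf(0);                            % 0.5 by symmetry
```

These cover only the standard normal case (mean 0, standard deviation 1); other parameters would need an explicit shift and scale.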

Fully vectorized version using gammaln function

Added Power and Sample sizes calculation

MATLAB Release
MATLAB 7.7 (R2008b)
