Code covered by the BSD License  

3.0 | 2 ratings | 11 Downloads (last 30 days) | File Size: 4.54 KB | File ID: #15434

MyFisher22

by

 

26 Jun 2007 (Updated )

A very compact routine for Fisher's exact test on a 2x2 matrix, with power and sample size calculation


File Information
Description

Fisher's exact test for 2x2 contingency tables allows exact probabilities to be calculated in situations where, because of small cell frequencies, the much faster normal-approximation and chi-square calculations are liable to be inaccurate.
Fisher's exact test requires computing several factorials to obtain the probability of the observed table and of each more extreme table. Factorials grow quickly, so it is necessary to work with their logarithms. This computation is very easy in MATLAB because x! = gamma(x+1) and log(x!) = gammaln(x+1).
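As an illustration of the log-factorial idea, here is a minimal sketch (not the code of MyFisher22 itself) that computes the probability of a single 2x2 table with fixed margins. The variable names r, c, N, logP and P are chosen for this example only.

```matlab
% Probability of one 2x2 table with fixed margins, via log-factorials.
X = [70 30; 29 80];            % example table from the description
r = sum(X, 2);                 % row totals (2x1)
c = sum(X, 1);                 % column totals (1x2)
N = sum(X(:));                 % grand total

% log P = sum(log(margin!)) - log(N!) - sum(log(cell!))
logP = sum(gammaln(r + 1)) + sum(gammaln(c + 1)) ...
     - gammaln(N + 1) - sum(gammaln(X(:) + 1));
P = exp(logP);                 % back to the probability scale

% The naive route fails here: factorial(N) overflows to Inf for N = 209.
naiveOverflows = isinf(factorial(N));
```

Working on the log scale keeps every intermediate value small, so the only exponentiation happens once, at the end.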
I rewrote this function several times. The current version is fully vectorized, preallocates its arrays, uses a recursive relationship between the probabilities of successive 2x2 tables, and works with logarithms; together these changes greatly speed up execution.
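One way such a recursive relationship can look is sketched below. With the margins fixed, moving from a table with cell (1,1) equal to a to the next one with a+1 multiplies the probability by b*c/((a+1)*(d+1)), so only the first table needs gammaln. This is an assumption about the general technique, not a copy of the submitted code, and all variable names are illustrative.

```matlab
% Enumerate all tables with the margins of X and get their probabilities
% from the ratio P(a+1)/P(a) = (b*c)/((a+1)*(d+1)).
X  = [70 30; 29 80];
r1 = sum(X(1,:)); r2 = sum(X(2,:));
c1 = sum(X(:,1)); N  = r1 + r2;

a = max(0, c1 - r2):min(r1, c1);   % feasible values of cell (1,1)
b = r1 - a;                        % cell (1,2)
c = c1 - a;                        % cell (2,1)
d = r2 - c;                        % cell (2,2)

% Log-probability of the first table only, via gammaln.
logP0 = gammaln(r1+1) + gammaln(r2+1) + gammaln(c1+1) + gammaln(N-c1+1) ...
      - gammaln(N+1) - gammaln(a(1)+1) - gammaln(b(1)+1) ...
      - gammaln(c(1)+1) - gammaln(d(1)+1);

% Ratios between consecutive tables; a(2:end) is a+1 and d(2:end) is d+1.
ratio = (b(1:end-1) .* c(1:end-1)) ./ (a(2:end) .* d(2:end));

% Accumulate on the log scale, then exponentiate once.
p = exp(logP0 + [0, cumsum(log(ratio))]);
```

For this X the vector p has 100 entries, matching the 100 tables mentioned in the description, and its elements sum to 1 up to rounding.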
It is faster than the previously submitted Fisherextest. To check this, I compared the core of both scripts (after deleting the input error checks, the code that displays results, and the power computation) on X=[70 30; 29 80] (100 tables to evaluate):
times=zeros(1,1000); for I=1:1000, tic; myfisher22(X); times(I)=toc; end, median(times)

ans =
1.3000e-4

The same for Fisherextest
ans =
0.0024

So my function is about 18.5-fold faster.

The function now also computes the mid-P correction, which makes the test less conservative.
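To make the mid-P idea concrete, the sketch below uses a small hypothetical table and one common definition: the mid-P value counts only half of the observed table's probability. Other two-sided conventions exist, so this is not necessarily the exact rule implemented in MyFisher22; the names pRightMid and pTwoMid are invented for the example.

```matlab
% Mid-P correction on a small hypothetical table.
X  = [3 1; 1 3];
r1 = sum(X(1,:)); r2 = sum(X(2,:));
c1 = sum(X(:,1)); c2 = sum(X(:,2)); N = r1 + r2;

a = max(0, c1 - r2):min(r1, c1);   % feasible values of cell (1,1)
b = r1 - a; c = c1 - a; d = r2 - c;

% Probabilities of all tables with these margins, on the log scale first.
logp = gammaln(r1+1) + gammaln(r2+1) + gammaln(c1+1) + gammaln(c2+1) ...
     - gammaln(N+1) - gammaln(a+1) - gammaln(b+1) - gammaln(c+1) - gammaln(d+1);
p = exp(logp);

k = find(a == X(1,1));             % position of the observed table

% One-sided (right tail) p-value and its mid-P version.
pRight    = sum(p(k:end));
pRightMid = pRight - 0.5 * p(k);

% Two-sided p-value (tables no more probable than the observed one)
% and a mid-P version that removes half of the observed probability.
pTwo    = sum(p(p <= p(k) * (1 + 1e-7)));   % small tolerance for rounding
pTwoMid = pTwo - 0.5 * p(k);
```

For this table the table probabilities are [1 16 36 16 1]/70, so pTwo is 34/70 and pTwoMid is 26/70.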
Moreover, the routine computes the power and, if necessary, the sample sizes needed to achieve a power of 0.80, using a modified asymptotic normal method with continuity correction as described by Hardeo Sahai and Anwer Khurshid in Statistics in Medicine, 1996, Vol. 15, Issue 1: 1-21.
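For orientation only, the sketch below shows a widely used continuity-corrected normal-approximation sample size for comparing two proportions with equal group sizes (the Casagrande-Pike-Smith form). It is an assumption that this resembles the method cited above; the formula actually used in MyFisher22 may differ. The proportions p1 and p2 and all variable names are hypothetical.

```matlab
% Continuity-corrected sample size per group for two proportions
% (equal group sizes, two-sided test). Illustrative values only.
p1 = 0.70; p2 = 0.27;          % hypothetical true proportions
alpha = 0.05; power = 0.80;

% Normal quantiles without the Statistics Toolbox:
% norminv(q) = -sqrt(2) * erfcinv(2*q)
zA = -sqrt(2) * erfcinv(2 * (1 - alpha/2));
zB = -sqrt(2) * erfcinv(2 * power);

pbar  = (p1 + p2) / 2;
delta = abs(p1 - p2);

% Uncorrected asymptotic sample size per group.
n0 = (zA * sqrt(2 * pbar * (1 - pbar)) ...
    + zB * sqrt(p1 * (1 - p1) + p2 * (1 - p2)))^2 / delta^2;

% Continuity correction, rounded up to a whole subject.
n = ceil(n0 / 4 * (1 + sqrt(1 + 4 / (n0 * delta)))^2);
```

The corrected size n is always larger than n0, which is the expected effect of a continuity correction.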

Further details on the code are available at: http://www.advancedmcode.org/myfisher22.html

You can visit my homepage http://home.tele2.it/cardillo
My profile on XING http://www.xing.com/go/invita/13675097
My profile on LinkedIn http://it.linkedin.com/in/giuseppecardillo

MATLAB release: MATLAB 7.7 (R2008b)
Comments and Ratings (5)
23 Oct 2009 Luigi Giaccari  
02 Jul 2007 Giuseppe Cardillo

The result is the same, but the computation is different. Using binomial coefficients makes it easy to extend the 2x2 case to a 2xC matrix (as in my function MyFisher23) or a 3x3 matrix. I do not think it is important to know whether MyFisher is better or worse than Fisherextest; what matters is whether it is good code, whether it could be improved, and whether it is useful for developing other code.

29 Jun 2007 J.H. McAnally

To be honest, I really do not see any new contribution compared with the previously submitted m-file Fisherextest.

28 Jun 2007 Giuseppe Cardillo

At first, after reading the review, I ran MyFisher and Fisherextest by Antonio Trujillo Ortiz on the matrix given in the review and found the first bug, so I uploaded a new file. Then, running both functions on several matrices, I saw that p-both was always the same but that p-left and p-right were sometimes swapped. I could not find this bug quickly, so I preferred to remove the file and study the articles again. Finally, I understood where the second bug was, fixed it, and submitted the file again (the same bug was in MyFisher23, so I fixed that too).
These were the two bugs:
1) The first one: a 2x2 matrix has only one degree of freedom. The vectors p and q represent all the possible tables for the given marginal totals. Binomial coefficients are computed from these two vectors and the column marginal totals, and the coefficients are then cross-multiplied (vector np). Previously the function did not correctly identify the observed table, so the p-value was completely wrong.
2) The second one: the more extreme tables have a direction! Two orderings are possible. The first: more extreme tables (in the same direction), observed table, less extreme tables, more extreme tables (in the opposite direction). The second: more extreme tables (in the opposite direction), less extreme tables, observed table, more extreme tables (in the same direction). I fixed this bug using a logical array (the p-array computed after the ob variable).
That's all.
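The two points above can be sketched as follows on a small hypothetical table. This is only an illustration of the idea (binomial coefficients, locating the observed table, and tails taken in a fixed direction), not the actual code of MyFisher22; the name ob is borrowed from the comment above, but its definition here is a guess.

```matlab
% Tables from binomial coefficients, then left, right and two-sided tails.
X  = [3 1; 1 3];
r1 = sum(X(1,:)); r2 = sum(X(2,:));
c1 = sum(X(:,1)); N  = r1 + r2;

a = max(0, c1 - r2):min(r1, c1);   % one degree of freedom: cell (1,1)

% Cross-multiplied binomial coefficients, normalised by C(N, c1).
w = arrayfun(@(k) nchoosek(r1, k) * nchoosek(r2, c1 - k), a);
p = w / nchoosek(N, c1);

ob = (a == X(1,1));                % logical index of the observed table

% Tails are defined by the value of a, not by position in a sorted list,
% so left and right cannot be swapped.
pLeft  = sum(p(a <= X(1,1)));
pRight = sum(p(a >= X(1,1)));

% Two-sided: every table no more probable than the observed one.
pBoth  = sum(p(p <= p(ob) * (1 + 1e-7)));
```

Here pLeft is 69/70, pRight is 17/70 and pBoth is 34/70. For large counts nchoosek loses precision, which is one reason to prefer gammaln in practice.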

26 Jun 2007 L. Aris

I saw that you removed this file after it received a thorough review and a low rating, and then submitted it again. Well, what is the improvement in this version?

Updates
28 Aug 2007

Added Power and Sample sizes calculation

06 Jun 2008

Fully vectorized version using gammaln function

12 Jun 2008

NORMINV and NORMCDF were replaced by ERFCINV and ERFC, so the Statistics Toolbox is no longer needed.
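The replacement relies on standard identities between the normal distribution and the complementary error function. A minimal sketch, with illustrative variable names:

```matlab
% Standard normal quantile and CDF using only base MATLAB functions.
q = 0.975;
z = -sqrt(2) * erfcinv(2 * q);     % same value as norminv(q)

% Standard normal CDF evaluated at z; same value as normcdf(z).
qBack = 0.5 * erfc(-z / sqrt(2));  % should recover q
```

Here z is the familiar 1.96 quantile, and qBack returns to 0.975 up to rounding.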

12 Nov 2008

Changes in help section

25 Nov 2008

The function is now faster, using recursion to compute all p-values

25 Nov 2008

Small improvements in table enumeration

21 Oct 2009

Some changes in commentary lines. Changes in Description

22 Oct 2009

The bug in the np(1) computation was fixed. Another slight improvement in computation time (it is 1.0231-fold faster than my previous submission)

23 Oct 2009

The function now also computes the mid-P correction, which makes the test less conservative.

23 Dec 2009

Changes in description

22 Mar 2010

I added the option to plot the Wald statistics

23 Mar 2010

The MATLAB BAR function doesn't display the bars the way I want, so I wrote an ad hoc subroutine using the FILL function.

09 Jun 2010

Little improvement suggested by Dr Lee Baker
Centre for Oncology and Molecular Medicine
University of Dundee
