Code covered by the BSD License  

Highlights from
Fisherextest

from Fisherextest by Antonio Trujillo-Ortiz
Fisher's Exact Probability Test.

[Ppos,Pneg,Pboth]=Fisherextest(a,b,c,d)
function [Ppos,Pneg,Pboth]=Fisherextest(a,b,c,d)
%[Ppos,Pneg,Pboth]=Fisherextest(a,b,c,d)- Fisher's Exact Probability Test.
%   FISHEREXTEST performs the Fisher exact probability test for a table of 
%   frequency data cross-classified according to two categorical variables, 
%   each of which has two levels or subcategories (2x2). It is a non-parametric
%   statistical test used to determine if there are nonrandom associations 
%   between the two categorical variables. Fisher's exact test is used to 
%   calculate an exact P-value with small number of expected frequencies, for
%   which the Chi-square test is not appropriate (in case the total number of
%   observations is less than 20 or there is a cell-value less than 5). This
%   test is based upon the hypergeometric probability. The test was proposed in
%   the 1934 edition of the famous Ronald Aylmer Fisher's book 'Statistical 
%   Methods for Research Workers'.
%
%   So, according to the next 2x2 table design,
%
%                    Var.1
%                --------------
%                  a       b      r1=a+b
%         Var.2
%                  c       d      r2=c+d
%                --------------
%                c1=a+c  c2=b+d  n=c1+c2
%
%   The Fisher's exact test it is conditioned to all the a+b values (all the 
%   possible cell combinations that would still result in the marginal frequencies
%   as highlighted) such that,
%
%          H(a+b) = {X: X  H and a+b = r1}.
%
%   Then,
%
%          P(a,b|a+b,n) = P(X|X  H(r1)) = b(a;c1,p)*b(b;c2,p)/b(a+b;n,p).
%
%   This binomials relationship reduces each 2x2 table to the exact hypergeometric 
%   distribution to compute the P-value. Now considering only the binomial 
%   coefficients,
%
%          P(X|X  H(r1)) = C(c1,a)*C(c2,b)/C(n,a+b)
%
%   Thereby Fisher's exact P-values are readily evaluated as,
%
%          P = SUM(P(X|X  H(r1))).        
%
%   IMPORTANT: Due that Matlab could not work for factorials greater than 170. We
%   use the function sum([log(x+1)......]), in order to avoid further calculation 
%   problems.
%
%   Syntax: function [Ppos,Pneg,Pboth]=Fisherextest(a,b,c,d) 
%
%   $$-If user does not put any output arguments, file returns a P-value table.-$$
%      
%   Inputs:
%         a,b,c,d - observed frequency cells
%
%   Output:
%         Three p-values [decide to use Left, Right or 2-Tail before
%         collecting (or looking at) the data]: 
%           - Right tail (Ppos): Use this when the alternative to independence is that there is
%             positive association between the variables. That is, the observations 
%             tend to lie in upper left and lower right. 
%           - Left tail (Pneg): Use this when the alternative to independence is that there is 
%             negative association between the variables. That is, the observations
%             tend to lie in lower left and upper right. 
%           - 2-Tail (Pboth): Use this when there is no prior alternative. 
%
%   Example: From the example given on the Ina Parks S. Howell's internet homepage 
%            (http://www.fiu.edu/~howellip/Fisher.pdf). Suppose Crane and Egret are
%            two very small collages, the results of the beginning physics course at
%            each of the two schools are given in the follow table.
%
%                                   Physics
%                              Pass         Fail
%                            ---------------------
%                      Crane     8           14
%            Collage                 
%                      Egret     1            3
%                            ---------------------
%                                       
%   Calling on Matlab the function: 
%             Fisherextest(8,14,1,3)
%
%   Answer is:
%
%   Fisher's exact test P-values for cells:
%   a = 8, b = 14, c = 1, d = 3
%   ----------------------------------------------
%                      P-value  
%   ----------------------------------------------
%     Left tail       Right tail         2-tail   
%     (negative)      (positive)         (both)  
%   ----------------------------------------------
%   0.5685618729     0.8408026756     1.0000000000
%   ----------------------------------------------
%
%  $-We thank Dr. Stan Letovsky for his valuable suggestions to improve this m-file.-$
%
%  Created by A. Trujillo-Ortiz, R. Hernandez-Walls and A. Castro-Perez
%             Facultad de Ciencias Marinas
%             Universidad Autonoma de Baja California
%             Apdo. Postal 453
%             Ensenada, Baja California
%             Mexico.
%             atrujo@uabc.mx
%             And the special collaboration of the post-graduate students of the 2004:2
%             Multivariate Statistics Course: Laura Rodriguez-Cardozo, Norma Alicia Ramos-Delgado,
%             and Rene Garcia-Sanchez. 
%  Copyright (C) September 27, 2004.
%
%  --We are deeply grateful to Tim Davies from AMBION for their valuable comments on this m file.
%  
%  To cite this file, this would be an appropriate format:
%  Trujillo-Ortiz, A., R. Hernandez-Walls, A. Castro-Perez, L. Rodriguez-Cardozo 
%    N.A. Ramos-Delgado and R. Garcia-Sanchez. (2004). Fisherextest:Fisher's Exact
%    Probability Test. A MATLAB file. [WWW document]. URL http://www.mathworks.com/
%    matlabcentral/fileexchange/loadFile.do?objectId=5957
%
%  References:
% 
%  Agresti, A. (1992), A Survey of Exact Inference for Contegency Tables. 
%           Statistical Science,7:131-153.
%  Fisher, R.A. (1934), Statistical Methods for Research Workers. Chapter 12. 
%           5th Ed., Scotland:Oliver & Boyd.
%  Howell, I.P.S. (Internet homepage), http://www.fiu.edu/~howellip/Fisher.pdf
%  Zar, J.H. (1999), Biostatistical Analysis (2nd ed.). NJ: Prentice-Hall,
%           Englewood Cliffs. p. 543-555. 
%

if nargin < 4, 
   error('You need to input four arguments.');
   return,
end;

tt =  [a b;c d];
t = [a b;c d];
pp = t(:,1)./sum(t')';

if (pp(1)>pp(2))
    t =t([2,1],:);
end;

a = t(1,1);
b = t(1,2);
c = t(2,1);
d = t(2,2);

r1 = a+b;
r2 = c+d;
c1 = a+c;
c2 = b+d;
n = c1+c2;

ap = [0:min([r1 c1])]';
bp = r1-ap;
cp = c1-ap;
dp = r2-cp;
len = length(ap);
factor1 = sum([log(1:r1) log(1:r2) log(1:c1) log(1:c2) -log(1:n)]);

for k = 1:len
   factor2 = sum([log(1:ap(k)) log(1:bp(k)) log(1:cp(k)) log(1:dp(k))]);
   Pap(k) = factor1-factor2;
end;

Pap = exp(Pap);
Pap =round(Pap./0.00000000001)*0.00000000001;
P1 = sum(Pap(find(ap <= a)));
if P1 > 1;
   P1 = 1;
else
   P1;
end;

P2 = sum(Pap(find(ap >= a)));
if P2 > 1;
   P2 = 1;
else
   P2;
end;

factor2 = sum([log(1:a) log(1:b) log(1:c) log(1:d)]);

Pa = exp(factor1-factor2);
Pa = round(Pa/0.00000000001)*0.00000000001;
P3 = sum(Pap(find(Pap <= Pa)));
if P3 > 1;
   P3 = 1;
else
   P3;
end;

a=tt(1,1);
b=tt(1,2);
c=tt(2,1);
d=tt(2,2);
    
if nargout ~= 0,
    disp(' ')
    disp('Fisher''s exact test P-values for cells:')
    disp(['a = ' num2str(a) ', b = ' num2str(b) ', c = ' num2str(c) ', d = ' num2str(d) '']);
    Ppos=P2;
    Pneg=P1;
    Pboth=P3;
else
    disp(' ')
    disp('Fisher''s exact test P-values for cells:')
    disp(['a = ' num2str(a) ', b = ' num2str(b) ', c = ' num2str(c) ', d = ' num2str(d) '']);
    fprintf('----------------------------------------------\n');
    disp('                   P-value  '); 
    fprintf('----------------------------------------------\n');
    disp('  Left tail       Right tail         2-tail  '); 
    disp('  (negative)      (positive)         (both)  '); 
    fprintf('----------------------------------------------\n');
    fprintf('%10.10f     %10.10f     %10.10f\n',[P1,P2,P3].');
    fprintf('----------------------------------------------\n');
end;
return;

Contact us at files@mathworks.com