Code covered by the BSD License  

from AnDarexptest by Antonio Trujillo-Ortiz
Anderson-Darling test for assessing exponential distribution of a sample data.

function AnDarexptest(x,alpha)
%ANDAREXPTEST Anderson-Darling test for assessing exponential distribution of
% sample data.
% The Anderson-Darling test (Anderson and Darling, 1952) is used to test if 
% a sample of data comes from a specific distribution. It is a modification
% of the Kolmogorov-Smirnov (K-S) test and gives more weight to the tails
% than the K-S test. The K-S test is distribution free in the sense that the
% critical values do not depend on the specific distribution being tested.
% The Anderson-Darling test makes use of the specific distribution in calculating
% critical values. This has the advantage of allowing a more sensitive test
% and the disadvantage that critical values must be calculated for each
% distribution.
% The Anderson-Darling test is only available for a few specific distributions.
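% The test statistic is calculated as:
%
%        AD2 = n*integral{[F_o(x)-F_t(x)]^2/[F_t(x)(1-F_t(x))]}dF_t(x)
%
%        AD2a = AD2*a
%
% where F_o is the empirical (observed) distribution function, F_t is the
% theoretical (hypothesized) distribution function and n is the sample size.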
%
% Note that for a given distribution, the Anderson-Darling statistic may be
% multiplied by a constant, a (which usually depends on the sample size, n).
% These constants are given in the various papers by Stephens (1974, 1977a,
% 1977b, 1979, 1986). The adjusted statistic, AD2a, is what should be compared
% against the critical values. Be aware that different constants (and
% therefore different critical values) have been published, so check which
% constant was used for a given set of critical values (the needed constant
% is typically given with the critical values).
% The critical values for the Anderson-Darling test are dependent on the 
% specific distribution that is being tested. Tabulated values and formulas
% have been published for a few specific distributions (normal, lognormal, 
% exponential, Weibull, logistic, extreme value type 1). The test is a one-sided
% test and the hypothesis that the distribution is of a specific form is 
% rejected if the test statistic, AD2a, is greater than the critical value. 
% Here we develop the m-file for detecting departure from an exponential
% distribution. The Anderson-Darling statistic is one of the most powerful
% statistics for this purpose. For null hypothesis testing, we provide the
% exact P-value formulation.
%
% Syntax: function AnDarexptest(x,alpha) 
%      
%     Input:
%          x - data vector
%      alpha - significance level (default = 0.05)
%
%     Output:
%            - Complete Anderson-Darling test for an exponential distribution
%
% Example: The data in the table below represent the days between homicides
% in Waco, Texas in 1989, as reported in Kittlitz (1999). (*Two homicides
% occurred on June 16 and were defined to be 12 hours apart.) It has been
% suggested that the data may follow an exponential distribution. We want to
% test whether the days between homicides follow an exponential distribution.
%    
%     -----------------------------------------------------------------------
%     Month/Date Days Between Month/Date Days Between Month/Date Days Between
%     -----------------------------------------------------------------------
%       Jan 20        -         Jun 16       9.25*      Sep 24        2
%       Feb 23       34         Jun 16       0.50*      Oct  1        7
%       Feb 25        2         Jun 22       5.25*      Oct  4        3
%       Mar  5        8         Jun 25          3       Oct  8        4
%       Mar 10        5         Jul  6         11       Oct 19       11
%       Apr  4       25         Jul  8          2       Nov  2       14
%       May  7       33         Jul  9          1       Nov 25       23
%       May 24       17         Jul 26         17       Dec 28       33
%       May 28        4         Sep  9         45       Dec 29        1
%       Jun  7       10         Sep 22         13
%     -----------------------------------------------------------------------
%
% The data vector is:
%  x=[34 2 8 5 25 33 17 4 10 9.25 0.5 5.25 3 11 2 1 17 45 13 2 7 3 4 11 14 ...
%  23 33 1];
%
% Calling the function in MATLAB:
%            AnDarexptest(x)
%
% The output is:
%
% Sample size: 28
% Anderson-Darling statistic: 0.2268
% Anderson-Darling adjusted statistic: 0.2293
% Probability associated to the Anderson-Darling statistic = 0.8929
% With a given significance = 0.050
% The sampled population has an exponential distribution.
% Thus, this sample have been drawn from an exponential population with parameter = 12.2500
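%
% The figures above can be checked by hand (a worked sketch that uses the
% constants appearing in the code below, not an additional result). With
% n = 28 and the rounded value AD2 = 0.2268:
%     AD2a = AD2*(1 + 0.3/n) = 0.2268*(1 + 0.3/28), approximately 0.229
%     P = 1.6193162*exp(-2.5964836*AD2a), approximately 0.893
% Because P is greater than alpha = 0.05, exponentiality is not rejected.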
%
% Created by A. Trujillo-Ortiz, R. Hernandez-Walls, K. Barba-Rojo, 
%             A. Castro-Perez and B.E. Lavaniegos-Espejo*
%             Facultad de Ciencias Marinas
%             Universidad Autonoma de Baja California
%             Apdo. Postal 453
%             *Centro de Investigacion Cientifica y de Educacion Superior
%             de Ensenada 
%             Ensenada, Baja California
%             Mexico.
%             atrujo@uabc.mx
%
% Copyright. July 13, 2007.
%
% To cite this file, this would be an appropriate format:
% Trujillo-Ortiz, A., R. Hernandez-Walls, K. Barba-Rojo, A. Castro-Perez and
% B.E. Lavaniegos-Espejo. (2007). AnDarexptest: Anderson-Darling test for assessing
% exponential distribution of a sample data. A MATLAB file. [WWW document]. 
% URL http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=15746
%
% References:
%  Anderson, T. W. and Darling, D. A. (1952), Asymptotic theory of certain 
%      'goodness-of-fit' criteria based on stochastic processes. Annals of
%      Mathematical Statistics, 23:193-212. 
%  Kittlitz, Jr., R. G. (1999), Transforming the Exponential for SPC 
%      Applications. Journal of Quality Technology 31:301-308.
%  Stephens, M. A. (1974), EDF Statistics for goodness of fit and some 
%      comparisons. Journal of the American Statistical Association, 
%      69:730-737.
%  Stephens, M. A. (1976), Asymptotic Results for goodness-of-fit statistics
%      with unknown parameters. Annals of Statistics, 4:357-369.
%  Stephens, M. A. (1977a), Goodness of fit for the extreme value distribution.
%      Biometrika, 64:583-588.
%  Stephens, M. A. (1977b), Goodness of fit with special reference to tests
%      for exponentiality. Technical Report No. 262, Department of Statistics,
%      Stanford University, Stanford, CA.
%  Stephens, M. A. (1979), Tests of fit for the logistic distribution based
%      on the empirical distribution function. Biometrika, 66:591-595.
%  Stephens, M. A. (1986), Tests based on EDF statistics. In: D'Agostino,
%      R.B. and Stephens, M.A., eds.: Goodness-of-Fit Techniques. Marcel 
%      Dekker, New York. 
%

%Check the inputs before using them
if nargin < 1
   error('Requires at least one input argument.');
end

if nargin < 2
   alpha = 0.05;  %default significance level
end

if (alpha <= 0 || alpha >= 1)
   fprintf('Warning: significance level must be between 0 and 1\n');
   return;
end

n = length(x);

if n < 7,
    disp('Sample size must be at least 7.');
    return,
else
    x = x(:);
    x = sort(x);
    p = mean(x);  %Exponential parameter estimation
    fx = 1 - exp(-x./p);

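    i = (1:n)';  %column index, matching the orientation of fx
    
    %Weighted sum over the ordered sample: the i-th term pairs the i-th
    %smallest value with the i-th largest value
    S = sum(((2*i-1)/n).*(log(fx)+log(1-fx(n+1-i))));
    AD2 = -n-S;  %Anderson-Darling statistic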
    
    AD2a = AD2*(1 + 0.3/n);  %correction factor for small sample sizes (adjusted) 
    
    %P-value (observed significance level probability)
    a = 1.6193162; b = 2.5964836;
    P = a*exp(-b*AD2a);
end

disp(' ')
fprintf('Sample size: %i\n', n);
fprintf('Anderson-Darling statistic: %3.4f\n', AD2);
fprintf('Anderson-Darling adjusted statistic: %3.4f\n', AD2a);
fprintf('Probability associated to the Anderson-Darling statistic = %3.4f\n', P);
fprintf('With a given significance = %3.3f\n', alpha);
if P >= alpha;
   disp('The sampled population has an exponential distribution.');
   fprintf('Thus, this sample have been drawn from an exponential population with parameter = %6.4f\n',p);
else
   disp('The sampled population does not have an exponential distribution.');
end

return,
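
%--------------------------------------------------------------------------
% Illustrative addition, not part of the original submission. The main
% function above only prints its results. The local function below is a
% minimal sketch of how the same quantities could be returned as values,
% assuming the same parameter estimate, small-sample correction and P-value
% constants used above. The name andarexpstats and its output list are
% hypothetical. As a local function it can only be called from within this
% file; saved in its own file, andarexpstats.m, it could be called as
%     [AD2,AD2a,P,p] = andarexpstats(x);
% Input checking (sample size, positive data) is deliberately left out.
%--------------------------------------------------------------------------
function [AD2,AD2a,P,p] = andarexpstats(x)
x = sort(x(:));                 %sorted column vector
n = length(x);
p = mean(x);                    %estimate of the exponential parameter
fx = 1 - exp(-x./p);            %fitted exponential CDF at the ordered data
i = (1:n)';
S = sum(((2*i-1)/n).*(log(fx)+log(1-fx(n+1-i))));
AD2 = -n-S;                     %Anderson-Darling statistic
AD2a = AD2*(1 + 0.3/n);         %adjusted statistic, as in the main function
P = 1.6193162*exp(-2.5964836*AD2a);  %P-value formula of the main function
return,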
