Code covered by the BSD License  

Rating: 4.5 (2 ratings) | Downloads (last 30 days): 9 | File Size: 7.02 KB | File ID: #17451

AnDarksamtest

by Antonio Trujillo-Ortiz

 

07 Nov 2007 (Updated 26 Dec 2007)

Anderson-Darling k-sample procedure to test whether k sampled populations are identical.


File Information
Description

Anderson and Darling (1952, 1954) introduced a goodness-of-fit statistic to test the hypothesis that a random sample comes from a continuous population with a specified distribution function. It is a modification of the Kolmogorov-Smirnov (K-S) test and gives more weight to the tails than the K-S test.
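For reference, the two one-sample statistics can be written side by side. This is the standard textbook form, not taken from the submitted file; F_n denotes the empirical distribution function of a sample of size n and F the hypothesized continuous distribution function. The denominator F(x)[1 - F(x)] is small in both tails, which is where the extra tail weight comes from.

```latex
% Kolmogorov-Smirnov: largest absolute deviation, every x weighted equally
D_n = \sup_{x} \left| F_n(x) - F(x) \right|

% Anderson-Darling: squared deviation weighted by 1 / ( F(x) [1 - F(x)] )
A_n^2 = n \int_{-\infty}^{\infty}
        \frac{\left[ F_n(x) - F(x) \right]^2}{F(x)\,\left[ 1 - F(x) \right]} \, dF(x)
```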

The corresponding two-sample version was proposed by Darling (1957) and studied in detail by Pettitt (1976).

The Anderson-Darling k-sample test was introduced by Scholz and Stephens (1987) as a generalization of the two-sample Anderson-Darling test. It is a nonparametric statistical procedure, i.e., a rank test, and thus requires no assumptions other than that the samples are true independent random samples from their respective continuous populations (although provisions for tied observations are made). It tests the hypothesis that the populations from which two or more independent samples of data were drawn are identical. The test can be used to decide whether data from different sources may be combined: if the null hypothesis Ho of identical population distributions cannot be rejected, the samples are judged to come from one common distribution. Used the other way, to detect differences between populations, it can be seen as a generalization of a one-way ANOVA, for which the k-sample Kruskal-Wallis test (1952, 1953) is the most commonly used rank test.

It is an omnibus test because it is effective against all alternatives to the null hypothesis Ho (all k populations being identical). For example, it is effective for changes in scale while locations are matched, which is a weakness of the Kruskal-Wallis test.
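The Kruskal-Wallis weakness mentioned above can be seen on a small constructed data set. The sketch below is only an illustration: the data are made up, the variable names are ours, and it relies on kruskalwallis from the Statistics Toolbox. Both samples are symmetric about zero, so their mean ranks coincide even though one sample is ten times more spread out.

```matlab
% Two deterministic samples with the same location (both symmetric about 0)
% but very different scale. The values are illustrative, not real data.
x1 = (-19:2:19)';          % 20 odd integers from -19 to 19
x2 = 10 * x1;              % same shape, ten times the spread, no ties with x1

data  = [x1; x2];
group = [ones(20,1); 2*ones(20,1)];

% Kruskal-Wallis compares mean ranks only; 'off' suppresses the figures.
p_kw = kruskalwallis(data, group, 'off');

% The same data in the n-by-2 layout (data = column 1, sample = column 2).
X = [data group];
```

Here the mean ranks of the two groups are equal, so the Kruskal-Wallis P-value is essentially 1 and the scale difference goes unnoticed. The matrix X has the layout expected by AnDarksamtest, so it can be passed to it for comparison.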

The Anderson-Darling k-sample procedure assumes that the i-th sample has a continuous distribution function, and we are interested in testing the null hypothesis that all sampled populations have the same distribution, without specifying the nature of that common distribution.

The observed k-sample Anderson-Darling statistic (ADK) is standardized using its exact sample mean and standard deviation to remove some of its dependence on the sample size. We note that other mathematical expressions can be found in the literature, for example in MIL-HDBK-17-1E (1997).
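A minimal sketch of the unstandardized statistic, assuming no tied observations and using the rank form given by Scholz and Stephens (1987): the sum over samples i and pooled order statistics j = 1..N-1 of (N*M_ij - j*n_i)^2 / (j*(N-j)), divided by n_i within each sample and by N overall, where M_ij counts the observations of sample i that do not exceed the j-th pooled value. This is not the code of the submitted file, and the data and variable names are made up. Under Ho the mean of the statistic is k-1; the standard deviation formula is longer and is not reproduced here.

```matlab
% Two small samples with no tied values (illustrative data only).
samples = {[1 2 3], [4 5 6]};

k = numel(samples);                 % number of samples
n = cellfun(@numel, samples);       % sample sizes n_i
N = sum(n);                         % pooled sample size
Z = sort([samples{:}]);             % pooled ordered sample Z_1 < ... < Z_N

A2 = 0;
for i = 1:k
    inner = 0;
    for j = 1:N-1
        Mij = sum(samples{i} <= Z(j));                  % sample-i values <= Z_j
        inner = inner + (N*Mij - j*n(i))^2 / (j*(N - j));
    end
    A2 = A2 + inner / n(i);
end
A2 = A2 / N;                        % unstandardized k-sample statistic

nullMean = k - 1;                   % mean of the statistic under Ho
% Standardizing would give (A2 - nullMean) / sigmaN, with sigmaN taken
% from Scholz and Stephens (1987); sigmaN is deliberately left out here.
```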

The approximate P-value of the observed ADK statistic can be calculated using a spline interpolation method. For interested users, the mathematical procedure for obtaining the ADK critical value is also included as a comment.

We give the Anderson-Darling k-sample procedure with and without adjustment for ties.

Finally, we compare the P-value with the desired significance level alpha to facilitate a decision about the null hypothesis Ho.
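One possible way to carry out the interpolation and the final comparison is sketched below. It is not the code of the submitted file. The quantile table is a rounded placeholder for illustration only (the real values should come from the published tables), Tobs is a hypothetical observed value, and interpolating on the logit scale is our assumption, chosen so that the result stays between 0 and 1.

```matlab
% Placeholder upper-tail quantiles of the standardized statistic and their
% tail probabilities. Illustrative values only, NOT the published table.
tQuant = [0.3 1.2 2.0 2.7 3.8];
pLevel = [0.25 0.10 0.05 0.025 0.01];

Tobs = 2.3;                          % hypothetical standardized ADK value

% Cubic-spline interpolation of logit(p) as a function of the quantile.
logitP   = log(pLevel ./ (1 - pLevel));
logitObs = interp1(tQuant, logitP, Tobs, 'spline');
pApprox  = 1 ./ (1 + exp(-logitObs));   % back-transform to a probability

% Decision about Ho at the chosen significance level.
alpha    = 0.05;
rejectHo = pApprox < alpha;
```

Values of Tobs outside the tabulated range would require extrapolation, which is much less reliable than interpolation.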

Syntax: function AnDarksamtest(X,alpha)
     
Inputs:
X - data matrix (Size of matrix must be n-by-2; data=column 1,
sample=column 2)
alpha - significance level (default = 0.05)

Output:
- Complete Anderson-Darling k-sample test
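A short usage sketch, assuming AnDarksamtest.m is on the MATLAB path. The measurements and the variable names a, b, c are made up for illustration; only the n-by-2 layout and the call AnDarksamtest(X,alpha) come from the syntax above.

```matlab
% Three hypothetical samples of unequal size.
a = [38.7 41.5 43.8 44.5 45.5 46.0];
b = [39.2 39.3 39.7 41.4 41.8];
c = [34.0 35.0 39.0 40.0 43.0 43.0 44.0];

% Stack the data in column 1 and the sample label in column 2.
data   = [a(:); b(:); c(:)];
sample = [ones(numel(a),1); 2*ones(numel(b),1); 3*ones(numel(c),1)];
X = [data sample];

% Run the test only if the function is available on the path.
if exist('AnDarksamtest', 'file') == 2
    AnDarksamtest(X, 0.05);
end
```

The repeated value 43.0 in the third sample is intentional: the procedure is given both with and without adjustment for ties.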

Required Products: Statistics Toolbox
MATLAB release: MATLAB 7 (R14)
Comments and Ratings (2)
09 Nov 2007 Michael Singer

This fills a major gap. Thanks!

12 Dec 2007 Tom Davidson

Excellent, thank you for posting this!

A couple requests: It would be great if the calculated values were returned in the output of the function (and if the text output could be suppressed).

Also, I find it convenient to pass in a cell array of samples, rather than the n x 2 array currently required. A quick patch to allow this is below:

Replace the lines:

  X1 = X(:,1); %data vector
  X2 = X(:,2); %grouping vector

With:

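if iscell(X)
  % Cell array input: one cell per sample
  X1 = [];
  X2 = [];
  for k = 1:numel(X)
    Xk = X{k};
    X1 = vertcat(X1, Xk(:));                   % append data as a column
    X2 = vertcat(X2, repmat(k, numel(Xk), 1)); % label every value with sample index k
  end
else
  % Original n-by-2 matrix input
  X1 = X(:,1); %data vector
  X2 = X(:,2); %grouping vector
end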

Updates
08 Nov 2007

An appropriate format for citing this file was added.

08 Nov 2007

Summary was improved.

26 Dec 2007

Text was improved following valuable suggestions from Fritz Scholz and Michael Stephens.

Tag Activity for this File
Tag | Applied By | Date/Time
statistics | Antonio Trujillo-Ortiz | 22 Oct 2008 09:34:14
probability | Antonio Trujillo-Ortiz | 22 Oct 2008 09:34:14
andersondarling | Antonio Trujillo-Ortiz | 22 Oct 2008 09:34:14
ksample test | Antonio Trujillo-Ortiz | 22 Oct 2008 09:34:14
rank test | Antonio Trujillo-Ortiz | 22 Oct 2008 09:34:14
nonparametric | Antonio Trujillo-Ortiz | 22 Oct 2008 09:34:14
