# kstest2

Two-sample Kolmogorov-Smirnov test

## Syntax

`h = kstest2(x1,x2)`
`h = kstest2(x1,x2,Name,Value)`
`[h,p] = kstest2(___)`
`[h,p,ks2stat] = kstest2(___)`

## Description

`h = kstest2(x1,x2)` returns a test decision for the null hypothesis that the data in vectors `x1` and `x2` are from the same continuous distribution, using the two-sample Kolmogorov-Smirnov test. The alternative hypothesis is that `x1` and `x2` are from different continuous distributions. The result `h` is `1` if the test rejects the null hypothesis at the 5% significance level, and `0` otherwise.

`h = kstest2(x1,x2,Name,Value)` returns a test decision for a two-sample Kolmogorov-Smirnov test with additional options specified by one or more name-value pair arguments. For example, you can change the significance level or conduct a one-sided test.

`[h,p] = kstest2(___)` also returns the asymptotic p-value `p`, using any of the input arguments from the previous syntaxes.

`[h,p,ks2stat] = kstest2(___)` also returns the test statistic `ks2stat`.

## Examples


Generate sample data from two different Weibull distributions.

```
rng(1);     % For reproducibility
x1 = wblrnd(1,1,1,50);
x2 = wblrnd(1.2,2,1,50);
```

Test the null hypothesis that data in vectors `x1` and `x2` comes from populations with the same distribution.

`h = kstest2(x1,x2)`
```
h = logical
   1
```

The returned value of `h = 1` indicates that `kstest2` rejects the null hypothesis at the default 5% significance level.

Generate sample data from two different Weibull distributions.

```
rng(1);     % For reproducibility
x1 = wblrnd(1,1,1,50);
x2 = wblrnd(1.2,2,1,50);
```

Test the null hypothesis that data vectors `x1` and `x2` are from populations with the same distribution at the 1% significance level.

`[h,p] = kstest2(x1,x2,'Alpha',0.01)`
```
h = logical
   0

p = 0.0317
```

The returned value of `h = 0` indicates that `kstest2` does not reject the null hypothesis at the 1% significance level.

Generate sample data from two different Weibull distributions.

```
rng(1);     % For reproducibility
x1 = wblrnd(1,1,1,50);
x2 = wblrnd(1.2,2,1,50);
```

Test the null hypothesis that data in vectors `x1` and `x2` comes from populations with the same distribution, against the alternative hypothesis that the cdf of the distribution of `x1` is larger than the cdf of the distribution of `x2`.

`[h,p,k] = kstest2(x1,x2,'Tail','larger')`
```
h = logical
   1

p = 0.0158

k = 0.2800
```

The returned value of `h = 1` indicates that `kstest2` rejects the null hypothesis, in favor of the alternative hypothesis that the cdf of the distribution of `x1` is larger than the cdf of the distribution of `x2`, at the default 5% significance level. The returned value of `k` is the test statistic for the two-sample Kolmogorov-Smirnov test.

## Input Arguments


`x1`: Sample data from the first sample, specified as a vector. Data vectors `x1` and `x2` do not need to be the same size.

Data Types: `single` | `double`

`x2`: Sample data from the second sample, specified as a vector. Data vectors `x1` and `x2` do not need to be the same size.

Data Types: `single` | `double`

### Name-Value Pair Arguments

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

Example: `'Tail','larger','Alpha',0.01` specifies a test using the alternative hypothesis that the empirical cdf of `x1` is larger than the empirical cdf of `x2`, conducted at the 1% significance level.

Significance level of the hypothesis test, specified as the comma-separated pair consisting of `'Alpha'` and a scalar value in the range (0,1).

Example: `'Alpha',0.01`

Data Types: `single` | `double`

Type of alternative hypothesis to evaluate, specified as the comma-separated pair consisting of `'Tail'` and one of the following.

| Value | Description |
| --- | --- |
| `'unequal'` | Test the alternative hypothesis that the empirical cdf of `x1` is unequal to the empirical cdf of `x2`. |
| `'larger'` | Test the alternative hypothesis that the empirical cdf of `x1` is larger than the empirical cdf of `x2`. |
| `'smaller'` | Test the alternative hypothesis that the empirical cdf of `x1` is smaller than the empirical cdf of `x2`. |

If the data values in `x1` tend to be larger than those in `x2`, the empirical distribution function of `x1` tends to be smaller than that of `x2`, and vice versa.

Example: `'Tail','larger'`
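The relationship between data values and empirical cdfs can be checked by hand. The following is a minimal sketch with made-up samples in which `x1` is shifted to the right of `x2`; the variable names are illustrative and the empirical cdfs are computed directly rather than through any toolbox function.

```matlab
% Sketch only: made-up samples in which x1 tends to be larger than x2.
x1 = [3 4 5 6];
x2 = [1 2 3 4];

% Evaluate both empirical cdfs at the pooled observations
pts = sort([x1(:); x2(:)]);

% Proportion of each sample less than or equal to every evaluation point
% (uses implicit expansion, available in R2016b and later)
F1 = sum(x1(:)' <= pts, 2)/numel(x1);   % empirical cdf of x1
F2 = sum(x2(:)' <= pts, 2)/numel(x2);   % empirical cdf of x2

% Larger values in x1 place its empirical cdf at or below that of x2
x1CdfNeverAbove = all(F1 <= F2);

disp([pts F1 F2]);   % columns: evaluation point, cdf of x1, cdf of x2
```

For samples like these, `'Tail','smaller'` is the one-sided alternative that matches the direction of the shift.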

## Output Arguments


`h`: Hypothesis test result, returned as a logical value.

• If `h = 1`, the test rejects the null hypothesis at the `Alpha` significance level.

• If `h = 0`, the test fails to reject the null hypothesis at the `Alpha` significance level.

`p`: Asymptotic p-value of the test, returned as a scalar value in the range (0,1). `p` is the probability of observing a test statistic as extreme as, or more extreme than, the observed value under the null hypothesis. The asymptotic p-value becomes very accurate for large sample sizes, and is believed to be reasonably accurate for sample sizes `n1` and `n2` such that `(n1*n2)/(n1 + n2) ≥ 4`.
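As a rough check on the numbers in the examples, the following sketch recomputes an asymptotic p-value from a test statistic and the two sample sizes. The small-sample correction applied to the statistic and the truncated series used for the two-sided case are a commonly used approximation and are assumptions here, not something this page states about `kstest2`; the variable names are illustrative. With a statistic of 0.28 and `n1 = n2 = 50`, the sketch gives values that agree, to the four decimals shown, with the one-sided `p = 0.0158` and the two-sided `p = 0.0317` reported in the examples.

```matlab
% Sketch only: the correction factor and series below are assumptions,
% not a statement of how kstest2 is implemented.
n1 = 50;  n2 = 50;          % sample sizes used in the examples
ks2stat = 0.28;             % test statistic reported in the examples

n = (n1*n2)/(n1 + n2);      % combined sample-size term (25 here)
accurateEnough = (n >= 4);  % rule of thumb for the asymptotic p-value

% Scaled statistic with a small-sample correction (assumed form)
lambda = max((sqrt(n) + 0.12 + 0.11/sqrt(n))*ks2stat, 0);

% One-sided asymptotic p-value
pOneSided = exp(-2*lambda^2);

% Two-sided asymptotic p-value: truncated alternating series, clipped to [0,1]
j = (1:101)';
pTwoSided = 2*sum((-1).^(j-1).*exp(-2*lambda^2*j.^2));
pTwoSided = min(max(pTwoSided, 0), 1);

fprintf('n = %g, one-sided p = %.4f, two-sided p = %.4f\n', ...
    n, pOneSided, pTwoSided);
```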

`ks2stat`: Test statistic, returned as a nonnegative scalar value.

## More About

### Two-Sample Kolmogorov-Smirnov Test

The two-sample Kolmogorov-Smirnov test is a nonparametric hypothesis test that evaluates the difference between the cdfs of the distributions of the two sample data vectors over the range of x in each data set.

The two-sided test uses the maximum absolute difference between the cdfs of the distributions of the two data vectors. The test statistic is

$$D^{*}=\max_{x}\left(\left|\hat{F}_{1}(x)-\hat{F}_{2}(x)\right|\right),$$

where $\hat{F}_{1}(x)$ is the proportion of `x1` values less than or equal to $x$ and $\hat{F}_{2}(x)$ is the proportion of `x2` values less than or equal to $x$.

The one-sided test uses the actual value of the difference between the cdfs of the distributions of the two data vectors rather than the absolute value. The test statistic is

$$D^{*}=\max_{x}\left(\hat{F}_{1}(x)-\hat{F}_{2}(x)\right).$$
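Both statistics can be computed by hand from the two empirical cdfs. The following is a minimal sketch with small made-up samples; the variable names are illustrative, and the code applies the two formulas directly rather than calling `kstest2`.

```matlab
% Sketch only: small made-up samples; variable names are illustrative.
x1 = [0.1 0.4 0.7 1.5];
x2 = [0.5 0.9 1.2 2.0 2.5];

% The empirical cdfs are step functions that change only at data points,
% so it is enough to evaluate them at the pooled observations.
pts = sort([x1(:); x2(:)]);

% Proportion of each sample less than or equal to every evaluation point
% (uses implicit expansion, available in R2016b and later)
F1 = sum(x1(:)' <= pts, 2)/numel(x1);
F2 = sum(x2(:)' <= pts, 2)/numel(x2);

dTwoSided = max(abs(F1 - F2));   % maximum absolute difference
dOneSided = max(F1 - F2);        % maximum signed difference, F1 minus F2

fprintf('two-sided D* = %.2f, one-sided D* = %.2f\n', dTwoSided, dOneSided);
```

For these samples both statistics equal 0.55, attained at x = 0.7, because the empirical cdf of `x1` is never below that of `x2`. Taking the signed difference in the other direction, `max(F2 - F1)`, gives 0.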

## Algorithms

In `kstest2`, the decision to reject the null hypothesis is based on comparing the p-value `p` with the significance level `Alpha`, rather than on comparing the test statistic `ks2stat` with a critical value.
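The decision rule can be illustrated with the two-sided p-value reported in the examples, `p = 0.0317`. This is a minimal sketch: the variable names are illustrative, and treating `p` exactly equal to `Alpha` as a rejection is an assumption.

```matlab
% Sketch only: p is the two-sided p-value reported in the examples;
% counting p == Alpha as a rejection is an assumption.
p = 0.0317;

hAt5Percent = (p <= 0.05);   % true: reject at the default 5% level
hAt1Percent = (p <= 0.01);   % false: do not reject at the 1% level
```

The same p-value therefore leads to `h = 1` at the default significance level and `h = 0` when `'Alpha'` is set to `0.01`, matching the first two examples.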
