Note: Use only in the MuPAD Notebook Interface. This functionality does not run in MATLAB.
For the classical chi-square goodness-of-fit test, MuPAD® provides the stats::csGOFT function. This function enables you to test the data against an arbitrary function f. For example, you can define f by using any of the cumulative distribution functions, probability density functions, and discrete probability functions available in the MuPAD Statistics library. You can also define f by using your own distribution function.
For example, create the data sequence x that contains a thousand random entries:
reset()
f := stats::normalRandom(0, 1/2): // random generator: mean 0, variance 1/2
x := f() $ k = 1..1000: // sequence of 1000 random values
Suppose you want to test whether the entries of that sequence are normally distributed with the mean equal to 0 and the variance equal to 1/2. The classical chi-square test uses the following three-step approach:
1. Divide the line of real values into several intervals (also called bins or cells).
2. Compute the number of data elements in each interval.
3. Compare those numbers with the numbers expected for the specified distribution.
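As an illustration outside MuPAD, the three steps can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the stats::csGOFT implementation: the helper name chi_square_statistic, the four hand-picked cell boundaries, and the random seed are all hypothetical choices made for the example. Python's NormalDist takes the standard deviation, so the variance 1/2 is passed as sqrt(1/2).

```python
import math
import random
from statistics import NormalDist


def chi_square_statistic(data, boundaries, cdf):
    """Return (statistic, smallest expected count) for cells (b[i], b[i+1]]."""
    n = len(data)
    k = len(boundaries) - 1
    # Step 2: count the data elements that fall into each cell.
    observed = [0] * k
    for value in data:
        for i in range(k):
            if boundaries[i] < value <= boundaries[i + 1]:
                observed[i] += 1
                break
    # Step 3: the expected count of a cell is n times its probability
    # under the hypothesized distribution.
    expected = [n * (cdf(boundaries[i + 1]) - cdf(boundaries[i])) for i in range(k)]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, min(expected)


# Hypothesized distribution: mean 0, variance 1/2 (assumption: seed 0 for repeatability).
dist = NormalDist(0, math.sqrt(0.5))
rng = random.Random(0)
data = [rng.gauss(0, math.sqrt(0.5)) for _ in range(1000)]

# Step 1: divide the real line into cells (here, four hand-picked cells).
boundaries = [-math.inf, -0.5, 0.0, 0.5, math.inf]

statistic, min_expected = chi_square_statistic(data, boundaries, dist.cdf)
print(statistic, min_expected)
```

A small statistic means that the observed counts are close to the expected ones. Turning the statistic into a p-value additionally requires the chi-square distribution, which this sketch leaves out.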
When you use the stats::csGOFT function, specify the cell boundaries as an argument. You must specify at least
three cells. The recommended minimum number of cells for a sample
of n
data elements is
.
The recommended method for defining the cells is to use the stats::equiprobableCells function. This function creates equiprobable cells when the underlying distribution is continuous:
q := stats::normalQuantile(0, 1/2): // quantile function: mean 0, variance 1/2
cells := stats::equiprobableCells(40, q): // 40 cells of equal probability
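The idea behind equiprobable cells can be sketched in Python as follows. This is an illustration only, not the stats::equiprobableCells implementation; the helper name equiprobable_boundaries is hypothetical. It assumes that the cell boundaries are the quantiles at 1/k, 2/k, ..., (k-1)/k of the continuous distribution, so that every cell has probability 1/k.

```python
import math
from statistics import NormalDist


def equiprobable_boundaries(k, quantile):
    """Return k + 1 boundaries that split the real line into k cells of probability 1/k."""
    inner = [quantile(i / k) for i in range(1, k)]
    return [-math.inf] + inner + [math.inf]


# Mean 0, variance 1/2 (NormalDist expects the standard deviation).
dist = NormalDist(0, math.sqrt(0.5))
boundaries = equiprobable_boundaries(40, dist.inv_cdf)

# Check: the probability of each cell under the same distribution.
probabilities = [dist.cdf(b) - dist.cdf(a) for a, b in zip(boundaries, boundaries[1:])]
print(min(probabilities), max(probabilities))
```

With equal cell probabilities, every cell has the same expected count, which keeps any single cell from having a very small expected frequency.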
Now, call the stats::csGOFT function to test the data sequence x. For example, compare x with the cumulative normal distribution function with the same mean and variance. The stats::csGOFT function returns a large p-value for this test. Therefore, the null hypothesis (x is normally distributed with the mean equal to 0 and the variance equal to 1/2) passes this test, that is, the test does not reject it. Besides the p-value, stats::csGOFT returns the observed value of the chi-square statistic and the minimum of the expected cell frequencies:
stats::csGOFT(x, cells, CDF = stats::normalCDF(0, 1/2))
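For orientation, the returned quantities can be related by the textbook form of the test. The notation is an assumption introduced here and is not taken from the MuPAD documentation: k is the number of cells, n the sample size, y_i the observed count in cell i, and p_i the probability of cell i under the hypothesized distribution. The k - 1 degrees of freedom assume a fully specified distribution with no estimated parameters.

```latex
\[
\chi^2_{\text{obs}} = \sum_{i=1}^{k} \frac{(y_i - n p_i)^2}{n p_i},
\qquad
p\text{-value} = \Pr\left( \chi^2_{k-1} > \chi^2_{\text{obs}} \right)
\]
```

In this notation, the minimum of the expected cell frequencies is the smallest n p_i. For 40 equiprobable cells and n = 1000, every expected frequency equals 1000/40 = 25.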
The stats::csGOFT function enables you to test the data against any distribution function. For example, testing the sequence x against the corresponding probability density function gives the same result:
stats::csGOFT(x, cells, PDF = stats::normalPDF(0, 1/2))
If you test the same data sequence x against the normal distribution function with a different mean or variance (here, the variance is 1 instead of 1/2), stats::csGOFT returns a p-value that is below the typical significance level of 0.05. The null hypothesis does not pass the test and is rejected:
stats::csGOFT(x, cells, CDF = stats::normalCDF(0, 1))
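The effect of a mismatched hypothesis can be reproduced outside MuPAD with a short Python sketch. It is an illustration under assumptions, not the stats::csGOFT algorithm: the helper name, the seed, and the use of bisect for counting are hypothetical choices, and the sketch compares only the chi-square statistics, not the p-values.

```python
import math
import random
from bisect import bisect_left
from statistics import NormalDist


def chi_square_statistic(data, inner, cdf):
    """Chi-square statistic for cells (-inf, c1], (c1, c2], ..., (c_last, inf)."""
    counts = [0] * (len(inner) + 1)
    for value in data:
        # bisect_left returns the number of inner boundaries below value,
        # which is the index of the cell that contains it.
        counts[bisect_left(inner, value)] += 1
    # CDF values at all cell edges; the outermost edges contribute 0 and 1.
    edges = [0.0] + [cdf(c) for c in inner] + [1.0]
    n = len(data)
    expected = [n * (hi - lo) for lo, hi in zip(edges, edges[1:])]
    return sum((o - e) ** 2 / e for o, e in zip(counts, expected))


matching = NormalDist(0, math.sqrt(0.5))  # mean 0, variance 1/2
mismatched = NormalDist(0, 1.0)           # mean 0, variance 1

rng = random.Random(1)  # assumption: fixed seed for repeatability
data = [rng.gauss(0, math.sqrt(0.5)) for _ in range(1000)]

# 40 cells that are equiprobable under the variance-1/2 hypothesis.
inner = [matching.inv_cdf(i / 40) for i in range(1, 40)]

stat_matching = chi_square_statistic(data, inner, matching.cdf)
stat_mismatched = chi_square_statistic(data, inner, mismatched.cdf)
print(stat_matching, stat_mismatched)
```

The statistic computed against the mismatched distribution is the larger of the two. A larger statistic corresponds to a smaller p-value, which is why the test rejects the second hypothesis.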