There are many tests for assessing multivariate normality in the statistical literature (Mecklin and Mundfrom, 2003). Unfortunately, there is no known uniformly most powerful test, and it is recommended to perform several tests when assessing it. The Henze and Zirkler (1990) test has been found to have good overall power against alternatives to normality.
The Henze-Zirkler test is based on a nonnegative functional distance that measures the distance between two functions: the characteristic function of the multivariate normal distribution and the empirical characteristic function.
The Henze-Zirkler statistic is approximately lognormally distributed. The lognormal distribution is used to compute the probability under the null hypothesis (the P-value).
According to Henze and Wagner (1997), this test has the desirable properties of:
--consistency against each fixed nonnormal alternative distribution
--asymptotic power against contiguous alternatives of order n^(-1/2)
--feasibility for any dimension and any sample size
If the data are multivariate normal, the test statistic HZ is approximately lognormally distributed. The function calculates the mean, variance and smoothness parameter. Then, the mean and variance are lognormalized and the P-value is estimated.
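The steps just described can be sketched as follows. This is a minimal illustration written from the formulas as they are usually stated for the Henze-Zirkler test, not the code of the m-file itself. The data X are random placeholders, and the names b, a, wb, pmu, psi and crit are chosen here for illustration (mu and si2 follow the naming used on this page). The lognormal tail probability is written with erfc so that no toolbox function is needed.

```matlab
% Minimal sketch of the Henze-Zirkler computation (illustrative only).
X = randn(50, 4);                 % placeholder data: n = 50 rows, p = 4 columns
alpha = 0.05;                     % significance level
[n, p] = size(X);

S  = cov(X, 1);                   % covariance normalized by n
Xc = X - repmat(mean(X, 1), n, 1);    % centered data
G  = (Xc / S) * Xc';              % G(j,k) = (x_j - xbar)*inv(S)*(x_k - xbar)'
g  = diag(G);
Dj  = g;                          % squared Mahalanobis distance of row j to the mean
Djk = repmat(g, 1, n) + repmat(g', n, 1) - 2*G;   % squared distance between rows j and k

% Smoothness parameter
b = (1/sqrt(2)) * ((2*p + 1)/4)^(1/(p + 4)) * n^(1/(p + 4));

% Test statistic
HZ = n * ( sum(exp(-(b^2/2) * Djk(:))) / n^2 ...
         - 2 * (1 + b^2)^(-p/2) * sum(exp(-(b^2/(2*(1 + b^2))) * Dj)) / n ...
         + (1 + 2*b^2)^(-p/2) );

% Mean and variance of HZ under multivariate normality
a  = 1 + 2*b^2;
wb = (1 + b^2) * (1 + 3*b^2);
mu  = 1 - a^(-p/2) * (1 + p*b^2/a + p*(p + 2)*b^4/(2*a^2));
si2 = 2*(1 + 4*b^2)^(-p/2) ...
    + 2*a^(-p) * (1 + 2*p*b^4/a^2 + 3*p*(p + 2)*b^8/(4*a^4)) ...
    - 4*wb^(-p/2) * (1 + 3*p*b^4/(2*wb) + p*(p + 2)*b^8/(2*wb^2));

% Lognormalized parameters (log-scale location and scale)
pmu = log(sqrt(mu^4 / (si2 + mu^2)));
psi = sqrt(log((si2 + mu^2) / mu^2));

% P-value: upper tail of the lognormal distribution at HZ
P = 0.5 * erfc((log(HZ) - pmu) / (psi*sqrt(2)));

% Lognormal critical value at level alpha (upper alpha quantile)
crit = exp(pmu + psi*sqrt(2)*erfcinv(2*alpha));

fprintf('HZ = %.4f, P = %.4f, critical value = %.4f\n', HZ, P, crit);
```

In this sketch mu and si2 depend only on n and p, so they can be checked without any data; normality would be rejected at level alpha when HZ exceeds crit (equivalently, when P < alpha).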
For interested readers, the lognormal critical value is also provided.
X - data matrix (size of matrix must be n-by-p; data=rows,
c - covariance normalized by n (c = 1, default) or by n-1 (c ~= 1)
alpha - significance level (default = 0.05)
- Henze-Zirkler's Multivariate Normality Test
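As a small illustration of the two normalizations selected by c, the sketch below compares them using MATLAB's built-in cov. The data are made up, and the mapping of c = 1 to cov(X,1) is an assumption about how the option is applied, not taken from the m-file.

```matlab
% Covariance normalized by n versus n-1 (hypothetical 4-by-2 data).
X = [2 1; 4 3; 6 2; 8 6];
n = size(X, 1);

S_n   = cov(X, 1);     % normalized by n   (assumed to correspond to c = 1)
S_nm1 = cov(X);        % normalized by n-1 (assumed to correspond to c ~= 1)

% The two estimates differ only by the factor (n-1)/n.
maxdiff = max(max(abs(S_n - S_nm1*(n - 1)/n)));
disp(maxdiff);
```

For large n the choice makes little practical difference; for small samples it rescales every Mahalanobis distance used by the test.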
Dr. Henze's current email address is Henze@kit.edu
Prof. Antonio Trujillo-Ortiz
Dear Prof. Antonio Trujillo-Ortiz,
Thanks for your quick reply. It was helpful. Now I understand. We need to get the LN-mean and LN-standard-deviation in order to get the mean ('mu') and standard deviation ('sqrt(si2)' ) from LNRVs.
Thank you very much/Muchas gracias,
Thanks for your interest in this m-file. As you know, we only generated the Matlab computational algorithm from the original published paper. If you have any inquiry about the mathematical or statistical fundamentals, you must refer to the author(s). I give you Dr. Norbert Henze's email address
However, we need neither the lognormal cdf (logncdf) function nor the function for the mean and variance of the lognormal distribution (lognstat). The mean (mu) and variance (si2) used here, which are the Henze-Zirkler mean and variance, must, as the authors establish, be converted to the lognormal Henze-Zirkler mean and variance. As you can see using the provided Iris data example, the mean (mu) = 0.7635 and the variance (si2) = 0.0112, with a Henze-Zirkler lognormal mean of -0.279408 and a Henze-Zirkler lognormal variance of 0.1379069. But if you pass the mu and si2 values to the lognstat function, you get the incorrect lognormal values of 2.1459 and 5.7768e-004, respectively.
Prof. Antonio Trujillo-Ortiz
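The numbers quoted in this reply can be reproduced approximately with a few lines. This is a sketch under assumptions: the conversion formulas are the standard moment-matching ones for a lognormal distribution, the names pmu, psi, m_direct and v_direct are illustrative, and the lognstat result is recomputed from its defining formulas rather than by calling the toolbox function. Because mu and si2 are entered with four decimals, the results agree with the quoted values only to about four decimals. Note that the second quoted value (0.1379) is reproduced here as a square root, that is, as the log-scale standard deviation in the usual parameterization.

```matlab
% Values quoted in the reply above (rounded to four decimals).
mu  = 0.7635;     % Henze-Zirkler mean
si2 = 0.0112;     % Henze-Zirkler variance

% Moment matching: find the log-scale parameters of a lognormal
% distribution whose mean is mu and whose variance is si2.
pmu = log(mu^2 / sqrt(si2 + mu^2));    % about -0.2794
psi = sqrt(log(1 + si2/mu^2));         % about  0.1379

% Treating mu and si2 directly as log-scale parameters instead
% (what passing them to lognstat amounts to) gives unrelated numbers.
m_direct = exp(mu + si2^2/2);                       % about 2.1459
v_direct = exp(2*mu + si2^2) * (exp(si2^2) - 1);    % about 5.7768e-004

fprintf('%.4f %.4f %.4f %.4e\n', pmu, psi, m_direct, v_direct);
```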
Dear Sir Antonio,
Could you explain/describe the mean and standard deviation arguments to the lognormal cdf? Why not use 'mu' and 'sqrt(si2)' directly?
Your valuable comment (15-12-08) on making this m-file run much faster was taken into account. We thank you.
Nice implementation and sample data to test the file on. However, it runs a bit slow. I recomputed the variable Djk by avoiding loops and it runs much faster:
Djk = - 2*Y' + diag(Y')*ones(1,n) + ones(n,1)*diag(Y')';
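A quick way to check the loop-free expression is to compare it with an explicit double loop on a small data set. The sketch below assumes that Y is the matrix X*inv(S)*X' and that Djk(j,k) is meant to be the squared Mahalanobis distance between rows j and k; both are assumptions about the surrounding m-file, and the data are made up.

```matlab
% Hypothetical 5-by-2 data set.
X = [5.1 3.5; 4.9 3.0; 4.7 3.2; 4.6 3.1; 5.0 3.6];
n = size(X, 1);
S = cov(X, 1);

Y = (X / S) * X';                 % assumed definition: X*inv(S)*X'

% Loop-free version, as given in the comment above
Djk = - 2*Y' + diag(Y')*ones(1,n) + ones(n,1)*diag(Y')';

% Reference version with explicit loops
Dloop = zeros(n);
for j = 1:n
    for k = 1:n
        d = X(j,:) - X(k,:);          % difference between rows j and k
        Dloop(j,k) = (d / S) * d';    % squared Mahalanobis distance
    end
end

maxdiff = max(abs(Djk(:) - Dloop(:)));
disp(maxdiff);                    % should be at rounding-error level
```

The loop-free form works because (x_j - x_k)*inv(S)*(x_j - x_k)' expands to Y(j,j) + Y(k,k) - 2*Y(j,k).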
To improve this m-file, the valuable comment of Johan (J.D.) '...avoiding loops and it (the file) runs much faster.' (15/12/08) was taken into account. We thank Johan.
Data set file was reentered.
Text was improved.
An appropriate format to cite this file was added.