Differential (continuous) entropy estimator
Version 1.0.2 (7.47 KB) by Gil Ariel
The function estimates the entropy of continuous random variables, both univariate (1D) and multivariate (D>1), from independent samples.
If you use this function, please be courteous and cite
G. Ariel and Y. Louzoun, Estimating differential entropy using recursive copula splitting, Entropy 22(2), 236 (2020) [open access link].
If you would like to report a bug or suggest additions to the code, please email arielg@math.biu.ac.il.
In particular, if you have suggestions for additional estimation methods (which you have implemented in Matlab), I'll be happy to add them to the code and include proper references.
Simple usage:
See examples.m for examples of simple usage.
If your data consists of N independent samples of vectors with dimension D, the input is an NxD matrix in which each row is one sample.
Calculate the entropy H using
H=differential_entropy(x);
The most important option is the support of the distribution. If you know it a priori, specifying it can improve the estimate significantly (see the examples in the documentation file).
The support of the distribution is specified by a 2xD matrix: the first row contains the lower limits and the second row the upper limits.
H=differential_entropy(x,support);
Notes:
- For univariate distributions, both the data and the support may be written as row vectors instead of column vectors.
- If you do not know the support a priori, it is usually better not to specify it than to use min(x) and max(x).
- You may specify infinite support using -Inf or Inf. The function will disregard the support and treat it as unspecified.
Choosing a different estimation method:
If the support is not known or is infinite:
H=differential_entropy(x,method);
If the support is finite and known:
H=differential_entropy(x,support,method);
Implemented 1D estimators:
'spacings': Default without support. Uses sample spacings, see [1] for details. The default spacing is k=0.2N^0.3.
'bins': Partitions the data into fixed-length bins and estimates the discrete entropy (with the proper constant for the spatial scale). The default number of bins is N^0.4.
'plugin': Uses half of the data (randomly chosen) to create a histogram with fixed-length bins and plugs in the other half to estimate the entropy. See [1] for details. The default number of bins is N^0.4.
'leave1out': For each sample, uses the rest of the data (N-1 samples) to create a histogram with fixed-length bins and plugs in the sample that was "left out" to estimate the entropy. The result is averaged over all samples. See [1] for details. The default number of bins is N^0.4.
'average': Default with support. The average of the 'bins' and 'leave1out' methods.
'vanEs': Applies the method suggested in [2]. A different way to apply sample spacings. The default spacing is k=0.2N^0.3.
'Correa': Applies the method suggested in [3]. A different way to apply sample spacings. The default spacing is k=0.2N^0.3.
'default'|'auto': Without finite support, use 'spacings'. With finite support, use 'average'.
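For intuition, the two default 1D strategies can be sketched as follows. This is an illustrative Python sketch, not the package's MATLAB code; the names spacing_entropy and bin_entropy are invented here, and the real implementation may include bias corrections this sketch omits.

```python
import numpy as np

def spacing_entropy(x, m=None):
    """m-spacing ('spacings') estimator, following the idea in [1].
    Assumes continuous data (no ties, so all spacings are positive)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    if m is None:
        m = max(1, round(0.2 * n ** 0.3))  # default spacing from the text
    i = np.arange(n)
    # m-spacings, clamped at the edges of the sorted sample
    gaps = x[np.minimum(i + m, n - 1)] - x[np.maximum(i - m, 0)]
    return np.mean(np.log(n / (2 * m) * gaps))

def bin_entropy(x, b=None):
    """Fixed-width histogram ('bins') estimator:
    discrete entropy of the bin counts plus log(bin width)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if b is None:
        b = max(1, round(n ** 0.4))  # default number of bins from the text
    counts, edges = np.histogram(x, bins=b)
    p = counts[counts > 0] / n
    return -np.sum(p * np.log(p)) + np.log(edges[1] - edges[0])
```

For large samples from U(0,1) both sketches return values near the true entropy 0; the uncorrected spacing estimator carries a small bias for fixed m.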
Implemented multivariate (D>1) estimators:
'KL': The original Kozachenko-Leonenko estimator using the single nearest neighbor [4].
'kNN': The Kozachenko-Leonenko estimator using k nearest neighbors [4]. With k=1, this is the same as 'KL'. The default number of neighbors is k=N^0.3. To find nearest neighbors, the function applies Matlab's built-in function knnsearch with default settings.
'copula': Default. Uses recursive copula splitting. See [5] for details.
'default'|'auto': Use 'copula' (whether the support is finite, infinite, or unknown).
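The Kozachenko-Leonenko idea behind 'KL' and 'kNN' can be sketched in Python as well (SciPy's KD-tree stands in for Matlab's knnsearch; the name knn_entropy is invented here):

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln

def knn_entropy(x, k=1):
    """Kozachenko-Leonenko kNN estimator (sketch):
    H ~ psi(N) - psi(k) + log(V_d) + (d/N) * sum_i log(r_i),
    where r_i is the distance from sample i to its k-th nearest
    neighbour and V_d is the volume of the d-dimensional unit ball.
    Assumes continuous data (no duplicate points, so r_i > 0)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    n, d = x.shape
    # Query k+1 neighbours: the nearest "neighbour" of each point is itself.
    r = cKDTree(x).query(x, k=k + 1)[0][:, k]
    log_vd = (d / 2) * np.log(np.pi) - gammaln(1 + d / 2)
    return digamma(n) - digamma(k) + log_vd + d * np.mean(np.log(r))
```

With k=1 this corresponds to 'KL'; a larger k, as in 'kNN', reduces variance at the cost of some bias.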
Additional parameters:
Setting the spacings/number of nearest neighbors when applicable (methods 'spacings','vanEs','Correa','kNN'):
H=differential_entropy(x,method,k);
Or
H=differential_entropy(x,[],method,k);
Setting the number of bins when applicable (methods 'bins','plugin', 'leave1out','average'):
H=differential_entropy(x,support,method,k);
References:
[1] J. Beirlant, E.J. Dudewicz, L. Györfi and E.C. van der Meulen, Nonparametric entropy estimation: An overview. Int. J. Math. Stat. Sci. 6, 17 (1997).
[2] B. van Es, Estimating functionals related to a density by a class of statistics based on spacings. Scandinavian Journal of Statistics 19, 61 (1992).
[3] J.C. Correa, A new estimator of entropy. Communications in Statistics - Theory and Methods 24, 2439 (1995).
[4] L. Kozachenko and N.N. Leonenko, Sample estimate of the entropy of a random vector. Probl. Peredachi Informatsii 23, 9–16 (1987).
[5] G. Ariel and Y. Louzoun, Estimating differential entropy using recursive copula splitting, Entropy 22, 236 (2020).
MATLAB Release Compatibility
Created with R2021b. Compatible with any release.
Platform Compatibility
Windows, macOS, Linux
