Description |
% Reliable and extremely fast kernel density estimator for one-dimensional data;
% Gaussian kernel is assumed and the bandwidth is chosen automatically;
% Unlike many other implementations, this one is immune to problems
% caused by multimodal densities with widely separated modes (see example). The
% estimation does not deteriorate for multimodal densities, because we never assume
% a parametric model for the data.
% INPUTS:
% data - a vector of data from which the density estimate is constructed;
% n - the number of mesh points used in the uniform discretization of the
% interval [MIN, MAX]; n has to be a power of two; if n is not a power of two, then
% n is rounded up to the next power of two, i.e., n is set to n=2^ceil(log2(n));
% the default value of n is n=2^12;
% MIN, MAX - define the interval [MIN,MAX] on which the density estimate is constructed;
% the default values of MIN and MAX are:
% MIN=min(data)-Range/10 and MAX=max(data)+Range/10, where Range=max(data)-min(data);
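% A minimal sketch of the defaults described above (illustrative only; it
% restates the formulas in this help text and is not copied from the function body):
% n=2^ceil(log2(n)); % round n up to a power of two, e.g., n=1000 gives n=1024
% Range=max(data)-min(data); % spread of the data
% MIN=min(data)-Range/10; MAX=max(data)+Range/10; % pad the interval by 10% on each side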
% OUTPUTS:
% bandwidth - the optimal bandwidth (Gaussian kernel assumed);
% density - column vector of length 'n' with the values of the density
% estimate at the grid points;
% xmesh - the grid over which the density estimate is computed;
% cdf - column vector of length 'n' with the values of the cdf;
% If no output is requested, then the code automatically plots a graph of
% the density estimate.
% Reference:
% Kernel density estimation via diffusion
% Z. I. Botev, J. F. Grotowski, and D. P. Kroese (2010)
% Annals of Statistics, Volume 38, Number 5, pages 2916-2957.
%
% Example:
% data=[randn(100,1);randn(100,1)*2+35 ;randn(100,1)+55];
% kde(data,2^14,min(data)-5,max(data)+5);
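% A further sketch, assuming the outputs are returned in the order
% bandwidth, density, xmesh, cdf (an assumption based on the OUTPUTS list):
% [bandwidth,density,xmesh,cdf]=kde(data,2^14,min(data)-5,max(data)+5);
% plot(xmesh,density) % plot the density estimate over the returned grid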
% Notes: If you know of more reliable and accurate one-dimensional kernel density
% estimation software, please email me at botev@maths.uq.edu.au |