adaptive kernel density estimation in one-dimension

version (3.72 KB) by Zdravko Botev
fast and reliable adaptive kernel density estimator


Updated 21 Jul 2016

Fast adaptive kernel density estimation in one dimension, in a single m-file.
Provides an optimal accuracy/speed trade-off. To increase speed when dealing with "big data",
simply reduce the "gam" parameter. Typically "gam=n^(1/3)", where "n" is the length of the data.

% [pdf,grid]=akde1d(X,grid,gam)
X - data as an 'n' by '1' vector;
grid - (optional) mesh over which the density is to be computed;
       default mesh uses 2^12 points over the range of the data;
gam - (optional) cost/accuracy trade-off parameter, where gam<n;
      default value is gam=ceil(n^(1/3))+20; larger values
      give better accuracy but reduce speed;
      to speed up the code, use a smaller "gam";

pdf - the values of the estimated density at 'grid'
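
The following is a minimal sketch of a call that supplies both optional arguments. It assumes akde1d.m is on the MATLAB path and that the mesh may be passed as a column vector; the 10% padding of the mesh and the names "xgrid", "dens" and "pad" are illustrative choices, not part of the routine.

```matlab
% Sketch: call akde1d with an explicit mesh and an explicit gam.
rng(0);                                   % reproducible sample
X = exp(randn(10^3,1));                   % n-by-1 log-normal data
n = numel(X);

% Explicit mesh: 2^12 points, padded 10% beyond the data range (assumed choice).
pad   = 0.1*(max(X) - min(X));
xgrid = linspace(min(X) - pad, max(X) + pad, 2^12)';   % column vector (assumption)

% A smaller gam trades accuracy for speed; it must satisfy gam < n.
gam = ceil(n^(1/3));                      % the "typical" value n^(1/3), rounded up

% Run the estimator only if akde1d.m is actually available.
if exist('akde1d','file') == 2
    dens = akde1d(X, xgrid, gam);
    plot(xgrid, dens)
end
```

Here "gam" is 10, compared with the default of 30 for n = 1000, so this call should run faster than the default at some cost in accuracy.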

data=[exp(randn(10^3,1))]; % log-normal sample
[pdf,grid]=akde1d(data); plot(grid,pdf)
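
Because the sample above is standard log-normal, the estimate can be compared with the known density. The following is a sketch, assuming akde1d.m is on the MATLAB path; the names "truepdf", "dens" and "xgrid" are illustrative.

```matlab
% Sketch: overlay the akde1d estimate on the true log-normal density.
rng(1);                                   % reproducible sample
data = exp(randn(10^3,1));                % standard log-normal sample

% True standard log-normal density, valid for x > 0.
truepdf = @(x) exp(-0.5*log(x).^2) ./ (x*sqrt(2*pi));

% Run the estimator only if akde1d.m is actually available.
if exist('akde1d','file') == 2
    [dens, xgrid] = akde1d(data);
    pos = xgrid > 0;                      % true density is defined only for x > 0
    plot(xgrid, dens, 'b', xgrid(pos), truepdf(xgrid(pos)), 'r--')
    legend('akde1d estimate', 'true density')
end
```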

Note: If you need a very fast estimator, use my "kde.m" function.
This routine is more adaptive, at the expense of speed. Use "gam" to control the speed/accuracy trade-off.

Kernel density estimation via diffusion
Z. I. Botev, J. F. Grotowski, and D. P. Kroese (2010)
Annals of Statistics, Volume 38, Number 5, pages 2916-2957.

Excellent practical results. I'm curious: what is the form of the diffusion equation implemented in the code?

Mladen Dalto

It seems line 42 has an error in the default "gam":
gam > n when n < 23,
so in that case mu=X(perm(1:gam),:) cannot be indexed.
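
A possible workaround for small samples, without editing the m-file, is to pass "gam" explicitly and cap it below n. This is only a sketch: it assumes akde1d.m is on the MATLAB path and that the mesh may be passed as a column vector, and the cap at n-1 is an illustrative choice rather than the author's fix.

```matlab
% Workaround sketch for small samples: cap gam so that gam < n.
rng(2);                                   % reproducible sample
n = 15;                                   % example sample size below 23
X = exp(randn(n,1));

% Default value from the help text, capped at n-1 (assumed choice).
gam = min(ceil(n^(1/3)) + 20, n - 1);

% gam is the third argument, so a mesh has to be supplied as well.
xgrid = linspace(min(X), max(X), 2^12)';  % column vector (assumption)

% Run the estimator only if akde1d.m is actually available.
if exist('akde1d','file') == 2
    dens = akde1d(X, xgrid, gam);
end
```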

MATLAB Release Compatibility
Created with R2016a
Compatible with any release
Platform Compatibility
Windows macOS Linux

