Fast adaptive kernel density estimation in high dimensions in one m-file.
Provides optimal accuracy/speed trade-off, controlled via a parameter "gam".
To increase speed for "big data" applications, use a small "gam".
Typically gam=n^(1/2), where "n" is the number of points.
X - data as an 'n' by 'd' matrix ('n' points, each of dimension 'd');
grid - 'm' points of dimension 'd' over which pdf is computed;
default provided only for 2-dimensional data;
see example below on how to construct it in higher dimensions;
gam - (optional) cost/accuracy trade-off parameter, where gam<n;
default value is gam=ceil(n^(1/2)); larger values
may result in better accuracy, but reduce speed;
to speed up the code, use smaller "gam";
pdf - the value of the estimated density at 'grid'
X1,X2 - default grid (used only for 2 dimensional data)
see example on how to construct the grid in higher dimensions
EXAMPLE IN 2 DIMENSIONS:
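A minimal 2-dimensional sketch, assuming akde.m is on the MATLAB path and is called as [pdf,X1,X2]=akde(X) as the input/output list above suggests. The test data, the contour plot, and the number of contour levels are illustrative choices. The sketch also assumes that the default X1 and X2 are meshgrid-shaped arrays, mirroring the reshape used in the 3-dimensional example.

```matlab
% Illustrative 2-D data: two Gaussian clusters (made up for this sketch).
rng(0);                                   % fix the seed so the run is repeatable
data = [randn(500,2); randn(500,2)/2 + 2];
[n,d] = size(data);
gam = ceil(sqrt(n));                      % documented default for "gam", shown for reference

% akde.m is the File Exchange function; skip the call if it is not installed.
if exist('akde','file') == 2
    [pdf,X1,X2] = akde(data);             % default grid is provided only when d == 2
    pdf = reshape(pdf,size(X1));          % assumed: X1 has meshgrid layout
    contour(X1,X2,pdf,20)                 % 20 contour levels
end
```

With n=1000 points the documented default is gam=32; if the three-argument call akde(data,grid,gam) is used, a smaller value trades accuracy for speed.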
EXAMPLE IN 3 DIMENSIONS:
data=[randn(10^3,3);randn(10^3,3)/2+2]; % three dimensional data
[n,d]=size(data); ng=100; % total grid points = ng^d
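MAX=max(data,[],1); MIN=min(data,[],1); scaling=MAX-MIN;
% create meshgrid in 3-dimensions (ng points per axis)
[X1,X2,X3]=meshgrid(MIN(1)+scaling(1)*(0:ng-1)/(ng-1),...
    MIN(2)+scaling(2)*(0:ng-1)/(ng-1),MIN(3)+scaling(3)*(0:ng-1)/(ng-1));
grid=reshape([X1(:),X2(:),X3(:)],ng^d,d); % create points for plotting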
pdf=akde(data,grid); % run adaptive kde
pdf=reshape(pdf,size(X1)); % reshape pdf for use with meshgrid
for iso=[0.005:0.005:0.015] % isosurfaces with pdf = 0.005,0.01,0.015
    isosurface(X1,X2,X3,pdf,iso),view(3),alpha(.3),box on,hold on,colormap cool
end
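The same grid construction can be written for any dimension 'd'. The following is a sketch under stated assumptions, not part of the original m-file: it uses ndgrid instead of meshgrid, and the names `ticks`, `G`, and `gridpts` are introduced here for illustration. The call to akde is guarded because it assumes akde.m is on the path.

```matlab
% Build an ng^d-by-d list of grid points spanning the range of the data.
rng(0);                                       % repeatable illustrative data
data = [randn(200,4); randn(200,4)/2 + 2];    % made-up 4-dimensional data
[n,d] = size(data);
ng = 10;                                      % points per axis; total is ng^d
MIN = min(data,[],1); MAX = max(data,[],1);

ticks = cell(1,d);
for k = 1:d
    ticks{k} = linspace(MIN(k),MAX(k),ng);    % exactly ng points on axis k
end
G = cell(1,d);
[G{:}] = ndgrid(ticks{:});                    % one d-dimensional array per coordinate
gridpts = zeros(ng^d,d);
for k = 1:d
    gridpts(:,k) = G{k}(:);                   % column k holds coordinate k
end

if exist('akde','file') == 2
    pdf = akde(data,gridpts);                 % density at each grid point
    pdf = reshape(pdf,ng*ones(1,d));          % ndgrid layout: dimension k <-> coordinate k
end
```

The number of grid points grows as ng^d, so 'ng' usually has to shrink as 'd' increases. Note that ndgrid and meshgrid order the first two dimensions differently, so a pdf reshaped this way should be plotted against the ndgrid arrays in `G`.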
Kernel density estimation via diffusion
Z. I. Botev, J. F. Grotowski, and D. P. Kroese (2010)
Annals of Statistics, Volume 38, Number 5, pages 2916-2957.
Zdravko Botev (2020). Kernel Density Estimator for High Dimensions (https://www.mathworks.com/matlabcentral/fileexchange/58312-kernel-density-estimator-for-high-dimensions), MATLAB Central File Exchange. Retrieved .
Could anyone provide any resources explaining this method? There doesn't seem to be any mention about it in the linked paper, nor have I been able to find it elsewhere.
Thank you for providing the code. I am using it to apply Kernel density on maps which have lat/lon coordinates.
I have 2D data (83 rows X 92 columns), which is a map of temperature. I need to produce a map of hotspot areas by considering different numbers of grids.
When I run the 2-dimensional example provided in the code, I get the error: “Output argument "X1" (and maybe others) not assigned during call to "akde".” The example works only for 2-column data, but my data have 83 rows X 92 columns.
Could you please help me adapt the code?
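A possible explanation, based on the input description above: akde treats each row of X as one point and each column as one dimension, so an 83 X 92 matrix is read as 83 points in 92 dimensions, and the default grid (and with it X1, X2) is provided only for 2-dimensional data. One workaround, assuming the goal is a density of hotspot locations, is to convert the map cells above a threshold into an n-by-2 list of (lon, lat) points. In this minimal sketch the map `T`, the vectors `lat` and `lon`, and the threshold `thr` are all hypothetical stand-ins, and akde.m is assumed to be on the path.

```matlab
% Hypothetical stand-in for an 83-by-92 temperature map with coordinate vectors.
T = peaks(92); T = T(1:83,:);                % 83 rows x 92 columns, illustrative values
lat = linspace(10,20,83);                    % one latitude per row (assumed)
lon = linspace(30,45,92);                    % one longitude per column (assumed)

% Treat cells above a threshold as hotspot points (the threshold is an assumption).
thr = mean(T(:)) + std(T(:));
[r,c] = find(T > thr);                       % row/column indices of hot cells
X = [reshape(lon(c),[],1), reshape(lat(r),[],1)];  % n-by-2: one (lon,lat) point per row

if exist('akde','file') == 2
    [pdf,X1,X2] = akde(X);                   % d == 2, so the default grid is available
    pdf = reshape(pdf,size(X1));             % assumed: X1 has meshgrid layout
    contour(X1,X2,pdf,20)
end
```

This estimates where hot cells concentrate, not the temperature field itself. If the temperature values should influence the result, a different approach, such as smoothing the map directly, may be more appropriate.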