File Exchange

image thumbnail

Locally Adaptive Kernel Density Estimation

version 1.3 (788 KB) by

Kernel density estimation with bandwidths locally adapted to data.



View License

First thing to do:
Run a tutorial code, tutorial.m


see also sskernel for optimization of a fixed kernel bandwidth and sshist for histogram optimization.

% [y,t,optw,gs,C,confb95,yb] = ssvkernel(x,t,W)
% Function `ssvkernel' returns an optimized kernel density estimate
% using a Gauss kernel function with bandwidths locally adapted to the data.
% Examples:
% >> x = 0.5-0.5*log(rand(1,1e3)); t = linspace(0,3,500);
% >> [y,t,optw] = ssvkernel(x,t);
% This example produces a vector of kernel density estimates, y, at points
% specified in a vector t, using locally adaptive bandwidths, optw
% (a standard deviation of a normal density function).
% >> ssvkernel(x);
% By calling the function without output arguments, the estimated density
% is displayed.
% Input arguments:
% x: Sample data vector.
% tin (optinal):
% Points at which estimation are computed.
% W (optinal):
% A vector of kernel bandwidths.
% If W is provided, the optimal bandwidth is selected from the
% elements of W.
% * Do not search bandwidths smaller than a sampling resolution of data.
% If W is not provided, the program searches the optimal bandwidth
% using a golden section search method.
% Output arguments:
% y: Estimated density
% t: Points at which estimation was computed.
% The same as tin if tin is provided.
% (If the sampling resolution of tin is smaller than the sampling
% resolution of the data, x, the estimation was done at smaller
% number of points than t. The results, t and y, are obtained by
% interpolating the low resolution sampling points.)
% optw: Optimal kernel bandwidth.
% gs: Stiffness constants of the variable bandwidth examined.
% The stifness constant is defined as a ratio of the optimal fixed
% bandwidth to a length of a local interval in which a fixed-kernel
% bandwidth optimization was performed.
% C: Cost functions of stiffness constants.
% conf95:
% Bootstrap confidence intervals.
% yb: Booststrap samples.
% Usage:
% >> [y,t,optw] = ssvkernel(x);
% When t is not given in the input arguments, i.e., the output argument t
% is generated automatically.
% >> W = linspace(0.01,1,20);
% >> [y,t,optw] = ssvkernel(x,t,W);
% The optimal bandwidth is selected from the elements of W.
% >> [y,t,optw,confb95,yb] = ssvkernel(x);
% This additionally computes 95% bootstrap confidence intervals, confb95.
% The bootstrap samples are provided as yb.
% Optimization principle:
% The optimization is based on a principle of minimizing
% expected L2 loss function between the kernel estimate and an unknown
% underlying density function. An assumption is merely that samples
% are drawn from the density independently each other.
% The locally adaptive bandwidth is obtained by iteratively computing
% optimal fixed-size bandwidths wihtihn local intervals. The optimal
% bandwidths are selected such that they are selected in the intervals
% that are \gamma times larger than the optimal bandwidths themselves.
% The paramter \gamma was optimized by minimizing the L2 risk estimate.
% The method is described in
% Hideaki Shimazaki and Shigeru Shinomoto
% Kernel Bandwidth Optimization in Spike Rate Estimation
% Journal of Computational Neuroscience 2010
% * Instead of a Gaussian window described in the paper, a boxcar window is
% used in this program.
% For more information, please visit
% Hideaki Shimazaki

Comments and Ratings (2)

Thomas John

Dear Mr. Shimazaki

really a great contribution. Here I have a small script which described my problem an a (partial) solution.

% script to estimate the probability function(pdf) in case of periodic boundary conditions
% it uses the great "Locally Adaptive Kernel Density Estimation" from Hideaki Shimazaki
% Problem statement: we want to estimate the pdf of random directions
% direction phi=[0..2pi] e.g. uniform -> phi = 2*pi*rand(n,1)
% the option 'support' in ksdensity of the matlab statistics toolbox uses a
% log transformation: ty = log(yData-L) - log(U-yData);
% L = lower bound U = upper bound
% this nonlinear transformation under/over-estimates the values closed to the boundries :-(
% idea: extend the dataset with same data in both directions, estimate
% the pdf and use only the original support
clear variables; close all;
nrand = 500; npoint = 100;
L = 0; U = 2*pi;
phi = L + (U-L).*rand(nrand,1);
% classical histogram
b = histc(phi,xi);
h = findobj(gca,'Type','patch');
set(h,'FaceColor',.7*[1 1 1],'EdgeColor',0.8*[1 1 1]);
% classical kernel estimate with finite support
[pdf,xi] = ksdensity(phi,xi,'support',[L U]);
% adaptive kernel estimate without periodic support
[pdfadapt,xiadapt] = ssvkernel(phi,xi);
hold on;plot(xi,pdf,'-b',xiadapt,pdfadapt,'-r','LineWidth',2);
% extended dataset
phiext=[phi-(U-L) phi phi+(U-L)];
% extended x-points too
xiext=[ xi-(U-L) xi xi+(U-L)] ;
% but remove double entries from begin and end
% again adaptive kernel estimate
[pdfextadapt,xiextadapt] = ssvkernel(phiext,xiext);
% normalisation to integral(pdf)[L..U] = 1;
% also back to original support
plot(xiext,pdfextadapt,'-m','LineWidth',3);xlim([L U]);
title({'pdf of uniform distributed, random numbers';...
       'with ~periodic~ boundaries'})
legend('histogram',['ksdensity with support [', num2str(L), '...', ...
       num2str(U), ']'], 'adaptive estimation',...
       'adaptive + periodic assumption');
fprintf(' The Integral is %f\n',trapz(pdfadapt)*(U-L)/npoint);

% comment: this approach is straight forward and gives already good results
% but it is not really a periodic boundary solution, to do this, a deeper change
% in the kernel estimation calculation with the mod function is required
% may be later, Thomas



Lee (view profile)



Bug fix: fixed a problem for large values


fixed issues on compatibility to older versions of matlab. performance optimization.


correcting url in Description

MATLAB Release
MATLAB 7.8 (R2009a)

Download apps, toolboxes, and other File Exchange content using Add-On Explorer in MATLAB.

» Watch video