Kernel density estimation with bandwidths locally adapted to data.
First thing to do:
Run a tutorial code, tutorial.m
Documentation:
http://2000.jukuin.keio.ac.jp/shimazaki/res/kernel.html
see also sskernel for optimization of a fixed kernel bandwidth and sshist for histogram optimization.
% [y,t,optw,gs,C,confb95,yb] = ssvkernel(x,t,W)
%
% Function `ssvkernel' returns an optimized kernel density estimate
% using a Gauss kernel function with bandwidths locally adapted to the data.
%
% Examples:
% >> x = 0.50.5*log(rand(1,1e3)); t = linspace(0,3,500);
% >> [y,t,optw] = ssvkernel(x,t);
% This example produces a vector of kernel density estimates, y, at points
% specified in a vector t, using locally adaptive bandwidths, optw
% (a standard deviation of a normal density function).
%
% >> ssvkernel(x);
% By calling the function without output arguments, the estimated density
% is displayed.
%
% Input arguments:
% x: Sample data vector.
% tin (optinal):
% Points at which estimation are computed.
% W (optinal):
% A vector of kernel bandwidths.
% If W is provided, the optimal bandwidth is selected from the
% elements of W.
% * Do not search bandwidths smaller than a sampling resolution of data.
% If W is not provided, the program searches the optimal bandwidth
% using a golden section search method.
%
% Output arguments:
% y: Estimated density
% t: Points at which estimation was computed.
% The same as tin if tin is provided.
% (If the sampling resolution of tin is smaller than the sampling
% resolution of the data, x, the estimation was done at smaller
% number of points than t. The results, t and y, are obtained by
% interpolating the low resolution sampling points.)
% optw: Optimal kernel bandwidth.
% gs: Stiffness constants of the variable bandwidth examined.
% The stifness constant is defined as a ratio of the optimal fixed
% bandwidth to a length of a local interval in which a fixedkernel
% bandwidth optimization was performed.
% C: Cost functions of stiffness constants.
% conf95:
% Bootstrap confidence intervals.
% yb: Booststrap samples.
%
%
% Usage:
% >> [y,t,optw] = ssvkernel(x);
% When t is not given in the input arguments, i.e., the output argument t
% is generated automatically.
%
% >> W = linspace(0.01,1,20);
% >> [y,t,optw] = ssvkernel(x,t,W);
% The optimal bandwidth is selected from the elements of W.
%
% >> [y,t,optw,confb95,yb] = ssvkernel(x);
% This additionally computes 95% bootstrap confidence intervals, confb95.
% The bootstrap samples are provided as yb.
%
%
% Optimization principle:
% The optimization is based on a principle of minimizing
% expected L2 loss function between the kernel estimate and an unknown
% underlying density function. An assumption is merely that samples
% are drawn from the density independently each other.
%
% The locally adaptive bandwidth is obtained by iteratively computing
% optimal fixedsize bandwidths wihtihn local intervals. The optimal
% bandwidths are selected such that they are selected in the intervals
% that are \gamma times larger than the optimal bandwidths themselves.
% The paramter \gamma was optimized by minimizing the L2 risk estimate.
%
% The method is described in
% Hideaki Shimazaki and Shigeru Shinomoto
% Kernel Bandwidth Optimization in Spike Rate Estimation
% Journal of Computational Neuroscience 2010
% http://dx.doi.org/10.1007/s1082700901804
%
% * Instead of a Gaussian window described in the paper, a boxcar window is
% used in this program.
%
% For more information, please visit
% http://2000.jukuin.keio.ac.jp/shimazaki/res/kernel.html
%
% See also SSKERNEL, SSHIST
%
%
% Hideaki Shimazaki
% http://2000.jukuin.keio.ac.jp/shimazaki
1.3  Bug fix: fixed a problem for large values 

1.2  fixed issues on compatibility to older versions of matlab. performance optimization. 

1.1  correcting url in Description 
Thomas John (view profile)
Dear Mr. Shimazaki
really a great contribution. Here I have a small script which described my problem an a (partial) solution.
% script to estimate the probability function(pdf) in case of periodic boundary conditions
% it uses the great "Locally Adaptive Kernel Density Estimation" from Hideaki Shimazaki
%
% Problem statement: we want to estimate the pdf of random directions
% direction phi=[0..2pi] e.g. uniform > phi = 2*pi*rand(n,1)
% the option 'support' in ksdensity of the matlab statistics toolbox uses a
% log transformation: ty = log(yDataL)  log(UyData);
% L = lower bound U = upper bound
% this nonlinear transformation under/overestimates the values closed to the boundries :(
%
% idea: extend the dataset with same data in both directions, estimate
% the pdf and use only the original support
clear variables; close all;
nrand = 500; npoint = 100;
L = 0; U = 2*pi;
phi = L + (UL).*rand(nrand,1);
xi=linspace(L,U,npoint);
% classical histogram
b = histc(phi,xi);
bar(xi,b/sum(b)/(xi(2)xi(1)),1);
h = findobj(gca,'Type','patch');
set(h,'FaceColor',.7*[1 1 1],'EdgeColor',0.8*[1 1 1]);
%
% classical kernel estimate with finite support
[pdf,xi] = ksdensity(phi,xi,'support',[L U]);
% adaptive kernel estimate without periodic support
[pdfadapt,xiadapt] = ssvkernel(phi,xi);
hold on;plot(xi,pdf,'b',xiadapt,pdfadapt,'r','LineWidth',2);
% extended dataset
phiext=[phi(UL) phi phi+(UL)];
% extended xpoints too
xiext=[ xi(UL) xi xi+(UL)] ;
% but remove double entries from begin and end
xiext=unique(xiext);
% again adaptive kernel estimate
[pdfextadapt,xiextadapt] = ssvkernel(phiext,xiext);
% normalisation to integral(pdf)[L..U] = 1;
pdfextadapt=3*pdfextadapt(npoint:2*npoint);
% also back to original support
xiext=xiext(npoint:2*npoint);
%
plot(xiext,pdfextadapt,'m','LineWidth',3);xlim([L U]);
title({'pdf of uniform distributed, random numbers';...
'with ~periodic~ boundaries'})
legend('histogram',['ksdensity with support [', num2str(L), '...', ...
num2str(U), ']'], 'adaptive estimation',...
'adaptive + periodic assumption');
fprintf(' The Integral is %f\n',trapz(pdfadapt)*(UL)/npoint);
% comment: this approach is straight forward and gives already good results
% but it is not really a periodic boundary solution, to do this, a deeper change
% in the kernel estimation calculation with the mod function is required
% may be later, Thomas
Lee (view profile)