The code implements an approximation of the multivariate bandwidth calculation from [1]. In contrast to other multivariate bandwidth estimators, the bandwidth can be estimated from a preclustered sample distribution, which offers a simple way of estimating compact and accurate KDEs with variable kernels.
The package provides the C source code of the computation engine and a routine that compiles it automatically from Matlab.
The code includes three demos:
1. Multivariate KDE: demoBW_Estimation.m (it also compiles the C code)
2. 1D KDE: demoBW_Estimation1D.m
3. Multivariate KDE with preclustering: demoBW_with_preclustering
Reasons to use the bandwidth estimator from [1]:
* Reasonably fast computation
* Handles multivariate bandwidths
* Can use weighted data
* Generally produces good estimates of the bandwidths
* Can be calculated from a Gaussian mixture model, not only directly from the samples
* Avoids numerical evaluations and iterative computation -- the bandwidth is analytically computed (even from a GMM) under some approximations.
Some advice:
If you're trying to estimate the KDE from "really" large datasets, then I suggest one of two things: (i) precluster the data first and then apply the bandwidth estimator from [1], or (ii) use the online KDE (oKDE), which learns the model from one data point at a time -- the Matlab code for the oKDE is available from the author's homepage (http://www.vicos.si/People/Matejk).
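As a rough illustration of what "preclustering" can look like, the C sketch below compresses a large 1D sample into a handful of weighted Gaussian components (weight, mean, variance), i.e. a small Gaussian mixture. It uses plain equal-width binning as a stand-in for a real clustering method; this is an assumption made for brevity and is not the procedure used in demoBW_with_preclustering or in [1]. The names Component and precluster_1d are hypothetical, and the input is assumed to contain only finite values.

```c
#include <stddef.h>

/* One mixture component summarising all samples that fell into one bin. */
typedef struct {
    double weight; /* fraction of samples in the bin       */
    double mean;   /* sample mean of the bin               */
    double var;    /* (biased) sample variance of the bin  */
} Component;

/* Illustrative only (hypothetical helper, not part of the package).
 * Compresses n samples into at most k components using k equal-width
 * bins over [lo, hi]; samples outside the range go to the edge bins.
 * 'out' must have room for k components. Returns the number of
 * non-empty components written to the front of 'out'. */
size_t precluster_1d(const double *x, size_t n, double lo, double hi,
                     size_t k, Component *out)
{
    if (n == 0 || k == 0 || !(hi > lo)) return 0;

    /* Use 'out' as scratch space: count, sum, sum of squares per bin. */
    for (size_t b = 0; b < k; ++b) out[b] = (Component){0.0, 0.0, 0.0};

    const double width = (hi - lo) / (double)k;
    for (size_t i = 0; i < n; ++i) {
        const double t = (x[i] - lo) / width;
        const size_t b = (t <= 0.0) ? 0
                       : (t >= (double)k) ? k - 1
                       : (size_t)t;
        out[b].weight += 1.0;
        out[b].mean   += x[i];
        out[b].var    += x[i] * x[i];
    }

    /* Convert the sums to moments and drop empty bins. */
    size_t m = 0;
    for (size_t b = 0; b < k; ++b) {
        const double cnt = out[b].weight;
        if (cnt == 0.0) continue;
        const double mu = out[b].mean / cnt;
        double var = out[b].var / cnt - mu * mu; /* simple, not the most stable formula */
        if (var < 0.0) var = 0.0;                /* guard against round-off */
        out[m++] = (Component){cnt / (double)n, mu, var};
    }
    return m;
}
```

The resulting components play the role of the Gaussian mixture model from which, according to the list above, the bandwidth can be calculated instead of from the raw samples.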
[1] M. Kristan, A. Leonardis, D. Skočaj, "Multivariate online Kernel Density Estimation with Gaussian Kernels", Pattern Recognition, 2011.