A very fast multivariate bandwidth calculation for KDE that can even be calculated from a GMM.
The code implements an approximation of the multivariate bandwidth calculation from [1]. In contrast to other multivariate bandwidth estimators, it can be estimated from a preclustered sample distribution offering a simple way of estimating compact and accurate KDEs with variable kernels.
The code provides a C source code for the engine of calculation and a routine to compile it automatically in Matlab.
The code includes three demos:
1. Multivariate KDE: demoBW_Estimation.m (it also compiles your code)
2. 1D KDE: demoBW_Estimation1D.m
3. Multivariate KDE with preclustering: demoBW_with_preclustering
Reasons to use the bandwidth estimator from [1]:
* Reasonably fast computation
* Handles multivariate bandwidths
* Can use weighted data
* Generally produces good estimates of the bandwidths
* Can be calculated from a Gaussian mixture model, not only directly from the samples
* Avoids numerical evaluations and iterative computation  the bandwidth is analytically computed (even from a GMM) under some approximations.
Some advice:
If you're trying to estimate the KDE from "really" large datasets, then I suggest one of two things: (i) precluster the data first and apply [1]. (2) Use the online KDE, which learns the model by one data at a time  the Matlab code for the oKDE is available from the author's homepage (http://www.vicos.si/People/Matejk).
[1] M. Kristan, A. Leonardis, D. Skočaj, "Multivariate online Kernel Density Estimation with Gaussian Kernels", Pattern Recognition, 2011.
1.7  Corrected plotting of a 1d distribution and some includes for properly installing the path to plotting tools. 

1.5  added link to homepage 

1.4  updated the "Description" field. 

1.3  Some correction of the code for moment matching "momentMatchPdf.m" and additional example of very fast 1D KDE estimation "demoBW_Estimation1D.m". 
Matej Kristan (view profile)
Dear Matthew,
It is important to correctly specify what you mean with "data stored in separate vectors". I understand that you would like to estimate the joint distribution in 3D space over the three variables. Obviously you have data stored in three separate vectors. But is the "i"th value in each of the vectors obtained at the same measuring instance? If that is the case, just concatenate the vectors in 3x1000 matrix and give it to the KDE (make sure you transpose it correctly) and you're done  you'll get the joint pdf. But if the three measurement vectors are independent, then you can estimate only the marginals. I.e., a 1D pdf over each individual measurement vector. If you can assume that the measurements are indeed independent, then the joint pdf will be just the product of the marginals. Given your large number of points, you can perform a preclustering (the code allows it if I remember correctly). Alternative is to use the oKDE  an online kernel density estimator that can learn the distribution from thousands of datapoints and will produce a model with low complexity. The oKDE is available from my homepage.
Matthew (view profile)
Dear Matej,
Thank you for uploading this for the community. I have a question and I was hoping you could help me a little. I want to use KDE to obtain the joint probability distribution function of three continuous random variables, my data is stored in vectors of 1000 elements each, is it possible to estimate the joint PDF of them using your program? Maybe I'm trying to use it incorrectly.
Thank you!
Matej Kristan (view profile)
Dear Douglas,
Since you have data x separated from y, you can only estimate the marginals, which means that you assume independent pdfs in x and y. I.e., a 1D pdf on x, p(x), and a 1D pdf on y, p(y). So evaluation of the joint probability p(x,y) under the independence assumption that you implicitly make becomes p(x,y)=p(x)p(y). This means that if you want to compute the probabiltiy of (x1,y1), you will get this by p(x1)*p(y1).
Douglas (view profile)
Matej Kristan, tkx for the code. Please, check if you can help me. Imagine that x and y are vectors and each one has 100 elements. So, I want to estimate the joint pdf of x and y, that is, pdf=dist(x,y). But, I want with this pdf the probability density of combinations of (x,y) that are not in the x and y used to estimate the distribution. Is it possibile with you code? I think this can be done using the function "visualizePdf2d2" and adjusting the "gran" tag. Am I correct? Thank you.
Matej Kristan (view profile)
Dear Khoa,
If I remember correctly, the SVD is used just to estimate the "rotation" of the covariance matrix (prewhitening), so any method that allows you to do that should be applicable, I suppose.
Khoa Tran (view profile)
Dear Matej,
Thank you so much for your work!
Is it safe to use the cholesky decomposition instead of SVD in your estimateBandwidth.m?
Best regards,
Khoa
Matej Kristan (view profile)
@Aparna: The bruteforce way is to discretize the pdf and find the maximum. The right way is to search by gradient ascent on the Gaussian Mixture Model. You can use the Mean Shift with the Gaussian kernel to do that. In fact, you can use my Matlab implementation that you can find as a part of the oKDE toolbox at my homepage (http://www.vicos.si/Research/Multivariate_Online_Kernel_Density_Estimation).
Aparna (view profile)
How can I find the max of the estimated pdf?
Matej Kristan (view profile)
Taylor. There is a demo function called "demoBW_with_preclustering.m" there you will find an example of how to contruct the pdf as well as how to visualize it. In short the line "kde = constructKDE(pdf.Mu, pdf.w, H, pdf.Cov) ; " is the line you're looking for in that function.
J Taylor (view profile)
What function returns the actual pdf? Obviously estimateBandwidth returns the bandwidths, but how do I obtain the estimated density function over some range for each dimension?
Matej Kristan (view profile)
Vinicius, the example is in the "demoBW_with_preclustering.m". On line 21 or so, there is an example of preclustering by calling "pdf = precluster( data ) ;".
Vinicius Mourao Alves de Souz (view profile)
I don't understand how I can input a preclustered data and obtain KDEs from these clusters. Can you give me an example? Thanks