View License

Download apps, toolboxes, and other File Exchange content using Add-On Explorer in MATLAB.

» Watch video

Highlights from
Fast Kernel Density Estimator (Multivariate)

Be the first to rate this file! 34 Downloads (last 30 days) File Size: 32.3 KB File ID: #41187 Version: 1.7
image thumbnail

Fast Kernel Density Estimator (Multivariate)



09 Apr 2013 (Updated )

A very fast multivariate bandwidth calculation for KDE that can even be calculated from a GMM.

| Watch this File

File Information

The code implements an approximation of the multivariate bandwidth calculation from [1]. In contrast to other multivariate bandwidth estimators, it can be estimated from a preclustered sample distribution offering a simple way of estimating compact and accurate KDEs with variable kernels.

The code provides a C source code for the engine of calculation and a routine to compile it automatically in Matlab.

The code includes three demos:
1. Multivariate KDE: demoBW_Estimation.m (it also compiles your code)
2. 1D KDE: demoBW_Estimation1D.m
3. Multivariate KDE with preclustering: demoBW_with_preclustering

Reasons to use the bandwidth estimator from [1]:
* Reasonably fast computation
* Handles multivariate bandwidths
* Can use weighted data
* Generally produces good estimates of the bandwidths
* Can be calculated from a Gaussian mixture model, not only directly from the samples
* Avoids numerical evaluations and iterative computation -- the bandwidth is analytically computed (even from a GMM) under some approximations.

Some advice:
If you're trying to estimate the KDE from "really" large datasets, then I suggest one of two things: (i) precluster the data first and apply [1]. (2) Use the online KDE, which learns the model by one data at a time -- the Matlab code for the oKDE is available from the author's homepage (

[1] M. Kristan, A. Leonardis, D. Skočaj, "Multivariate online Kernel Density Estimation with Gaussian Kernels", Pattern Recognition, 2011.

MATLAB release MATLAB 7.12 (R2011a)
Tags for This File   Please login to tag files.
Please login to add a comment or rating.
Comments and Ratings (12)
16 Dec 2015 Matej Kristan

Dear Matthew,

It is important to correctly specify what you mean with "data stored in separate vectors". I understand that you would like to estimate the joint distribution in 3D space over the three variables. Obviously you have data stored in three separate vectors. But is the "i"-th value in each of the vectors obtained at the same measuring instance? If that is the case, just concatenate the vectors in 3x1000 matrix and give it to the KDE (make sure you transpose it correctly) and you're done -- you'll get the joint pdf. But if the three measurement vectors are independent, then you can estimate only the marginals. I.e., a 1D pdf over each individual measurement vector. If you can assume that the measurements are indeed independent, then the joint pdf will be just the product of the marginals. Given your large number of points, you can perform a preclustering (the code allows it if I remember correctly). Alternative is to use the oKDE -- an online kernel density estimator that can learn the distribution from thousands of datapoints and will produce a model with low complexity. The oKDE is available from my homepage.

Comment only
15 Dec 2015 Matthew

Dear Matej,

Thank you for uploading this for the community. I have a question and I was hoping you could help me a little. I want to use KDE to obtain the joint probability distribution function of three continuous random variables, my data is stored in vectors of 1000 elements each, is it possible to estimate the joint PDF of them using your program? Maybe I'm trying to use it incorrectly.

Thank you!

Comment only
23 Nov 2015 Matej Kristan

Dear Douglas,

Since you have data x separated from y, you can only estimate the marginals, which means that you assume independent pdfs in x and y. I.e., a 1D pdf on x, p(x), and a 1D pdf on y, p(y). So evaluation of the joint probability p(x,y) under the independence assumption that you implicitly make becomes p(x,y)=p(x)p(y). This means that if you want to compute the probabiltiy of (x1,y1), you will get this by p(x1)*p(y1).

Comment only
23 Nov 2015 Douglas

Matej Kristan, tkx for the code. Please, check if you can help me. Imagine that x and y are vectors and each one has 100 elements. So, I want to estimate the joint pdf of x and y, that is, pdf=dist(x,y). But, I want with this pdf the probability density of combinations of (x,y) that are not in the x and y used to estimate the distribution. Is it possibile with you code? I think this can be done using the function "visualizePdf2d2" and adjusting the "gran" tag. Am I correct? Thank you.

Comment only
12 Aug 2015 Matej Kristan

Dear Khoa,

If I remember correctly, the SVD is used just to estimate the "rotation" of the covariance matrix (prewhitening), so any method that allows you to do that should be applicable, I suppose.

Comment only
12 Aug 2015 Khoa Tran

Dear Matej,

Thank you so much for your work!

Is it safe to use the cholesky decomposition instead of SVD in your estimateBandwidth.m?

Best regards,

Comment only
20 Jan 2014 Matej Kristan

@Aparna: The brute-force way is to discretize the pdf and find the maximum. The right way is to search by gradient ascent on the Gaussian Mixture Model. You can use the Mean Shift with the Gaussian kernel to do that. In fact, you can use my Matlab implementation that you can find as a part of the oKDE toolbox at my homepage (

Comment only
19 Jan 2014 Aparna

Aparna (view profile)

How can I find the max of the estimated pdf?

Comment only
25 Oct 2013 Matej Kristan

Taylor. There is a demo function called "demoBW_with_preclustering.m" there you will find an example of how to contruct the pdf as well as how to visualize it. In short the line "kde = constructKDE(pdf.Mu, pdf.w, H, pdf.Cov) ; " is the line you're looking for in that function.

Comment only
24 Oct 2013 J Taylor

What function returns the actual pdf? Obviously estimateBandwidth returns the bandwidths, but how do I obtain the estimated density function over some range for each dimension?

Comment only
27 Sep 2013 Matej Kristan

Vinicius, the example is in the "demoBW_with_preclustering.m". On line 21 or so, there is an example of preclustering by calling "pdf = precluster( data ) ;".

Comment only
27 Sep 2013 Vinicius Mourao Alves de Souz

I don't understand how I can input a pre-clustered data and obtain KDEs from these clusters. Can you give me an example? Thanks

Comment only
10 Apr 2013 1.3

Some correction of the code for moment matching "momentMatchPdf.m" and additional example of very fast 1D KDE estimation "demoBW_Estimation1D.m".

11 Apr 2013 1.4

updated the "Description" field.

16 Apr 2013 1.5

added link to homepage

08 May 2013 1.7

Corrected plotting of a 1d distribution and some includes for properly installing the path to plotting tools.

Contact us