Flexible mixture models for automatic clustering

version 0.40 (61.2 KB) by Statovic
This is a MATLAB implementation of clustering (i.e., finite mixture models, unsupervised classification).

Updated 01 Mar 2021

SNOB is a MATLAB implementation of finite mixture models. SNOB uses the minimum message length (MML) criterion to estimate the structure of the mixture model (i.e., the number of sub-populations and which sample belongs to which sub-population) and to estimate all mixture model parameters. The user may specify the desired number of sub-populations; if this is not specified, SNOB will attempt to discover it automatically. Currently, SNOB supports mixtures of the following distributions:

-Beta distribution
-Exponential distribution
-Univariate gamma distribution
-Geometric distribution
-Inverse Gaussian distribution
-Univariate Laplace distribution
-Gaussian linear regression
-Logistic regression
-Multinomial distribution
-Multivariate Gaussian distribution
-Negative binomial distribution
-Univariate normal distribution
-Poisson distribution
-Multivariate normal distribution (single factor analysis)
-von Mises-Fisher distribution
-Weibull distribution

The program is easy to use and supports missing data; all missing values should be coded as NaN. Examples of how to use the program are provided in data/mm_example?.m, and a minimal usage sketch follows below.
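
As a quick illustration, here is a minimal usage sketch. The exact calling syntax of snob and the argument list of mm_PlotModel1d are assumptions based on the examples and the comments below, so the bundled data/mm_example?.m scripts remain the authoritative reference:

x = randn(200, 1);            % toy data; any missing values would be coded as NaN
mm = snob(x, {'norm', 1});    % assumed syntax: univariate normal model for column 1, number of classes chosen automatically
mm_PlotModel1d(mm, x, 1);     % plot the fitted 1D mixture density (argument list assumed)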

UPDATE VERSION 0.4.0 (01/03/2021):
Latest updates:
-added two new distributions (von Mises-Fisher and beta; improved numerical accuracy for high-dimensional VMF mixtures will come in a later update)
-added two examples of using SNOB
-improved numerical accuracy overall
-improved documentation

Cite As

Wallace, C. S. & Dowe, D. L. MML clustering of multi-state, Poisson, von Mises circular and Gaussian distributions. Statistics and Computing, 2000, 10, pp. 73-83

Wallace, C. S. Intrinsic Classification of Spatially Correlated Data. The Computer Journal, 1998, 41, pp. 602-611

Wallace, C. S. Statistical and Inductive Inference by Minimum Message Length. Springer, 2005

Schmidt, D. F. & Makalic, E. Minimum Message Length Inference and Mixture Modelling of Inverse Gaussian Distributions. AI 2012: Advances in Artificial Intelligence, Springer Berlin Heidelberg, 2012, 7691, pp. 672-682

Edwards, R. T. & Dowe, D. L. Single factor analysis in MML mixture modelling. Research and Development in Knowledge Discovery and Data Mining, Second Pacific-Asia Conference (PAKDD-98), 1998, 1394

Comments and Ratings (13)

Statovic

Hi Matthew, You can contact me via email at "emakalic@gmail.com". Happy to talk more about your research project. All the best, Enes

Matthew Moore

Hi Enes, thanks for your response; unfortunately, with the modifications you have suggested, the problem still persists. I am trying to fit a range of mixtures to data and compare each fit for various values of K components for a research paper. Is there a better way to contact you to discuss my issue, or perhaps we could have a quick chat with my research supervisor and me? Thanks

Statovic

Hi Matthew,
There are two reasons why some classes can collapse even if 'fixedstructure' is set to true: (1) the initial assignment may not be ideal, and (2) the stochastic re-assignment step during the search process. As for the second reason, we can disable re-assignment of samples to classes by adding the following code to "mm_Search.m" at line 34 (right after the call to mm_Reassign):

if (mm.opts.fixedstructure && (mm_r.nClasses < mm.nClasses))
    msglen_r = inf;   % make sure we don't select this model
end

This ensures that if a re-assignment results in a structure change, we reject the resulting model. The first issue is a little trickier and will require a more substantial change - we will look into this and get back to you.

I hope this helps.
Enes

Matthew Moore

Hi, is there any workaround so I can specify a certain number of classes without receiving the "removing classes due to insufficient membership" message? I am trying to explore the fits for various mixture distributions ranging from 1-4 components, but for some of my data I am limited in the number of components I can fit. Are there any modifications to the code I can make to free me of these limitations and fit as many components to the data as I want (even if the data is overfitted)? Thanks

Statovic

Hi Matthew,
We will add a nicer function for plotting the CDF of a 1D mixture model in the next update. In the meantime, the following should work. Assuming you have a mixture model called "mm" and the first column of your data is modelled by a univariate Gaussian, the following should plot the CDF:

mm_example2;                        % fit an example mixture model (creates mm)
clf;
N = 1e5;                            % number of samples to simulate from the mixture
wClass = mnrnd(1, mm.a, N);         % sample class memberships from the mixing proportions
x = zeros(N, 1);
for i = 1:N
    K = find(wClass(i,:));                      % class of the i-th simulated sample
    theta = mm.class{K}.model{1}.theta;         % [mean; variance] of attribute 1 in class K
    x(i) = normrnd(theta(1), sqrt(theta(2)));   % draw from that class's Gaussian
end
cdfplot(x)                          % empirical CDF of the simulated mixture
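
For a smooth curve without simulation noise, the exact mixture CDF can also be evaluated directly as a weighted sum of the component Gaussian CDFs. This is only a sketch, assuming (as in the snippet above) that mm.a holds the mixing proportions and each class stores theta = [mean; variance] for the first attribute:

t = linspace(min(x), max(x), 500);                 % evaluation grid over the data range
F = zeros(size(t));
for k = 1:mm.nClasses
    theta = mm.class{k}.model{1}.theta;            % [mean; variance] of attribute 1 in class k
    F = F + mm.a(k) * normcdf(t, theta(1), sqrt(theta(2)));
end
plot(t, F); xlabel('x'); ylabel('F(x)');           % exact CDF of the fitted mixture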

I hope this helps.
Cheers,
Enes

Statovic

Hi Matthew,
Unfortunately there are no functions to plot a CDF at the moment. There is, however, a function "mm_PlotModel1d(.)" that plots the PDF, if that is of interest.
Cheers,
Enes

Matthew Moore

Hi, is there any method or function to plot the CDF of a 1D Gaussian mixture model? I've used snob with 'norm' to calculate the component properties, but I am unable to plot a CDF. Thanks

Statovic

Hi Maryam,
Sorry about the late reply. The package uses the Optimization Toolbox for clustering with some distributions (e.g., the Weibull distribution). If you do not have the Optimization Toolbox, you might still be able to use the package with the other distributions (e.g., Gaussian and multinomial). You could do this by, for example, setting the variable that triggers the error to an empty array. I hope this helps.
Cheers,
Enes

Maryam Ghahramani

Hi,

I keep getting the following error, even while running the examples:

Error using optimoptions (line 124)
Empty keys are not allowed in this container.

Statovic

Thanks Aishwarya.
By default, the program will try to determine a suitable number of classes automatically from the data. The option ['k', <int>] allows the user to specify the starting model structure for the search. If you wish to find the best 3-class model, for example, and do not want the number of classes to be searched automatically, use the option ['fixedstructure', true] together with ['k', 3], as in the sketch below.

Lorenzo Puppo

Bryce Grier

Aishwarya Venkatesh

Hello.
Thanks a lot for the great work!! I am using this script for my application, which involves solving the optimization problem for a marginal distribution. It seems to be working well.

I just have a small doubt: in one of the SNOB examples, there is a demo that uses '5' and '3' as the name-value pair to determine the optimal clusters. How was this determined, and is it possible to change the name-value pair based on the dataset, or should it always be '5' and '3'? I suppose, since it is not hard-coded, one can change the name-value pair, but what is the criterion for doing so?

MATLAB Release Compatibility
Created with R2019a
Compatible with any release
Platform Compatibility
Windows macOS Linux
