A Matlab toolbox for investigating the application of cluster ensembles to data classification.



Co-authors: Vincent De Sapio and Philip Kegelmeyer

This is a Matlab toolbox for investigating the application of cluster ensembles to data classification, with the objective of improving the accuracy and/or speed of clustering. The toolbox divides the cluster ensemble problem into four areas and provides functionality for each: (1) synthetic data generation, (2) clustering to generate individual data partitions and similarity matrices, (3) consensus function generation and final clustering to generate the ensemble data partitioning, and (4) implementation of accuracy metrics.

For data generation, Gaussian data of arbitrary dimension can be generated. The kcenters algorithm can then be used to generate individual data partitions by either (a) subsampling the data and clustering each subsample, or (b) randomly initializing the algorithm and generating a clustering for each initialization. In either case, an overall similarity matrix can be computed using a consensus function operating on the individual similarity matrices. A final clustering can then be performed, and performance metrics are provided for evaluation.
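The consensus step can be illustrated with a minimal sketch. It is written in plain Python rather than Matlab and is not the toolbox's code. It assumes the consensus function is a simple average of the individual similarity matrices, and that the final clustering groups points whose averaged similarity exceeds a threshold (connected components). Both are assumptions made for illustration, and the names `co_association` and `consensus_labels` are hypothetical.

```python
def co_association(partitions):
    """Average the per-partition similarity matrices (assumed consensus function).

    Each partition is a list of cluster labels, one per data point. Its
    similarity matrix holds 1 where two points share a label and 0 elsewhere.
    """
    n = len(partitions[0])
    sim = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    sim[i][j] += 1.0
    m = len(partitions)
    return [[v / m for v in row] for row in sim]


def consensus_labels(sim, threshold=0.5):
    """Final clustering (assumed): connected components of the thresholded matrix."""
    n = len(sim)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue  # point already belongs to a component
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and sim[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Three individual partitions of six points. The second one uses different
# label names and places point 2 in the "wrong" cluster.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
]
sim = co_association(partitions)
final = consensus_labels(sim)
```

Because the similarity matrix depends only on whether two points share a label, the arbitrary label names used by each individual partition do not matter. In this toy input, the disagreement in the second partition is outvoted by the other two.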

Ram

Where can I find the affinprop.m file? Thanks in advance.

Sen Xu

Could you please send the kcenters.m and affinprop.m to


Edzel

The file demo.m also calls a function, affinprop.m (lines 289 and 339), which is not available.

The kcenters.m file was not included in this submission since it was not written by me. The "Other requirements" section indicates where it may be found (Dueck, Frey:

Lee Newman

The posted file seems to be missing kcenters.m, which is called on line 47 of demo.m.



An update was made to indicate the contribution of an additional author.

