Code covered by the BSD License  

Efficient K-Means Clustering using JIT

Rating: 3.6 (7 ratings) | Downloads (last 30 days): 103 | File size: 2.02 KB | File ID: #19344

by Yi Cao

 

27 Mar 2008 (Updated 16 Apr 2008)

A simple but fast tool for K-means clustering

File Information
Description

This is a tool for K-means clustering. After trying several different implementation approaches, I concluded that using simple loops to perform the distance calculation and comparison is the most efficient and accurate, because of the JIT acceleration in MATLAB.

The code is very simple and well documented, so it is suitable for beginners learning the k-means clustering algorithm.

Numerical comparisons show that this tool can be several times faster than kmeans in the Statistics Toolbox.
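
As a rough illustration of the loop-based approach described above, the following sketch runs k-means on a tiny data set using plain nested loops for the distance calculation and comparison. It is not the submission's code: the data, the initial centroids, the variable names, and the iteration cap of 100 are all assumptions made for the example.

```matlab
% Sketch only: loop-based k-means on a tiny data set (not the submission's code).
X = [0 0; 0 1; 1 0; 10 10; 10 11; 11 10];   % n-by-p data, one point per row
c = X([1 4], :);                            % initial centroids (hypothetical choice)
[n, p] = size(X);
k = size(c, 1);
g = zeros(n, 1);                            % cluster index of each point

for iter = 1:100                            % iteration cap is an assumption
    changed = false;
    % Assignment step: plain loops over points, centroids and dimensions.
    for i = 1:n
        best = inf;
        idx = 0;
        for j = 1:k
            d = 0;
            for q = 1:p
                t = X(i, q) - c(j, q);
                d = d + t * t;              % squared Euclidean distance
            end
            if d < best
                best = d;
                idx = j;
            end
        end
        if g(i) ~= idx
            g(i) = idx;
            changed = true;
        end
    end
    if ~changed
        break;                              % assignments are stable
    end
    % Update step: each centroid becomes the mean of its assigned points.
    for j = 1:k
        members = (g == j);
        if any(members)
            c(j, :) = mean(X(members, :), 1);
        end
    end
end
```

After the loop, g holds the cluster index of each row of X and c holds the centroids. The sketch skips the centroid update for a cluster with no members, which is one possible way to avoid NaN centroids.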

Acknowledgements
This submission has inspired the following:
Patch color selector
MATLAB release: MATLAB 7.5 (R2007b)
Comments and Ratings (8)
18 May 2008 nicola rebagliati

This works, and examples and comparisons are provided.

16 Mar 2009 Michael Chen  
05 Apr 2009 V. Poor  
08 Jul 2009 Edgar Kraft

The code is very nice and well documented. In some cases, however, the clusters are not properly identified if no initial centroid vectors are provided. This could be improved by automatically trying a small number of different random initial guesses and choosing the configuration that yields the smallest sum of distances between points and centroids.
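
A minimal sketch of the restart strategy suggested in this comment: run k-means from several random initial guesses and keep the run with the smallest sum of point-to-centroid distances. This is an illustration only, not part of the submission. The inner k-means loop, the data, the number of restarts (nRestarts), and the iteration cap are assumptions.

```matlab
% Sketch only: keep the best of several randomly initialized k-means runs.
X = [0 0; 0 1; 1 0; 10 10; 10 11; 11 10];   % n-by-p data, one point per row
k = 2;
nRestarts = 5;                              % hypothetical number of restarts
n = size(X, 1);
bestCost = inf;

for r = 1:nRestarts
    perm = randperm(n);
    c = X(perm(1:k), :);                    % k distinct data points as initial centroids
    g = zeros(n, 1);
    for iter = 1:100                        % iteration cap is an assumption
        D = zeros(n, k);
        for j = 1:k
            diffs = X - repmat(c(j, :), n, 1);
            D(:, j) = sum(diffs .^ 2, 2);   % squared distance to centroid j
        end
        [~, gNew] = min(D, [], 2);          % nearest centroid for each point
        if isequal(gNew, g)
            break;                          % assignments are stable
        end
        g = gNew;
        for j = 1:k
            if any(g == j)
                c(j, :) = mean(X(g == j, :), 1);
            end
        end
    end
    % Sum of Euclidean distances between points and their assigned centroids.
    cost = sum(sqrt(sum((X - c(g, :)) .^ 2, 2)));
    if cost < bestCost
        bestCost = cost;
        bestC = c;
        bestG = g;
    end
end
```

The best configuration ends up in bestC and bestG. In practice the number of restarts trades run time against the chance of escaping a poor initial guess.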

17 Aug 2010 Nandha  
06 Apr 2011 Tim Benham

The function fails to terminate on some inputs. For an example, see http://snipt.org/wpkI

13 May 2011 Maxime

Pretty fast indeed!

However, the requested number of clusters is sometimes not respected. The algorithm yields fewer clusters, replacing the additional centroids with NaN. This can be inconvenient.

13 May 2011 Maxime

Although not a perfect way to solve the above-mentioned issue, adding the following two lines after the update of the centroids solved the problem in my case:

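% Find centroids that became NaN
idnan = find(isnan(c(:,1)));
% Re-seed each one with a randomly chosen row of X (n is presumably the number of rows of X)
c(idnan,:) = X(randi(n,length(idnan),1),:);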
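
To show the effect of these two lines in isolation, the sketch below builds a case where one cluster has no points and then applies the re-seeding. It assumes that centroids are updated as the mean of their assigned points, which would explain the NaN values, but the submission's actual update step is not shown on this page. The data and the assignment vector g are made up for the example.

```matlab
% Sketch only: an empty cluster yields a NaN centroid; re-seed it from the data.
X = [0 0; 0 1; 1 0; 1 1];       % n-by-p data, one point per row
n = size(X, 1);
k = 3;
g = [1; 1; 2; 2];               % hypothetical assignment: cluster 3 has no points
c = zeros(k, size(X, 2));
for j = 1:k
    c(j, :) = mean(X(g == j, :), 1);   % mean over zero rows is NaN
end
hadNaN = any(isnan(c(:, 1)));   % true here, because centroid 3 is NaN

% The two lines from the comment, applied after the centroid update:
idnan = find(isnan(c(:,1)));
c(idnan,:) = X(randi(n,length(idnan),1),:);
```

After the re-seeding, every row of c is finite and the third centroid coincides with one of the data points, so the next assignment step can give that cluster members again.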

Updates
27 Mar 2008

update description

16 Apr 2008

correct bugs in examples

Tag Activity for this File
Tag          Applied By        Date/Time
statistics   Yi Cao            22 Oct 2008 09:55:22
probability  Yi Cao            22 Oct 2008 09:55:22
kmeans       Yi Cao            22 Oct 2008 09:55:22
jit          Yi Cao            22 Oct 2008 09:55:22
clustering   Yi Cao            22 Oct 2008 09:55:22
jit          Johan             22 Oct 2009 05:06:22
clustering   newpolaris ?      06 Sep 2010 00:51:26
kmeans       Aaron             26 Nov 2010 13:15:43
clustering   chahi.21 bechar   16 Jan 2011 15:05:46
kmeans       chahi.21 bechar   16 Jan 2011 15:05:51
clustering   Mohamed           09 Feb 2011 11:14:31
clustering   Pierre            12 Aug 2011 18:06:36
kmeans       Pierre            12 Aug 2011 18:06:44
jit          rohit             02 Jan 2012 08:09:20
