Thread Subject:
kmeans set size

Subject: kmeans set size

From: b b

Date: 7 Apr, 2008 17:52:01

Message: 1 of 2

I am working with the kmeans function to cluster my (x,y)
data points into groups, and using these groups to create a
convex hull for each cluster. The problem is, the clusters
often result in groups that have only two members, or three
members, such that the third member is the same as the first
(to create a closed polygon).

I was wondering if there is a way to set the minimum number
of elements in a cluster for the kmeans algorithm to avoid
having sets with only two or three members? If not, then is
there another clustering algorithm in Matlab 2007a that
could be used instead?

Thanks in advance!

Bruce.

Subject: kmeans set size

From: Peter Perkins

Date: 8 Apr, 2008 19:20:55

Message: 2 of 2

b b wrote:
> I am working with the kmeans function to cluster my (x,y)
> data points into groups, and using these groups to create a
> convex hull for each cluster. The problem is, the clusters
> often result in groups that have only two members, or three
> members, such that the third member is the same as the first
> (to create a closed polygon).
>
> I was wondering if there is a way to set the minimum number
> of elements in a cluster for the kmeans algorithm to avoid
> having sets with only two or three members? If not, then is
> there another clustering algorithm in Matlab 2007a that
> could be used instead?

Bruce, there's nothing preventing you from post-processing the clusters
to merge very small ones with other clusters. Use the centroids to find
the closest.
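
A minimal sketch of that merge step, added for illustration and not part of the original reply. It assumes the default squared Euclidean distance, so each centroid is the mean of its points. The data X and the values K = 6 and minSize = 4 are made up.

```matlab
% Hypothetical data: three blobs plus two stray points (illustration only).
X = [randn(50,2); randn(50,2) + 6; randn(50,2) + repmat([6 -6], 50, 1); 20 20; -15 10];
K = 6;          % deliberately generous number of clusters (assumed value)
minSize = 4;    % smallest cluster size we are willing to keep (assumed value)

% 'emptyaction','singleton' avoids an error if a cluster loses all its points.
[idx, C] = kmeans(X, K, 'emptyaction', 'singleton');
counts = accumarray(idx, 1, [K 1]);    % number of points in each cluster

while true
    live = find(counts > 0);           % clusters that still exist
    [smallest, j] = min(counts(live));
    if smallest >= minSize || numel(live) == 1
        break                          % every remaining cluster is large enough
    end
    s = live(j);                       % cluster to dissolve
    others = live(live ~= s);
    % Squared Euclidean distance from centroid s to every other live centroid.
    d = sum((C(others,:) - repmat(C(s,:), numel(others), 1)).^2, 2);
    [dmin, k] = min(d);
    t = others(k);                     % closest cluster absorbs cluster s
    % The size-weighted mean keeps C(t,:) equal to the mean of the merged points.
    C(t,:) = (counts(t)*C(t,:) + counts(s)*C(s,:)) / (counts(t) + counts(s));
    idx(idx == s) = t;
    counts(t) = counts(t) + counts(s);
    counts(s) = 0;
end

disp(counts(counts > 0)')              % sizes of the clusters that survive
```

After the loop, idx no longer uses consecutive labels, so take unique(idx) before looping over clusters to build the convex hulls. Merging the smallest cluster first is one possible ordering, not the only one.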

You also may simply be using too large a value for K. On the other
hand, people sometimes use a large K on purpose to find outliers.
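
One rough way to check whether K is too large is to look at the smallest cluster size over a range of K values. This is only a sketch with made-up data and an arbitrary range of K, not a recommended model-selection procedure.

```matlab
% Hypothetical data: three blobs (illustration only).
X = [randn(50,2); randn(50,2) + 6; randn(50,2) + repmat([6 -6], 50, 1)];

for K = 2:8
    % 'replicates' reruns KMEANS from several random starts and keeps the best fit.
    idx = kmeans(X, K, 'emptyaction', 'singleton', 'replicates', 5);
    counts = accumarray(idx, 1, [K 1]);
    fprintf('K = %d, smallest cluster has %d points\n', K, min(counts));
end
```

If the smallest cluster drops to two or three points once K passes some value, that value is a reasonable upper bound to try.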

You can use hierarchical clustering (see LINKAGE) and define your own
clusters of a minimum size, but that algorithm requires all pairwise
distances, n*(n-1)/2 values for n points (see PDIST), and thus uses
more memory than KMEANS.
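
One possible reading of "define your own clusters of a minimum size", sketched here as an assumption rather than as the method the reply had in mind: build the tree once, then cut it into fewer and fewer clusters until none is smaller than minSize. The data, the 'average' linkage method, the starting value of 8 and minSize = 4 are all illustrative choices.

```matlab
% Hypothetical data: three blobs (illustration only).
X = [randn(50,2); randn(50,2) + 6; randn(50,2) + repmat([6 -6], 50, 1)];
minSize = 4;                    % assumed minimum cluster size

Y = pdist(X);                   % all pairwise distances, n*(n-1)/2 values
Z = linkage(Y, 'average');      % hierarchical cluster tree

for K = 8:-1:1
    T = cluster(Z, 'maxclust', K);   % cut the tree into at most K clusters
    counts = accumarray(T, 1);
    if min(counts) >= minSize
        break                        % largest K for which no cluster is too small
    end
end

fprintf('Using %d clusters, smallest has %d points\n', numel(counts), min(counts));
```

A single far-away point stays in its own cluster until late in the tree, so with strong outliers this loop can fall back to very few clusters. Removing or reassigning such points first may work better.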

Hope this helps.

- Peter Perkins
   The MathWorks, Inc.