K-means Clustering Result Always Changes

Asked by Alvi Syahrin

on 4 May 2013

I'm working on k-means in MATLAB. Here is my code:

load cobat.txt
k=input('Enter the number of cluster: ');
if k<8
    [cidx ctrs]=kmeans(cobat, k, 'dist', 'sqEuclidean');
    Z = [cobat cidx]
else
    h=msgbox('Must be less than eight');
end

"cobat" is my data file, and here are its contents:

65	80	55
45	75	78
36	67	66
65	78	88
79	80	72
77	85	65
76	77	79
65	67	88
85	76	88
56	76	65

My problem is that every time I run the code, it produces a different result, with different clusters. How can I make the clustering result stay the same?

2 Answers

Answer by Walter Roberson

on 5 May 2013
Accepted answer
%generate some initial cluster centers according to some deterministic algorithm
%in this case, I construct points equally spaced along the space diagonal,
%but choose your own algorithm
minc = min(cobat, [], 1);   %column-wise minimum (min(cobat, 1) would compare every element with 1)
maxc = max(cobat, [], 1);   %column-wise maximum
nsamp = k;                  %one initial center per cluster: 'start' needs k rows (assumes k >= 2)
initialcenters = repmat(minc, nsamp, 1) + bsxfun(@times, (0:nsamp-1).', (maxc - minc) ./ (nsamp-1));
%Once you have constructed the initial centers, cluster using those centers
[cidx, ctrs] = kmeans(cobat, k, 'dist', 'sqEuclidean', 'start', initialcenters);

3 Comments

Alvi Syahrin

on 5 May 2013

Thank you for the answer.

I am still confused by this line:

initialcenters = repmat(minc, nsamp, 1) + bsxfun(@times, (0:nsamp-1).', (maxc - minc) ./ (nsamp-1));

Do you mean that I have to construct the initial cluster centers myself in MATLAB? As far as I know, k-means picks them randomly, with no particular method. Do you have any suggestions?

Walter Roberson

on 5 May 2013

kmeans does not pick randomly if you pass the argument 'start' and a data matrix of initial cluster centroids.

That particular line constructs evenly-spaced points between the minimum and maximum values of each column -- similar to using linspace() but working on all the columns at once. The details are not really important: I just chose those centroids to have a bunch of centroids that were deterministically distributed around the entire space of values.
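To make the linspace() comparison concrete, here is a minimal sketch on a small made-up matrix (X is not the poster's data, and the names centers and centersLoop are just for illustration). It builds the evenly spaced points once with the one-line repmat/bsxfun expression and once column by column with linspace(), so the two can be compared.

```matlab
% Made-up 3-column data, used only to illustrate the construction
X = [1 10 100; 3 30 300; 5 20 200];
nsamp = 3;                    % number of points to generate (assumed >= 2)

minc = min(X, [], 1);         % column-wise minimum
maxc = max(X, [], 1);         % column-wise maximum

% One-line version: nsamp rows evenly spaced from minc to maxc
centers = repmat(minc, nsamp, 1) + ...
    bsxfun(@times, (0:nsamp-1).', (maxc - minc) ./ (nsamp-1));

% The same points built one column at a time with linspace
centersLoop = zeros(nsamp, size(X, 2));
for j = 1:size(X, 2)
    centersLoop(:, j) = linspace(minc(j), maxc(j), nsamp).';
end

disp(centers)
disp(max(abs(centers(:) - centersLoop(:))))   % difference between the two versions
```

When such a matrix is passed to kmeans through 'start', it needs one row per requested cluster, so nsamp would have to equal k there.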

Alvi Syahrin

on 5 May 2013

Ok. It's been fixed now, by changing some scripts. Thank you for the idea.

Answer by the cyclist

on 4 May 2013

K-means clustering uses randomness as part of the algorithm. Try setting the seed of the random number generator before you start. If you have a relatively new version of MATLAB, you can do this with the rng() command. Put

rng(1)

at the beginning of your code.
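As a rough check of this suggestion, the sketch below seeds the generator, clusters, then reseeds with the same value and clusters again. The matrix is the data posted in the question, typed in directly instead of loaded from cobat.txt, and k = 3 is only an example value (the original script reads k from input()). It assumes a MATLAB release that has rng() and the Statistics Toolbox kmeans.

```matlab
% Data from the question (contents of cobat.txt), typed in directly
cobat = [65 80 55; 45 75 78; 36 67 66; 65 78 88; 79 80 72; ...
         77 85 65; 76 77 79; 65 67 88; 85 76 88; 56 76 65];
k = 3;                        % example value only

rng(1);                       % fix the seed before the first run
[cidx1, ctrs1] = kmeans(cobat, k, 'Distance', 'sqeuclidean');

rng(1);                       % reset to the same seed before the second run
[cidx2, ctrs2] = kmeans(cobat, k, 'Distance', 'sqeuclidean');

% With the same seed, both runs should start from the same random centers
disp(isequal(cidx1, cidx2))
```

Note that the seed only makes the random starting points repeatable; a different seed value may still lead to a different clustering.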

2 Comments

Alvi Syahrin

on 4 May 2013

Thank you for the answer. I have MATLAB 7.11.0 (R2010b), and when I tried that command it did not work: I got an undefined function error. Do you have any idea how to solve this?

the cyclist

on 4 May 2013

Type

>> doc randstream

to see how to do it in your version.
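For reference, a sketch of what that older approach might look like. It assumes that releases without rng() (such as R2010b) expose the static method RandStream.setDefaultStream, which as far as I recall was later renamed to setGlobalStream, so please confirm the exact name in the doc page for your version. The seed value and the variable names are arbitrary.

```matlab
seed = 1;                     % arbitrary seed value
draws = zeros(2, 3);

for t = 1:2
    if ~isempty(which('rng'))
        % Newer releases: rng() is available
        rng(seed);
    else
        % Older releases (assumed API, check "doc randstream"):
        % create a seeded stream and make it the default one
        s = RandStream('mt19937ar', 'Seed', seed);
        RandStream.setDefaultStream(s);
    end
    draws(t, :) = rand(1, 3); % first random numbers after seeding
end

% Both rows should match if the seeding worked
disp(isequal(draws(1, :), draws(2, :)))
```

Putting the seeding part at the top of the script, before the call to kmeans, plays the same role as rng(1) in the answer above.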
