K-means Clustering Result Always Changes

Asked by Alvi Syahrin on 4 May 2013

I'm working on k-means in MATLAB. Here is my code:

load cobat.txt
k=input('Enter the number of cluster: ');
if k<8
    [cidx ctrs]=kmeans(cobat, k, 'dist', 'sqEuclidean');
    Z = [cobat cidx]
else
    h=msgbox('Must be less than eight');
end

"cobat.txt" is my data file, and this is what it contains:

65	80	55
45	75	78
36	67	66
65	78	88
79	80	72
77	85	65
76	77	79
65	67	88
85	76	88
56	76	65

My problem is that every time I run the code, it shows a different result, with different clusters. How can I make the clustering result always the same?

2 Answers

Answer by Walter Roberson on 5 May 2013
Accepted answer
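%generate some initial cluster centers according to some deterministic algorithm
%in this case, I construct equally spaced points along the space diagonal of
%the data's bounding box, but choose your own algorithm
minc = min(cobat, [], 1);   %column-wise minimum (min(cobat, 1) would compare each element with 1)
maxc = max(cobat, [], 1);   %column-wise maximum
nsamp = k;                  %the 'start' matrix needs exactly k rows; assumes k >= 2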
initialcenters = repmat(minc, nsamp, 1) + bsxfun(@times, (0:nsamp-1).', (maxc - minc) ./ (nsamp-1));
%Once you have constructed the initial centers, cluster using those centers
[cidx ctrs] = kmeans(cobat, k, 'dist', 'sqEuclidean', 'start', initialcenters);

3 Comments

Alvi Syahrin on 5 May 2013

Thank you for the answer.

I am still confused by this line:

initialcenters = repmat(minc, nsamp, 1) + bsxfun(@times, (0:nsamp-1).', (maxc - minc) ./ (nsamp-1));

Do you mean that I have to work out how to initialize the cluster centers myself in MATLAB? But k-means picks them randomly, with no particular method. Do you have any idea how to do that?

Walter Roberson on 5 May 2013

kmeans does not pick randomly if you pass the argument 'start' and a data matrix of initial cluster centroids.

That particular line constructs evenly-spaced points between the minimum and maximum values of each column -- similar to using linspace() but working on all the columns at once. The details are not really important: I just chose those centroids to have a bunch of centroids that were deterministically distributed around the entire space of values.
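
To make the linspace() comparison concrete, here is a minimal sketch that builds the same kind of starting centers one column at a time and checks them against the vectorized form. The value k = 3 and the inlined copy of the data from the question are assumptions for illustration only; any k >= 2 should behave the same way.

```matlab
% Data from the question (10 samples, 3 columns), inlined so the sketch is self-contained
cobat = [65 80 55; 45 75 78; 36 67 66; 65 78 88; 79 80 72; ...
         77 85 65; 76 77 79; 65 67 88; 85 76 88; 56 76 65];
k = 3;                          % assumed number of clusters (must be >= 2 here)

minc = min(cobat, [], 1);       % column-wise minima: [36 67 55]
maxc = max(cobat, [], 1);       % column-wise maxima: [85 85 88]

% Column-by-column version: k evenly spaced values between each column's min and max
centers = zeros(k, size(cobat, 2));
for j = 1:size(cobat, 2)
    centers(:, j) = linspace(minc(j), maxc(j), k).';
end

% Vectorized version, same idea as the line in the answer
centersVec = repmat(minc, k, 1) + bsxfun(@times, (0:k-1).', (maxc - minc) ./ (k-1));

% The two constructions should agree up to rounding
disp(max(abs(centers(:) - centersVec(:))));
```

Either matrix has k rows and one column per variable, which is the shape the 'start' argument expects.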

Alvi Syahrin on 5 May 2013

Ok. It's been fixed now, by changing some scripts. Thank you for the idea.

Answer by the cyclist on 4 May 2013

K-means clustering uses randomness as part of the algorithm: by default, kmeans chooses its starting centers at random. Try setting the seed of the random number generator before you start. If you have a relatively new version of MATLAB, you can do this with the rng() command. Put

rng(1)

at the beginning of your code.
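
As a rough check of this approach, the sketch below reseeds the generator before each of two kmeans calls and compares the cluster assignments. It assumes a release that has rng() and the Statistics Toolbox kmeans function; the inlined data and k = 3 are only for illustration.

```matlab
% Data from the question, inlined so the sketch is self-contained
cobat = [65 80 55; 45 75 78; 36 67 66; 65 78 88; 79 80 72; ...
         77 85 65; 76 77 79; 65 67 88; 85 76 88; 56 76 65];
k = 3;                                   % assumed number of clusters

rng(1);                                  % fix the seed before the first run
cidx1 = kmeans(cobat, k, 'Distance', 'sqeuclidean');

rng(1);                                  % reset to the same seed before the second run
cidx2 = kmeans(cobat, k, 'Distance', 'sqeuclidean');

% With the same seed, both runs draw the same random starting centers,
% so the assignments should be identical
disp(isequal(cidx1, cidx2));
```

Note that the seed has to be set again before every kmeans call you want to reproduce; setting it once only fixes the sequence from that point on.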

2 Comments

Alvi Syahrin on 4 May 2013

Thank you for the answer. I have MATLAB 7.11.0 (R2010b), and when I tried that command it did not work: I got an "undefined function" error. Do you have any idea how to solve this?

the cyclist on 4 May 2013

Type

>> doc randstream

to see how to do it in your version.
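
For orientation, here is a hedged sketch of what seeding might look like on a release without rng(). It assumes that RandStream.setDefaultStream is the relevant call in R2010b (later releases renamed it to RandStream.setGlobalStream), so check the documentation for your version before relying on it. The seed value is arbitrary.

```matlab
seed = 1;                        % arbitrary fixed seed
draws = zeros(2, 3);

for r = 1:2
    if exist('rng', 'file') || exist('rng', 'builtin')
        % Newer releases: rng() is available
        rng(seed);
    else
        % Older releases such as R2010b (assumption: setDefaultStream exists there)
        RandStream.setDefaultStream(RandStream('mt19937ar', 'Seed', seed));
    end
    draws(r, :) = rand(1, 3);    % same seed, so each pass should draw the same numbers
end

disp(isequal(draws(1, :), draws(2, :)));
```

Running the seeding step immediately before kmeans should then make its random starting centers, and hence the clusters, repeatable.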
