MATLAB Answers


K-means Clustering Result Always Changes

Asked by Alvi Syahrin on 4 May 2013

I'm working on k-means in MATLAB. Here is my code:

load cobat.txt
k = input('Enter the number of clusters: ');
if k < 8
    [cidx, ctrs] = kmeans(cobat, k, 'dist', 'sqEuclidean');
    Z = [cobat cidx]   % data with the cluster index appended as the last column
else
    h = msgbox('Must be less than eight');
end

"cobat" is my data file. It looks like this:

65	80	55
45	75	78
36	67	66
65	78	88
79	80	72
77	85	65
76	77	79
65	67	88
85	76	88
56	76	65

My problem is that every time I run the code, it gives a different result, with different clusters. How can I make the clustering result always come out the same?


2 Answers

Answer by Walter Roberson
on 5 May 2013
 Accepted answer

%generate some initial cluster centers according to some deterministic algorithm
%in this case, I construct a space-diagonal equally spaced, but choose your
%own algorithm
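minc = min(cobat, [], 1);   %column minima; min(cobat, 1) would compare every element against 1
maxc = max(cobat, [], 1);   %column maxima
nsamp = k;                  %'start' needs exactly k rows, one per cluster (assumes k >= 2)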
initialcenters = repmat(minc, nsamp, 1) + bsxfun(@times, (0:nsamp-1).', (maxc - minc) ./ (nsamp-1));
%Once you have constructed the initial centers, cluster using those centers
[cidx ctrs] = kmeans(cobat, k, 'dist', 'sqEuclidean', 'start', initialcenters);


Thank you for the answer.

I am still confused by this line:

initialcenters = repmat(minc, nsamp, 1) + bsxfun(@times, (0:nsamp-1).', (maxc - minc) ./ (nsamp-1));

Do you mean that I have to construct the initial cluster centers myself in MATLAB? But k-means picks them randomly, with no particular method. Do you have any suggestions?

kmeans does not pick randomly if you pass the argument 'start' and a data matrix of initial cluster centroids.

That particular line constructs evenly-spaced points between the minimum and maximum values of each column -- similar to using linspace() but working on all the columns at once. The details are not really important: I just chose those centroids to have a bunch of centroids that were deterministically distributed around the entire space of values.
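
To make the construction concrete, here is a minimal sketch using the data from the question. It assumes k = 3 (an arbitrary example value) and builds k rows of centers, since kmeans expects the 'start' matrix to have one row per cluster. The loop with linspace() is only there to show that both approaches give the same points.

```matlab
% Data from the question, inlined so the sketch is self-contained
cobat = [65 80 55; 45 75 78; 36 67 66; 65 78 88; 79 80 72; ...
         77 85 65; 76 77 79; 65 67 88; 85 76 88; 56 76 65];
k = 3;                               % example number of clusters (assumption)

minc = min(cobat, [], 1);            % column minima
maxc = max(cobat, [], 1);            % column maxima

% k evenly spaced points from minc to maxc, all columns at once
initialcenters = repmat(minc, k, 1) + ...
    bsxfun(@times, (0:k-1).', (maxc - minc) ./ (k-1));

% The same points, built one column at a time with linspace()
viaLinspace = zeros(k, size(cobat, 2));
for j = 1:size(cobat, 2)
    viaLinspace(:, j) = linspace(minc(j), maxc(j), k).';
end

disp(initialcenters)
```

For this data the first row is the column minima [36 67 55] and the last row is the column maxima [85 85 88]. Passing initialcenters as the 'start' argument then gives kmeans the same starting point on every run.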

OK. It's fixed now, after changing some of my scripts. Thank you for the idea.

Answer by the cyclist
on 4 May 2013

K-means clustering uses randomness as part of the algorithm. Try setting the seed of the random number generator before you start. If you have a relatively new version of MATLAB, you can do this with the rng() command. Put a call to rng() with a fixed seed at the beginning of your code.
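
As a rough sketch of this idea, the snippet below seeds the generator with the same value before two separate kmeans runs and compares the cluster assignments. The seed value 1 and k = 3 are arbitrary choices for illustration, not values from this thread, and rng() is only available in R2011a and later.

```matlab
% Data from the question, inlined so the sketch is self-contained
cobat = [65 80 55; 45 75 78; 36 67 66; 65 78 88; 79 80 72; ...
         77 85 65; 76 77 79; 65 67 88; 85 76 88; 56 76 65];
k = 3;                 % example number of clusters (assumption)

rng(1);                % fixed seed; the value 1 is arbitrary (assumption)
[cidx1, ctrs1] = kmeans(cobat, k, 'dist', 'sqEuclidean');

rng(1);                % reseed with the same value before the second run
[cidx2, ctrs2] = kmeans(cobat, k, 'dist', 'sqEuclidean');

% Both runs start from the same random state, so the assignments should match
disp(isequal(cidx1, cidx2))
```

In a script that clusters only once, a single rng() call at the top is enough.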


Thank you for the answer. I have MATLAB 7.11.0 (R2010b), and when I tried that command it did not work: I got an "undefined function" error. Do you have any idea how to solve this?


>> doc randstream

to see how to do it in your version.
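
For reference, a minimal sketch of seeding through RandStream might look like the following. The seed value 1 and k = 3 are arbitrary assumptions. The static method is named setDefaultStream in R2010b and was renamed setGlobalStream in R2011a, so the sketch checks the version first.

```matlab
% Data from the question, inlined so the sketch is self-contained
cobat = [65 80 55; 45 75 78; 36 67 66; 65 78 88; 79 80 72; ...
         77 85 65; 76 77 79; 65 67 88; 85 76 88; 56 76 65];
k = 3;                                    % example number of clusters (assumption)

% Mersenne Twister stream with a fixed seed; the value 1 is arbitrary (assumption)
s = RandStream('mt19937ar', 'Seed', 1);

if verLessThan('matlab', '7.12')
    RandStream.setDefaultStream(s);       % R2010b and earlier
else
    RandStream.setGlobalStream(s);        % R2011a and later
end

% kmeans now draws its random starting centers from the seeded stream
[cidx, ctrs] = kmeans(cobat, k, 'dist', 'sqEuclidean');
```

Running these lines again from the top recreates the stream with the same seed, so the starting centers are drawn identically each time.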
