Asked by Alvi Syahrin
on 4 May 2013

I'm working on k-means in MATLAB. Here is my code:

load cobat.txt
k=input('Enter the number of cluster: ');
if k<8
    [cidx ctrs]=kmeans(cobat, k, 'dist', 'sqEuclidean');
    Z = [cobat cidx]
else
    h=msgbox('Must be less than eight');
end

"cobat" is my data file. Its contents look like this:

65 80 55 45 75 78 36 67 66 65 78 88 79 80 72 77 85 65 76 77 79 65 67 88 85 76 88 56 76 65

My problem is that every time I run the code, I get a different result, with different clusters. How can I make the clustering result always the same?

Answer by Walter Roberson
on 5 May 2013

Accepted answer

% Generate some initial cluster centers according to some deterministic algorithm.
% In this case, I construct an equally spaced space diagonal, but choose your
% own algorithm.

minc = min(cobat, [], 1);   % column-wise minimum, a 1-by-p row vector
maxc = max(cobat, [], 1);   % column-wise maximum, a 1-by-p row vector
nsamp = k;                  % number of centers: 'start' needs exactly k rows (assumes k >= 2)
initialcenters = repmat(minc, nsamp, 1) + bsxfun(@times, (0:nsamp-1).', (maxc - minc) ./ (nsamp-1));

%Once you have constructed the initial centers, cluster using those centers

[cidx ctrs] = kmeans(cobat, k, 'dist', 'sqEuclidean', 'start', initialcenters);

Alvi Syahrin
on 5 May 2013

Thank you for the answer.

I am still confused by this line:

initialcenters = repmat(minc, nsamp, 1) + bsxfun(@times, (0:nsamp-1).', (maxc - minc) ./ (nsamp-1));

Do you mean that I have to construct the initial cluster centers myself in MATLAB? But k-means picks them randomly, with no particular method. Do you have any suggestion?

Walter Roberson
on 5 May 2013

kmeans does not pick randomly if you pass the argument 'start' and a data matrix of initial cluster centroids.

That particular line constructs evenly-spaced points between the minimum and maximum values of each column -- similar to using linspace() but working on all the columns at once. The details are not really important: I just chose those centroids to have a bunch of centroids that were deterministically distributed around the entire space of values.
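To make the linspace() comparison concrete, here is a minimal sketch on a small made-up matrix (the data `X`, the value of `k`, and the names `centers` and `check` are illustrative and not from the original post). It builds the centers with the repmat/bsxfun expression and then rebuilds them one column at a time with linspace(). It uses `k` rows because the 'start' matrix needs one row per cluster.

```matlab
% Toy data, made up for illustration: 4 observations, 2 variables.
X = [1 10; 3 40; 2 20; 5 30];
k = 3;                        % number of starting centers wanted (k >= 2)

minc = min(X, [], 1);         % column minima: [1 10]
maxc = max(X, [], 1);         % column maxima: [5 40]

% One-line construction: row i is minc + (i-1) * (maxc - minc) / (k-1)
centers = repmat(minc, k, 1) + bsxfun(@times, (0:k-1).', (maxc - minc) ./ (k-1));

% The same points, built one column at a time with linspace
check = zeros(k, size(X, 2));
for j = 1:size(X, 2)
    check(:, j) = linspace(minc(j), maxc(j), k).';
end

disp(centers)                              % rows: [1 10], [3 25], [5 40]
disp(max(abs(centers(:) - check(:))))      % largest difference between the two
```

The first row is the column-wise minimum, the last row is the column-wise maximum, and the rows in between are equally spaced along the line joining them.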

Alvi Syahrin
on 5 May 2013

Ok. It's been fixed now, by changing some scripts. Thank you for the idea.

Answer by the cyclist
on 4 May 2013

K-means clustering uses randomness as part of the algorithm. Try setting the seed of the random number generator before you start. If you have a relatively new version of MATLAB, you can do this with the rng() command. Put

rng(1)

at the beginning of your code.

Alvi Syahrin
on 4 May 2013

Thank you for the answer. I have MATLAB 7.11.0 (R2010b), and when I tried that command it did not work: I got an "undefined function" error. Do you have any idea how to solve this?

the cyclist
on 4 May 2013

Type

>> doc randstream

to see how to do it in your version.
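As a rough sketch of what that can look like, the snippet below seeds the generator with rng() when it is available and otherwise falls back to the RandStream class. The fallback call, `RandStream.setDefaultStream`, is my assumption for releases such as R2010b (later releases renamed it to `RandStream.setGlobalStream`), so check it against the documentation for your version. The seed value and the variable names are illustrative.

```matlab
seed = 1;                 % any fixed nonnegative integer gives repeatable results
draws = cell(1, 2);

for r = 1:2
    if exist('rng', 'file') || exist('rng', 'builtin')
        rng(seed);        % newer releases
    else
        % Assumed API for older releases such as R2010b; see "doc RandStream"
        RandStream.setDefaultStream(RandStream('mt19937ar', 'Seed', seed));
    end
    draws{r} = rand(1, 3);    % random numbers drawn right after seeding
end

disp(isequal(draws{1}, draws{2}))   % 1 when both passes drew the same numbers
```

In the original script the seeding step would go once, before the call to kmeans, so that the random choice of starting centers is the same on every run.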
