Asked by Alvi Syahrin
on 4 May 2013

I'm working on k-means in MATLAB. Here is my code:

load cobat.txt
k = input('Enter the number of cluster: ');
if k < 8
    [cidx, ctrs] = kmeans(cobat, k, 'dist', 'sqEuclidean');
    Z = [cobat cidx]
else
    h = msgbox('Must be less than eight');
end

"cobat" is my data file, and it looks like this:

65 80 55 45 75 78 36 67 66 65 78 88 79 80 72 77 85 65 76 77 79 65 67 88 85 76 88 56 76 65

My problem is that every time I run the code, it gives a different result, with different clusters. How can I make the clustering result always the same?

Answer by Walter Roberson
on 5 May 2013

Accepted answer

%generate some initial cluster centers according to some deterministic algorithm
%in this case, I construct a space-diagonal equally spaced, but choose your
%own algorithm

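minc = min(cobat, [], 1);   %column-wise minimum; min(cobat, 1) would compare each element with 1
maxc = max(cobat, [], 1);   %column-wise maximum
%one row of initial centers per cluster, so the matrix is k-by-p (needs k >= 2)
initialcenters = repmat(minc, k, 1) + bsxfun(@times, (0:k-1).', (maxc - minc) ./ (k-1));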

%Once you have constructed the initial centers, cluster using those centers

[cidx ctrs] = kmeans(cobat, k, 'dist', 'sqEuclidean', 'start', initialcenters);

Alvi Syahrin
on 5 May 2013

Thank you for the answer.

I am still confused by this line:

initialcenters = repmat(minc, k, 1) + bsxfun(@times, (0:k-1).', (maxc - minc) ./ (k-1));

Do you mean that I have to construct the initial cluster centers myself in MATLAB? But k-means picks them randomly, with no fixed method. Do you have any suggestion?

Walter Roberson
on 5 May 2013

kmeans does not pick randomly if you pass the argument 'start' and a data matrix of initial cluster centroids.
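As a rough illustration of that point, the sketch below runs kmeans twice with the same 'Start' matrix and checks that the cluster assignments match. The data matrix X and the starting centroids are made-up values chosen only for the example, not the contents of cobat.txt, and the Statistics Toolbox is assumed to be installed.

```matlab
% Made-up 6-by-2 data, for illustration only (not the asker's file)
X = [65 80; 55 45; 75 78; 36 67; 66 65; 78 88];

% Fixed k-by-p matrix of initial centroids (hypothetical values)
startCenters = [36 45; 57 66.5; 78 88];
k = size(startCenters, 1);

% Two independent runs from the same starting centroids
[cidx1, ctrs1] = kmeans(X, k, 'Distance', 'sqeuclidean', 'Start', startCenters);
[cidx2, ctrs2] = kmeans(X, k, 'Distance', 'sqeuclidean', 'Start', startCenters);

% Logical true if both runs produced identical assignments and centroids
sameResult = isequal(cidx1, cidx2) && isequal(ctrs1, ctrs2);
disp(sameResult)
```

The number of rows in the 'Start' matrix has to equal k, which is why k is taken from size(startCenters, 1) here.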

That particular line constructs evenly-spaced points between the minimum and maximum values of each column -- similar to using linspace() but working on all the columns at once. The details are not really important: I just chose those centroids to have a bunch of centroids that were deterministically distributed around the entire space of values.
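To make the linspace() comparison concrete, here is a small sketch that builds the centers both ways and checks that they agree. The matrix X and the choice k = 3 are assumptions for the example only, and the formula needs k to be at least 2.

```matlab
% Made-up 6-by-2 data and cluster count, for illustration only
X = [65 80; 55 45; 75 78; 36 67; 66 65; 78 88];
k = 3;

minc = min(X, [], 1);   % 1-by-p row of column minima
maxc = max(X, [], 1);   % 1-by-p row of column maxima

% Vectorized form: row i is minc + (i-1)/(k-1) * (maxc - minc)
centersA = repmat(minc, k, 1) + bsxfun(@times, (0:k-1).', (maxc - minc) ./ (k-1));

% The same construction one column at a time with linspace
centersB = zeros(k, size(X, 2));
for j = 1:size(X, 2)
    centersB(:, j) = linspace(minc(j), maxc(j), k).';
end

% Largest absolute difference between the two constructions
maxDiff = max(abs(centersA(:) - centersB(:)));
disp(maxDiff < 1e-12)
```

The first row is the column-wise minimum and the last row is the column-wise maximum, so the centers lie along the diagonal of the bounding box of the data.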

Alvi Syahrin
on 5 May 2013

Ok. It's been fixed now, by changing some scripts. Thank you for the idea.

Answer by the cyclist
on 4 May 2013

K-means clustering uses randomness as part of the algorithm. Try setting the seed of the random number generator before you start. If you have a relatively new version of MATLAB, you can do this with the rng() command. Put

rng(1)

at the beginning of your code.
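A minimal sketch of this approach, assuming a MATLAB release that has rng() and the Statistics Toolbox: reset the seed to the same value before each call and compare the results. The data matrix X and k = 3 are made-up values for illustration.

```matlab
% Made-up 6-by-2 data and cluster count, for illustration only
X = [65 80; 55 45; 75 78; 36 67; 66 65; 78 88];
k = 3;

rng(1);                                            % fix the seed
cidx1 = kmeans(X, k, 'Distance', 'sqeuclidean');   % random start, seeded

rng(1);                                            % reset to the same seed
cidx2 = kmeans(X, k, 'Distance', 'sqeuclidean');   % draws the same random start

% Logical true if both seeded runs gave the same cluster assignments
sameResult = isequal(cidx1, cidx2);
disp(sameResult)
```

The result is repeatable only as long as the seed is set immediately before kmeans and nothing else draws random numbers in between.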

Alvi Syahrin
on 4 May 2013
