K-means without iteration

Question

Manuel on 28 Feb 2013

https://www.mathworks.com/matlabcentral/answers/65328-k-means-without-iteration

Hello,

My problem is that it is difficult to find the optimal number of clusters with k-means alone, so I thought of using a hierarchical algorithm to find it. Once I have defined my ideal classification, I want to use it to find the centroids with k-means, without iteration.

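data = rand(300,5);            % 300 observations, 5 variables
D = pdist(data);               % pairwise Euclidean distances
Z = linkage(D,'ward');         % Ward hierarchical tree
T = cluster(Z,'maxclust',6);   % cut the tree into 6 clusters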

Now I want to pass the cluster assignments in vector T, together with the point positions, to the k-means algorithm without iterating. Can anyone give me a tip on how to do this?

Thank you.

Answer 1

Tom Lane on 28 Feb 2013

https://www.mathworks.com/matlabcentral/answers/65328-k-means-without-iteration#answer_76931

Not sure I follow exactly, but you could use grpstats to compute the coordinatewise means of data for each distinct value of T. You could use pdist2 to compute the distance from each data point to each centroid. You could use min to figure out which centroid is closest to each point (note the second output of the min command).
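The three steps above can be sketched roughly as follows. This is a minimal sketch rather than the answerer's own code: the variable names means, D, dmin and I and the rng(0) seed are assumptions added for illustration, and it needs the Statistics Toolbox.

```matlab
% Sketch only; requires the Statistics Toolbox.
rng(0);                            % fixed seed for repeatability (assumption, not in the thread)
data = rand(300,5);                % same shape as the data in the question
Z = linkage(pdist(data),'ward');   % Ward hierarchical tree
T = cluster(Z,'maxclust',6);       % hierarchical labels 1..6

means = grpstats(data,T);          % one row of coordinatewise means per label, ascending label order
D = pdist2(data,means);            % 300-by-6 distances from each point to each centroid
[dmin,I] = min(D,[],2);            % second output I: index of the nearest centroid per point
```

Because grpstats orders its rows by ascending group label, row j of means belongs to label j, so I can be compared directly with T.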

Answer 2

Manuel on 1 Mar 2013

https://www.mathworks.com/matlabcentral/answers/65328-k-means-without-iteration#answer_77045

Thank you for your answer.

I tried the grpstats command, but the clusters are not exactly the same when I apply it to the same data to check. Here is a copy of my code if you want to try it:

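data = rand(200,5);
data_test = data(100:125,:);    % subset used to check the assignments
Y = pdist(data);
Z = linkage(Y,'ward');
T = cluster(Z,'maxclust',6);

means = grpstats(data,T);       % centroid of each hierarchical cluster
D = pdist2(data_test,means);    % distance from each test point to each centroid
[C,I] = min(D,[],2);            % I = index of the nearest centroid
matrix = [T(100:125) I];        % hierarchical label next to nearest-centroid label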

Do you have any other suggestions?

Regards

1 Comment

Tom Lane on 4 Mar 2013

I would not expect the hierarchical and k-means results to match. Even though you're using Ward's linkage, which is based on distances to centroids, the centroids shift around as the clustering progresses. The I value you computed is the result of simulating the "k-means without iteration" process you requested.
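To put a number on the disagreement, one option is to count the points whose nearest centroid differs from their hierarchical label. The sketch below is illustrative only: mismatch, nMismatch, confusion and the rng(1) seed are assumed names and settings, not part of the original code.

```matlab
% Sketch only; requires the Statistics Toolbox.
rng(1);                                   % fixed seed for repeatability (assumption)
data = rand(200,2);
T = cluster(linkage(pdist(data),'ward'),'maxclust',6);   % hierarchical labels
means = grpstats(data,T);                 % centroids, rows in ascending label order
[~,I] = min(pdist2(data,means),[],2);     % nearest-centroid assignment

mismatch = (T ~= I);                      % true where the two assignments disagree
nMismatch = nnz(mismatch);
confusion = accumarray([T I],1,[6 6]);    % rows: hierarchical label, columns: nearest centroid
fprintf('%d of %d points change cluster\n',nMismatch,numel(T));
```

Off-diagonal entries of confusion show which hierarchical clusters lose points to which neighbouring centroid.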

Here's an attempt to show what is going on. Each point is plotted twice, once with its hierarchical cluster and once with its k-means assignment, with a Voronoi diagram of the centroids superimposed. The k-means assignments match the Voronoi regions, but the hierarchical clusters sometimes do not.

data = rand(200,2); 
Y = pdist(data);
Z = linkage(Y,'ward');
T = cluster(Z,'maxclust',6);
means = grpstats(data, T);
D = pdist2(data,means );
[C,I] = min(D,[],2);
gscatter(data(:,1),data(:,2),T)                  % hierarchical clusters
hold on
gscatter(data(:,1),data(:,2),I,[],'o',10)        % k-means assignments
gscatter(means(:,1),means(:,2),(1:6)',[],'+',20) % centroids
voronoi(means(:,1),means(:,2))                   % voronoi regions
hold off
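For comparison, this is roughly what a single further k-means update would do, which is the step that "without iteration" skips: recompute each centroid from the points just assigned to it. It is a hedged sketch, not code from the thread; newMeans, shift, k and the rng(2) seed are assumptions for illustration.

```matlab
% Sketch only; requires the Statistics Toolbox.
rng(2);                                   % fixed seed for repeatability (assumption)
data = rand(200,2);
T = cluster(linkage(pdist(data),'ward'),'maxclust',6);   % hierarchical labels
means = grpstats(data,T);                 % centroids of the hierarchical clusters
[~,I] = min(pdist2(data,means),[],2);     % nearest-centroid assignment

% Update step: mean of the points now assigned to each centroid.
k = size(means,1);
newMeans = zeros(size(means));
for j = 1:k
    newMeans(j,:) = mean(data(I == j,:),1);   % NaN row if a cluster ends up empty
end
shift = sqrt(sum((newMeans - means).^2,2));   % Euclidean distance each centroid moves
disp(shift')
```

A nonzero shift means the hierarchical centroids are not a fixed point of k-means, so further iterations would keep changing the assignments.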
