My problem is that it is difficult to get the optimal number of clusters using k-means, so I thought of using a hierarchical algorithm to find it. After defining my ideal classification, I want to use that classification to find the centroids with k-means, without iterating.
data = rand(300,5);
D = pdist(data);
Z = linkage(D,'ward');
T = cluster(Z,'maxclust',6);
Now I want to feed the clusters defined in vector T, and their positions, into the k-means algorithm without iterations. Can anyone give a tip on how to do this?
Not sure I follow exactly, but you could use grpstats to compute the coordinate-wise means of data for each distinct value of T. You could then use pdist2 to compute the distance from each data point to each centroid, and min to figure out which centroid is closest to each point (note the second output of the min command).
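For what it's worth, a minimal sketch of that pipeline might look like this (variable names are illustrative; assumes the Statistics and Machine Learning Toolbox):

data = rand(300,5);                  % example data, as in the question
Z = linkage(pdist(data), 'ward');    % hierarchical clustering (Ward linkage)
T = cluster(Z, 'maxclust', 6);       % cut the tree into 6 clusters
centroids = grpstats(data, T);       % coordinate-wise mean of each cluster
D = pdist2(data, centroids);         % distance from every point to every centroid
[~, nearest] = min(D, [], 2);        % second output of min: index of the closest centroid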
Thank you for your answer.
I tried the grpstats command, but the clusters are not exactly the same when I apply it to the same data to confirm. Here is a copy of my code if you want to try it:
data = rand(200,5);
data_test = data(100:125,:);
Y = pdist(data);
Z = linkage(Y,'ward');
T = cluster(Z,'maxclust',6);
means = grpstats(data, T);        % centroid of each hierarchical cluster
D = pdist2(data_test, means);     % distance from each test point to each centroid
[C,I] = min(D, [], 2);            % I: index of the nearest centroid
matrix = [T(100:125) I];          % compare hierarchical labels with nearest-centroid labels
Do you have any other suggestions?
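One possible reason for the mismatch: Ward clusters are not Voronoi cells, so a point can lie closer to another cluster's centroid than to the centroid of its own cluster, and the nearest-centroid labels will then disagree with T for some points. If the goal is just to seed k-means with the hierarchical centroids, one option (a sketch, assuming the standard kmeans function rather than anything from this thread) is to pass them via the 'Start' option and cap the iterations:

init = grpstats(data, T);             % centroids from the Ward clustering
[idx, C] = kmeans(data, 6, ...
    'Start', init, ...                % seed k-means with the hierarchical centroids
    'MaxIter', 1);                    % kmeans has no zero-iteration mode; 'MaxIter',1
                                      % keeps refinement minimal (it may warn that it
                                      % failed to converge, which is expected here)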