I have a question about k-means clustering. Say I created two clusters from some data, for example using this code:
X = [randn(100,2)+ones(100,2);...
     randn(100,2)-ones(100,2)];
opts = statset('Display','final');

[idx,ctrs] = kmeans(X,2,...
                    'Distance','city',...
                    'Replicates',5,...
                    'Options',opts);

plot(X(idx==1,1),X(idx==1,2),'r.','MarkerSize',12)
hold on
plot(X(idx==2,1),X(idx==2,2),'b.','MarkerSize',12)
plot(ctrs(:,1),ctrs(:,2),'kx',...
     'MarkerSize',12,'LineWidth',2)
plot(ctrs(:,1),ctrs(:,2),'ko',...
     'MarkerSize',12,'LineWidth',2)
legend('Cluster 1','Cluster 2','Centroids',...
       'Location','NW')
My question is, if you collect more data can you assign it to each of the two clusters that have already been formed, or do you have to cluster all of the data again?
If it is possible, how would you do it?
K-means is an unsupervised learning algorithm whose result depends both on the number of clusters you choose and on the initial centroid positions. I would say that you need to cluster the data again.
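For what it's worth, here is a rough sketch of what reclustering could look like, next to the commonly used shortcut of assigning each new point to the nearest existing centroid. It assumes X and ctrs from the code in the question are still in the workspace. Xnew is made-up data for illustration, and the variable names idxAll, ctrsAll, D and idxNew are mine. The shortcut leaves the centroids unchanged, so its labels can differ from what a full reclustering would give.

```matlab
% Hypothetical new observations, generated the same way as X in the question.
Xnew = [randn(10,2)+ones(10,2);...
        randn(10,2)-ones(10,2)];

% Option 1: recluster the old and new data together.
% Note that the cluster numbering (1 vs. 2) may come out swapped
% relative to the original run.
[idxAll,ctrsAll] = kmeans([X; Xnew],2,...
                          'Distance','cityblock',...
                          'Replicates',5);

% Option 2 (shortcut, not a reclustering): keep the existing centroids
% and label each new point by its nearest centroid, using the same
% city-block distance as the original kmeans call.
D = pdist2(Xnew,ctrs,'cityblock');   % size(Xnew,1)-by-2 distance matrix
[~,idxNew] = min(D,[],2);            % index of the closest centroid per row
```

If the new data come from the same distribution as the old, the two options will often agree, but that is an assumption you would have to check for your own data.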