Clustering - different size clusters

10 views (last 30 days)
I have a pretty large matrix of data which I want to cluster against the first column which can be separated into six clusters / categories of different sizes. I know the k means clustering algorithm allows input of number of clusters but allows those to be determined iteratively. Is there anything on MATLAB which would be suitable for my task?

Accepted Answer

Image Analyst
Image Analyst on 29 Oct 2015
Yes, silhouette() lets you graphically judge the quality of the clustering produced by kmeans(). evalclusters() lets to evaluate the quality of the clustering achieved with a range of k values so you can pick the right k if you don't know it for certain.
% Try values of k 2 through 5
clustev = evalclusters(X, 'kmeans', 'silhouette', 'KList', 2:5);
% Get the best one value for k:
kBest = clustev.OptimalK
  6 Comments
Bran
Bran on 6 Nov 2015
Thank you very much Image Analyst for all your help and advice. I've been looking at the various features offered by MATLAB and it is very useful. Just a final quick question, does MATLAB have a Mann-Whitney test that also accounts for clusters? For example comparing the distribution of two groups that may have several clusters within them?
Image Analyst
Image Analyst on 6 Nov 2015
This is all I could find:
p = ranksum(x,y) returns the p-value of a two-sided Wilcoxon rank sum test. ranksum tests the null hypothesis that data in x and y are samples from continuous distributions with equal medians, against the alternative that they are not. The test assumes that the two samples are independent. x and y can have different lengths. This test is equivalent to a Mann-Whitney U-test.

Sign in to comment.

More Answers (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!