How can I color my dendrogram plot such that the colors correspond to clusters generated by the CLUSTER function in the Statistics Toolbox?

I generate a dendrogram plot by running the following code at the MATLAB prompt:
NumCluster = 3;
rand('state', 7)
data = [rand(10,3); rand(10,3)+1; rand(10,3)+2];
dist = pdist(data, 'euclidean');
link = linkage(dist, 'complete');
clust = cluster(link, 'maxclust', NumCluster);
[H,T,perm] = dendrogram(link, 0);
I would like different sections of the dendrogram plot colored such that they correspond to the clusters returned by the CLUSTER function.

Accepted Answer

MathWorks Support Team on 15 Apr 2015
You cannot specify the coloring in DENDROGRAM to match the clusters returned by CLUSTER in the Statistics Toolbox. To work around this limitation, you can use the "colorthreshold" option in the DENDROGRAM function as follows:
NumCluster = 3;
rand('state', 7)
data = [rand(10,3); rand(10,3)+1; rand(10,3)+2];
dist = pdist(data, 'euclidean');
link = linkage(dist, 'complete');
clust = cluster(link, 'maxclust', NumCluster);
color = link(end-NumCluster+2,3)-eps;
[H,T,perm] = dendrogram(link, 0, 'colorthreshold', color);
The above code works for any value of "NumCluster" of 2 or higher. The idea is to use the distance information returned by the LINKAGE function to identify a distance cut-off such that coloring the branches of the dendrogram plot below that cut-off produces the desired coloring effect. Since the distance information is returned in the third column of the "link" variable in ascending order, the value of "color" is set just below the height of the merge that would reduce "NumCluster" clusters to fewer, which is the level at which a horizontal cut breaks the dendrogram plot into "NumCluster" clusters.
NOTE: The above code might not work well in situations with many repeated distance values returned in the "link" variable. This code is only provided as a guideline, and you should modify it as necessary to fit a given problem.
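As a rough illustration of the note above, the following sketch places the threshold midway between the two merge heights that bracket the cut, and warns when those heights are tied. It is only one possible variation on the workaround, not a documented method. The variable names "heights", "lower", "upper" and "cutoff" are introduced here for illustration, and the sketch assumes that "NumCluster" is at least 2 and smaller than the number of observations.

```matlab
% Sketch: pick a color threshold between the two merge heights that
% bracket a cut into NumCluster clusters.
% Assumes 2 <= NumCluster < number of observations.
NumCluster = 3;
rand('state', 7)                      % same seeding call as in the code above
data = [rand(10,3); rand(10,3)+1; rand(10,3)+2];
dist = pdist(data, 'euclidean');
link = linkage(dist, 'complete');
clust = cluster(link, 'maxclust', NumCluster);

heights = link(:,3);                  % merge heights, ascending for complete linkage
lower = heights(end-NumCluster+1);    % highest merge that stays inside one cluster
upper = heights(end-NumCluster+2);    % lowest merge that joins two of the clusters
if upper == lower
    % Tied heights: no threshold can separate exactly NumCluster clusters.
    warning('Tied merge heights; the coloring may not match %d clusters.', NumCluster);
end
cutoff = (lower + upper) / 2;         % midpoint (hypothetical choice for this sketch)

[H,T,perm] = dendrogram(link, 0, 'colorthreshold', cutoff);
```

Using the midpoint instead of subtracting "eps" avoids relying on a difference that can be lost to floating-point rounding when the merge heights are larger than 1.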
  1 Comment
Cam Salzberger on 15 Apr 2016
Hello Denise,
That's a good suggestion. I do not believe that there is currently an easy way to do this. I have submitted an enhancement request for this functionality, so we may see it in a future release of MATLAB.
You can check the 'Color' property of the lines in the first output of "dendrogram". This would at least tell you which colors are in use. The lines appear to have been drawn from the top down on the plot, so the last entry in "H" is the top-most line segment. If you can organize the clusters by which branch off first, you may be able to work out which lines correspond to which nodes. It's a tricky proposition though.
-Cam
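A minimal sketch of the 'Color' inspection described in this comment, assuming the same example data as in the answer. The names "colors", "uniqueColors", "firstIdx" and "colorGroup" are hypothetical, and the sketch only lists the colors in use; it does not attempt to map lines to nodes.

```matlab
% Sketch: list the distinct line colors used in a thresholded dendrogram.
NumCluster = 3;
rand('state', 7)
data = [rand(10,3); rand(10,3)+1; rand(10,3)+2];
link = linkage(pdist(data, 'euclidean'), 'complete');
color = link(end-NumCluster+2,3)-eps;
[H,T,perm] = dendrogram(link, 0, 'colorthreshold', color);

% get() on an array of handles returns a cell array with one RGB triple
% per line object; stack the triples into an N-by-3 matrix.
colors = cell2mat(get(H, 'Color'));

% Group identical RGB rows; colorGroup(k) is the color index of line H(k).
[uniqueColors, firstIdx, colorGroup] = unique(colors, 'rows');

for k = 1:size(uniqueColors, 1)
    fprintf('Color [%.2f %.2f %.2f] is used by %d lines\n', ...
        uniqueColors(k,:), sum(colorGroup == k));
end
```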

Release

R14SP2
