Outputting the number of clusters found by linkage or dendrogram functions
% for example
X = rand(20000,3);
Z = linkage(X,'ward');
dendrogram(Z);
c = cluster(Z,'Maxclust',k); % k = number of clusters, which I don't know yet
I am trying to find the number of clusters found by the linkage/dendrogram functions so that I can pass that number to the cluster function. I then want to crosstab the resulting cluster vectors.
Answers (1)
Does either the second or third output argument from the dendrogram function give you the information you're looking for?
X = rand(20000,3);
Z = linkage(X,'ward');
[tree, T, outperm] = dendrogram(Z);
whos T outperm
T gives, for each data point, the number of the leaf node that contains it, and outperm is the vector of leaf node labels in the order they appear in the plot. From the documentation page, "If there are P leaves in the dendrogram plot, outperm is a permutation of the vector 1:P."
numberOfLeafNodes = max(outperm)
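Keep in mind that this count is the number of leaf nodes dendrogram was asked to draw, not a cluster count inferred from the data. When no leaf count is given, dendrogram collapses the tree to at most 30 leaf nodes. Here is a minimal sketch, assuming a smaller data set for speed and an illustrative leaf count P = 12 (both are my choices, not part of the original question), showing that the leaf count follows the requested P:

```matlab
rng(1);                              % fixed seed so the run is repeatable
X = rand(500,3);                     % smaller stand-in for the 20000-by-3 example
Z = linkage(X,'ward');

P = 12;                              % illustrative leaf count, not a recommendation
figure('Visible','off');             % assumption: keep the plot window hidden
[~, ~, outperm] = dendrogram(Z, P);  % request exactly P leaf nodes

numberOfLeafNodes = numel(outperm)   % outperm is a permutation of 1:P
c = cluster(Z,'Maxclust',P);         % one cluster label per row of X
numberOfClusters = numel(unique(c))
```

So if the goal is simply a cluster vector with as many groups as the plot has leaves, the same P can be passed to both functions.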
Let's see how many points are contained in each leaf node.
[countsPerLeafNode, edges] = histcounts(T, BinMethod="integers")
I'll create a table to summarize the results (and just show the first few rows).
results = table(edges(1:end-1).'+0.5, countsPerLeafNode.', ... % +0.5 to get bin centers
'VariableNames', ["Node number", "Count"]);
head(results)
Or if you want a picture:
histogram(T, BinMethod="integers")
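If what you want is a cluster count that comes from the tree itself rather than from a chosen number of leaves, one option is to cut the tree at a linkage distance and count the resulting labels. The sketch below is only one possible approach: it assumes a cutoff of 70% of the maximum linkage distance (the value dendrogram uses for 'ColorThreshold','default'), and the second grouping variable g is hypothetical, made up purely so there is something to crosstab against.

```matlab
rng(1);                                   % fixed seed so the run is repeatable
X = rand(500,3);                          % smaller stand-in for the original data
Z = linkage(X,'ward');

cutoff = 0.7*max(Z(:,3));                 % assumed threshold; adjust for your data
c = cluster(Z,'Cutoff',cutoff,'Criterion','distance');
numClusters = numel(unique(c))            % count of clusters below the cutoff

g = double(X(:,1) > 0.5);                 % hypothetical second grouping variable
tbl = crosstab(c, g)                      % rows: clusters, columns: values of g
```

A different cutoff gives a different count, so the threshold is still a judgment call; this just moves the decision from "how many clusters" to "how far apart".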
1 Comment
Jackson Morgan on 17 Aug 2023 (edited 17 Aug 2023)

