Confidence intervals/significance testing for dendrogram

Is there a straightforward way to generate confidence intervals or significance measures for each branch when using the dendrogram function in MATLAB?
I am just using the standard code for generating the dendrogram:
Y = pdist(Matrix, 'euclidean'); Z = linkage(Y, 'average'); [H, T] = dendrogram(Z, 20);
and would like to assess the clustering quality. From what I can tell, it would require some permutation-based inferences.
I was hoping for a built-in function, but can't find one.
There was a relatively recent publication doing this for gene research: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3023458/
"we introduce a permutation test based on comparing the within-cluster structure of the observed data with those of sample datasets obtained by permuting the cluster membership. We carry out this test at each node of the dendrogram using a statistic derived from the singular value decomposition of variance matrices. The p-values thus obtained provide insight into the significance of each cluster division"
Could this be easily implemented?
Thanks

Answers (1)

Sebastien De Landtsheer on 1 Feb 2018
As far as I know, the standard way to assess support for nodes in a tree is to compute multiple trees from bootstrapped data, count how often each group of the original tree appears in the perturbed ones, and then indicate that support on the original tree. I have some code that does this, and I might turn it into a function eventually.
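For illustration, here is a minimal sketch of that bootstrap idea. It is not the code mentioned above, only one possible reading of the approach. It assumes that the rows of the data matrix are the observations being clustered and that the columns (features) are what gets resampled, as in phylogenetic bootstrapping. The variable names, the placeholder data, the replicate count nBoot, and the helper nodeLeafSets are all hypothetical. Only pdist, linkage, randi, rng and ismember are existing MATLAB functions.

```matlab
% Sketch: bootstrap support for each internal node of a linkage tree.
% Assumption: rows of X are observations (leaves), columns are features.
rng(0);                                   % make the example reproducible
X = [randn(10, 30); randn(10, 30) + 3];   % placeholder data, replace with your Matrix
nBoot = 200;                              % number of bootstrap replicates (assumed value)

Z = linkage(pdist(X, 'euclidean'), 'average');
refSets = nodeLeafSets(Z);                % leaf membership of every original node
support = zeros(size(refSets, 1), 1);

for b = 1:nBoot
    % Resample the feature columns with replacement and rebuild the tree.
    cols = randi(size(X, 2), 1, size(X, 2));
    Zb = linkage(pdist(X(:, cols), 'euclidean'), 'average');
    bootSets = nodeLeafSets(Zb);
    % A node is "recovered" if the same set of leaves forms a cluster again.
    support = support + ismember(refSets, bootSets, 'rows');
end
support = support / nBoot;                % support(k) belongs to the merge in Z(k, :)

function S = nodeLeafSets(Z)
% Row k of S is a 0/1 mask of the leaves under the node created by Z(k, :).
    n = size(Z, 1) + 1;                   % number of leaves
    S = zeros(2*n - 1, n);
    S(1:n, :) = eye(n);                   % each leaf contains only itself
    for k = 1:n-1
        S(n + k, :) = S(Z(k, 1), :) | S(Z(k, 2), :);
    end
    S = S(n+1:end, :);                    % keep internal nodes only
end
```

Each entry of support is the fraction of replicates in which the corresponding merge of Z reappears with exactly the same members. The last entry is the root, which contains every leaf and is therefore recovered in every replicate. Note that dendrogram(Z, 20) collapses the plot to 20 leaf nodes, whereas this sketch scores every merge in Z, so mapping the values onto the plotted branches is left open here. Local functions inside a script need R2016b or later; on older releases the helper would go in its own file.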
