Confidence intervals/significance testing for dendrogram

Is there a straightforward way to generate confidence intervals or significance measures for each branch when using the dendrogram function in MATLAB?
I am just using the standard code for generating the dendrogram:
Y = pdist(Matrix, 'euclidean');
Z = linkage(Y, 'average');
[H, T] = dendrogram(Z, 20);
and would like to assess the clustering quality. From what I can tell, this would require some kind of permutation-based inference.
I was hoping for a built-in function, but I can't find one.
There was a relatively recent publication doing this for gene research: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3023458/
"we introduce a permutation test based on comparing the within-cluster structure of the observed data with those of sample datasets obtained by permuting the cluster membership. We carry out this test at each node of the dendrogram using a statistic derived from the singular value decomposition of variance matrices. The p-values thus obtained provide insight into the significance of each cluster division"
Could this be easily implemented?
Thanks

Answers (1)

Sebastien De Landtsheer on 1 Feb 2018
As far as I know, the standard way to assess support for nodes in a tree is to compute multiple trees from bootstrapped data, count how often each group of the original tree appears in the perturbed trees, and then indicate that support on the original tree. I have some code that does this, and I might write it up as a function eventually.
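For illustration, here is a minimal sketch of that bootstrap idea. It is not the code mentioned above, and it is not the permutation test from the linked paper. It assumes that Matrix is n-by-p with the rows being the items clustered, and that the columns (features) are what gets resampled; that choice, the placeholder data, nBoot, and the helper name nodeMembers are all assumptions made for the example. Local functions in a script need R2016b or later.

```matlab
% Sketch: bootstrap support for each node of an average-linkage tree.
% Assumption: rows of Matrix are the clustered items, columns are features.
rng(1);                                      % reproducible resampling
Matrix = [randn(10, 50); randn(10, 50) + 3]; % placeholder data, replace with your own
nBoot  = 200;                                % number of bootstrap replicates (arbitrary)

[n, p] = size(Matrix);
Z = linkage(pdist(Matrix, 'euclidean'), 'average');
refClusters = nodeMembers(Z, n);             % (n-1)-by-n logical, one row per node

counts = zeros(n - 1, 1);
for b = 1:nBoot
    cols = randi(p, 1, p);                   % resample columns with replacement
    Zb = linkage(pdist(Matrix(:, cols), 'euclidean'), 'average');
    bootClusters = nodeMembers(Zb, n);
    % A node is "recovered" if the same set of leaves forms a cluster in Zb.
    counts = counts + ismember(refClusters, bootClusters, 'rows');
end
support = counts / nBoot;                    % fraction of replicates recovering each node

disp([Z(:, 1:2), support]);                  % merged cluster indices and their support

function M = nodeMembers(Z, n)
% Row k of M flags the leaves below the node created at row k of Z.
% In linkage output, indices 1..n are leaves and n+k is the cluster from row k.
M = false(n - 1, n);
for k = 1:n - 1
    for c = Z(k, 1:2)
        if c <= n
            M(k, c) = true;
        else
            M(k, :) = M(k, :) | M(c - n, :);
        end
    end
end
end
```

Here support(k) belongs to the node formed at row k of Z, so the last entry is the root and is always 1. These values are bootstrap frequencies rather than p-values or confidence intervals, so they give a rough indication of how stable each branch is, not a formal significance test.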
