optimalleaforder - Determine optimal leaf ordering for hierarchical binary cluster tree

Syntax

Order = optimalleaforder(Tree, Dist)

Order = optimalleaforder(Tree, Dist, ...'Criteria', CriteriaValue, ...)
Order = optimalleaforder(Tree, Dist, ...'Transformation', TransformationValue, ...)

Arguments

Tree

Hierarchical binary cluster tree represented by an (M - 1)-by-3 matrix, created by the linkage function, where M is the number of leaves.

Dist

Distance matrix, such as that created by the pdist function.

CriteriaValue

String that specifies the optimization criterion. Choices are:
  • adjacent (default) — Minimizes the sum of distances between adjacent leaves.

  • group — Minimizes the sum of distances between every leaf and all other leaves in the adjacent cluster.
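To make the adjacent criterion concrete, the following minimal sketch computes the sum of distances between neighboring leaves for a given ordering. The helper adjacentCost and the toy data are hypothetical and not part of any toolbox. The sketch only illustrates the objective; optimalleaforder itself searches only among orderings that are consistent with the tree.

```matlab
% Hypothetical helper (not a toolbox function): sum of the distances between
% neighboring leaves for a given ordering. D is a square, symmetric distance
% matrix and order is a permutation of 1:size(D,1).
adjacentCost = @(D, order) ...
    sum(D(sub2ind(size(D), order(1:end-1), order(2:end))));

% Toy data: four points on a line at positions 0, 1, 2, and 3.
pos = [0 1 2 3];
D = abs(bsxfun(@minus, pos', pos));   % pairwise distances as a 4-by-4 matrix

costSorted   = adjacentCost(D, [1 2 3 4]);   % 1 + 1 + 1 = 3
costShuffled = adjacentCost(D, [1 3 2 4]);   % 2 + 1 + 2 = 5
```

The ordering that keeps nearby points next to each other has the lower cost, which is the quantity the adjacent criterion minimizes.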

TransformationValue

Either of the following:

  • String that specifies the algorithm to transform the distances in Dist into similarity values. Choices are:

    • linear (default) — Similarity = max(all distances) - distance

    • quadratic — Similarity = (max(all distances) - distance)^2

    • inverse — Similarity = 1/distance

  • A function handle, created using @, to a function that transforms the distances in Dist into similarity values. The function is typically monotonically decreasing over the range of the distance values. It must accept a vector input and return a vector of the same size.
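The three built-in transformations can be written out directly from the formulas above. The sketch below applies them to a small vector of toy distances; whether optimalleaforder evaluates them in exactly this way internally is an assumption. The custom handle exp(-d) is an illustrative choice, not something the function prescribes.

```matlab
% Toy distances; in practice these would be the entries of Dist.
d = [1 2 4];
dmax = max(d);

linearSim    = dmax - d;        % [3 2 0]
quadraticSim = (dmax - d).^2;   % [9 4 0]
inverseSim   = 1 ./ d;          % [1 0.5 0.25]

% Illustrative custom transformation: decreasing in d, vector in and
% vector out, as required of a TransformationValue function handle.
customSim = @(d) exp(-d);
s = customSim(d);
```

All four results shrink as the distance grows, so larger similarity always corresponds to closer leaves.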

Return Values

Order

Optimal leaf ordering for the hierarchical binary cluster tree represented by Tree.

Description

Order = optimalleaforder(Tree, Dist) returns the optimal leaf ordering for the hierarchical binary cluster tree represented by Tree, an (M - 1)-by-3 matrix, created by the linkage function, where M is the number of leaves. Optimal leaf ordering of a binary tree maximizes the similarity between adjacent elements (clusters or leaves) by flipping tree branches, but without dividing the clusters. The input Dist is a distance matrix, such as that created by the pdist function.

Order = optimalleaforder(Tree, Dist, ...'PropertyName', PropertyValue, ...) calls optimalleaforder with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotation marks and is case insensitive. These property name/property value pairs are as follows:


Order = optimalleaforder(Tree, Dist, ...'Criteria', CriteriaValue, ...) specifies the optimization criterion.

Order = optimalleaforder(Tree, Dist, ...'Transformation', TransformationValue, ...) specifies the algorithm to transform the distances in Dist into similarity values. The transformation is necessary because optimalleaforder maximizes the similarity between adjacent elements, which is comparable to minimizing the sum of distances between adjacent elements.
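The property pairs described above might be used as in the following sketch. The random input data, the variable names, and the custom handle exp(-d) are illustrative assumptions; the property names and string values are the ones listed under Arguments.

```matlab
% Illustrative input: 10 random points, Euclidean distances, average linkage.
X = rand(10,2);
Dist = pdist(X);
Tree = linkage(Dist,'average');

% Use the group criterion instead of the default adjacent criterion.
orderGroup = optimalleaforder(Tree, Dist, 'Criteria', 'group');

% Use the quadratic transformation instead of the default linear one.
orderQuad = optimalleaforder(Tree, Dist, 'Transformation', 'quadratic');

% Combine both properties and pass a custom transformation as a function
% handle (decreasing in d, returns a vector of the same size as its input).
orderCustom = optimalleaforder(Tree, Dist, ...
    'Criteria', 'adjacent', 'Transformation', @(d) exp(-d));
```

Because the input is random, the resulting orderings differ from run to run, and the three calls do not necessarily return the same ordering.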

Examples

  1. Use the rand function to create a 10-by-2 matrix of random values.

    X = rand(10,2);
  2. Use the pdist function to create a distance matrix containing the city block distances between the pairs of objects in matrix X.

    Dist = pdist(X,'cityblock');
  3. Use the linkage function to create a matrix, Tree, that represents a hierarchical binary cluster tree, from the distance matrix, Dist.

    Tree = linkage(Dist,'average');
  4. Use the optimalleaforder function to determine the optimal leaf ordering for the hierarchical binary cluster tree represented by Tree, using the distance matrix Dist.

    order = optimalleaforder(Tree,Dist)
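As a possible follow-up to the steps above, the sketch below checks that the returned ordering lists every leaf exactly once and then uses it to reorder a dendrogram. It assumes that your version of the dendrogram function accepts the 'Reorder' option; the variable isPermutation and the figure layout are illustrative.

```matlab
% Repeat the steps of the example so that this block is self-contained.
X = rand(10,2);
Dist = pdist(X,'cityblock');
Tree = linkage(Dist,'average');
order = optimalleaforder(Tree,Dist);

% Every leaf index from 1 to 10 should appear exactly once.
isPermutation = isequal(sort(order(:))', 1:size(X,1));

% Assumption: dendrogram supports the 'Reorder' option in this release.
figure
subplot(2,1,1)
dendrogram(Tree)
title('Default leaf order')
subplot(2,1,2)
dendrogram(Tree,'Reorder',order)
title('Optimal leaf order')
```

Both plots show the same tree; only the left-to-right arrangement of the branches changes.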
    

References

[1] Bar-Joseph, Z., Gifford, D.K., and Jaakkola, T.S. (2001). Fast optimal leaf ordering for hierarchical clustering. Bioinformatics 17, Suppl 1:S22–9. PMID: 11472989.

See Also

Bioinformatics Toolbox function: clustergram

Statistics Toolbox functions: linkage, pdist

  


© 1984-2008 The MathWorks, Inc.