Cluster Visualization and Evaluation

Plot clusters of data and evaluate optimal number of clusters

Cluster analysis organizes data into groups based on similarities between the data points. Sometimes the data contains natural divisions that indicate the appropriate number of clusters. Other times, the data does not contain natural divisions, or the natural divisions are unknown. In such cases, you might need to determine the optimal number of clusters for grouping your data.

To determine how well the data fits into a particular number of clusters, compute index values using different evaluation criteria, such as gap or silhouette. Visualize clusters by creating a dendrogram plot to display a hierarchical binary cluster tree. Optimize the leaf order to maximize the sum of the similarities between adjacent leaves. For grouped data with multiple measurements for each group, create a dendrogram plot based on the group means computed using a multivariate analysis of variance (MANOVA).
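
The workflow described above might be sketched as follows. This is a minimal illustration, not a recommended procedure: the synthetic data, the choice of k-means with the silhouette criterion, the candidate range 2 to 6, and average linkage are all assumptions made for the example.

```matlab
% Synthetic data (assumption for illustration): three separated 2-D groups.
rng(1);                                        % fix the seed for reproducibility
X = [randn(50,2); ...
     randn(50,2) + 6; ...
     randn(50,2) + repmat([6 -6],50,1)];

% Compute silhouette criterion values for k-means solutions with 2 to 6 clusters.
eva = evalclusters(X,'kmeans','silhouette','KList',2:6);
disp(eva.OptimalK)                             % number of clusters the criterion suggests

% Build a hierarchical binary cluster tree and optimize the leaf order.
D = pdist(X);                                  % pairwise Euclidean distances
Z = linkage(D,'average');                      % average-linkage tree (assumed method)
leafOrder = optimalleaforder(Z,D);             % reorder leaves to keep similar ones adjacent

% Plot the full tree (P = 0 shows every leaf) using the optimized order.
figure
dendrogram(Z,0,'Reorder',leafOrder)
```

Swapping `'silhouette'` for `'gap'`, `'CalinskiHarabasz'`, or `'DaviesBouldin'` evaluates the same candidate range under a different criterion, which can be a useful check when the criteria disagree.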

Functions

dendrogram Dendrogram plot
optimalleaforder Optimal leaf ordering for hierarchical clustering
manovacluster Dendrogram of group mean clusters following MANOVA
silhouette Silhouette plot
evalclusters Evaluate clustering solutions
addK Evaluate additional numbers of clusters
compact Compact clustering evaluation object
increaseB Increase reference data sets
plot Plot clustering evaluation object criterion values
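
As a rough sketch of how the evaluation-object functions listed above can fit together, the following example creates a gap criterion evaluation and then extends it. The synthetic data, the initial candidate range, and the numbers of reference data sets are assumptions chosen only to keep the example small.

```matlab
% Synthetic data (assumption for illustration): two separated 2-D groups.
rng(1);                                        % fix the seed for reproducibility
X = [randn(50,2); randn(50,2) + 6];

% Start with a small gap evaluation: 1 to 3 clusters, 20 reference data sets.
eva = evalclusters(X,'kmeans','gap','KList',1:3,'B',20);

% Evaluate additional numbers of clusters without redoing the earlier ones.
eva = addK(eva,4:5);

% Add 10 more reference data sets (applies to gap criterion objects).
eva = increaseB(eva,10);

% Plot the criterion values against the number of clusters.
figure
plot(eva)

% Create a compact copy of the evaluation object, for example before saving.
evaCompact = compact(eva);
```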

Classes

CalinskiHarabaszEvaluation Calinski-Harabasz criterion clustering evaluation object
DaviesBouldinEvaluation Davies-Bouldin criterion clustering evaluation object
GapEvaluation Gap criterion clustering evaluation object
SilhouetteEvaluation Silhouette criterion clustering evaluation object