cluster

Construct agglomerative clusters from linkages

Syntax

T = cluster(Z,'cutoff',c)
T = cluster(Z,'cutoff',c,'depth',d)
T = cluster(Z,'cutoff',c,'criterion',criterion)
T = cluster(Z,'maxclust',n)

Description

T = cluster(Z,'cutoff',c) constructs clusters from the agglomerative hierarchical cluster tree, Z, as generated by the linkage function. Z is a matrix of size (m – 1)-by-3, where m is the number of observations in the original data. c is a threshold for cutting Z into clusters. Clusters are formed when a node and all of its subnodes have inconsistent values less than c. All leaves at or below the node are grouped into a cluster. T is a vector of size m containing the cluster assignments of each observation.

If c is a vector, T is a matrix of cluster assignments with one column per cutoff value.
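
The following is a minimal sketch of the scalar and vector forms of 'cutoff'. The toy matrix X and the cutoff values are made up for illustration and are not part of the original page; only the output shapes are asserted, not the particular assignments.

```matlab
% Hypothetical toy data: six 2-D observations (m = 6).
X = [0 0; 0 1; 1 0; 10 10; 10 11; 11 10];

% Build the hierarchical tree; Z has m-1 rows and 3 columns.
Z = linkage(X,'average');

% Scalar cutoff with the default 'inconsistent' criterion:
% T is an m-by-1 vector of cluster indices.
T = cluster(Z,'cutoff',1.0);

% Vector of cutoffs: one column of assignments per cutoff value.
cutoffs = [0.5 1.0 1.5];
Tmat = cluster(Z,'cutoff',cutoffs);

disp(size(Z))      % 5-by-3
disp(size(T))      % 6-by-1
disp(size(Tmat))   % 6-by-3
```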

T = cluster(Z,'cutoff',c,'depth',d) evaluates inconsistent values by looking to a depth d below each node. The default depth is 2.
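
As a rough illustration of the 'depth' option, the sketch below compares depth 2 with depth 3 on the Fisher iris data. The cutoff of 1.0, the depth of 3, and the choice of average linkage are arbitrary assumptions for the example. The inconsistent function is used only to inspect the coefficients that the threshold is compared against.

```matlab
load fisheriris
Z = linkage(meas,'average');

% Inconsistency statistics for each link. Columns are: mean link height,
% standard deviation of link heights, number of links included, and the
% inconsistency coefficient.
Y2 = inconsistent(Z);      % default depth of 2
Y3 = inconsistent(Z,3);    % look three levels below each node

% Cluster with the same cutoff at the two depths (values chosen arbitrarily).
T2 = cluster(Z,'cutoff',1.0);
T3 = cluster(Z,'cutoff',1.0,'depth',3);

% The number of clusters can differ between the two depths.
fprintf('depth 2: %d clusters, depth 3: %d clusters\n', ...
    numel(unique(T2)),numel(unique(T3)));
```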

T = cluster(Z,'cutoff',c,'criterion',criterion) uses the specified criterion for forming clusters, where criterion is 'inconsistent' (default) or 'distance'. The 'distance' criterion uses the distance between the two subnodes merged at a node to measure node height. All leaves at or below a node with height less than c are grouped into a cluster.
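
A small sketch of the 'distance' criterion, using made-up toy data with two well-separated groups. With single linkage, the links inside each group have height at most 1 and the final link joining the two groups has height of about 13.45, so a cutoff of 5 (an arbitrary value between the two) separates the groups.

```matlab
% Hypothetical toy data: two tight groups of three points each.
X = [0 0; 0 1; 1 0; 10 10; 10 11; 11 10];
Z = linkage(X,'single');

% The third column of Z holds the height of each link.
heights = Z(:,3);

% Group all leaves under nodes whose height is less than 5.
T = cluster(Z,'cutoff',5,'criterion','distance');

disp(heights')
disp(T')   % first three observations share one label, last three another
```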

T = cluster(Z,'maxclust',n) constructs a maximum of n clusters using the 'distance' criterion. cluster finds the smallest height at which a horizontal cut through the tree leaves n or fewer clusters.

If n is a vector, T is a matrix of cluster assignments with one column per maximum value.
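
The vector form of 'maxclust' can be sketched as follows, reusing the Fisher iris data from the examples on this page. The range 2:4 is an arbitrary choice. Because each value is a maximum, the sketch counts the clusters actually produced in each column rather than assuming the counts.

```matlab
load fisheriris
Z = linkage(meas,'ward','euclidean');

% Request at most 2, 3, and 4 clusters in one call.
maxCounts = 2:4;
T = cluster(Z,'maxclust',maxCounts);   % 150-by-3, one column per value

% Count the distinct cluster labels in each column.
counts = arrayfun(@(k) numel(unique(T(:,k))),1:size(T,2));
disp(counts)
```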

Examples

Load the sample data.

load fisheriris

Compute four clusters of the Fisher iris data using Ward linkage and ignoring species information.

Z = linkage(meas,'ward','euclidean');
c = cluster(Z,'maxclust',4);

See how the cluster assignments correspond to the three species.

crosstab(c,species)
ans =

     0    25     1
     0    24    14
     0     1    35
    50     0     0

Display the first five rows of Z.

firstfive = Z(1:5,:)
firstfive =

  102.0000  143.0000         0
    8.0000   40.0000    0.1000
    1.0000   18.0000    0.1000
   10.0000   35.0000    0.1000
  129.0000  133.0000    0.1000

Create a dendrogram plot of Z.

dendrogram(Z)

Randomly generate sample data with 20,000 observations.

rng default; % For reproducibility
X = rand(20000,3);

Create a hierarchical cluster tree using Ward's linkage.

Z = linkage(X,'ward','euclidean','savememory','on');

If you set savememory to 'off', you can get an out-of-memory error if your machine does not have enough memory to hold the distance matrix.
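
To get a feel for the memory involved, the back-of-the-envelope sketch below estimates the size of the pairwise distances for 20,000 observations. It assumes the distances are stored as double-precision values (8 bytes each) in the condensed vector form that pdist returns, with one entry per pair of observations. The actual memory used by linkage may differ.

```matlab
m = 20000;                     % number of observations
numPairs = m*(m-1)/2;          % number of pairwise distances
bytesNeeded = numPairs*8;      % 8 bytes per double (assumed storage type)
gibNeeded = bytesNeeded/2^30;  % convert bytes to GiB

fprintf('%d distances, about %.2f GiB\n',numPairs,gibNeeded);
```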

Cluster data into four groups and plot the result.

c = cluster(Z,'maxclust',4);
scatter3(X(:,1),X(:,2),X(:,3),10,c)

Introduced before R2006a
