# cluster

Construct agglomerative clusters from linkages

## Syntax

```
T = cluster(Z,'cutoff',c)
T = cluster(Z,'cutoff',c,'depth',d)
T = cluster(Z,'cutoff',c,'criterion',criterion)
T = cluster(Z,'maxclust',n)
```

## Description

`T = cluster(Z,'cutoff',c)` constructs clusters from the agglomerative hierarchical cluster tree, `Z`, as generated by the `linkage` function. `Z` is a matrix of size (`m` – 1)-by-3, where `m` is the number of observations in the original data. `c` is a threshold for cutting `Z` into clusters. Clusters are formed when a node and all of its subnodes have an `inconsistent` value less than `c`. All leaves at or below the node are grouped into a cluster. `T` is a vector of size `m` containing the cluster assignment of each observation.

If `c` is a vector, `T` is a matrix of cluster assignments with one column per cutoff value.
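The scalar and vector forms of the cutoff can be compared on a small data set. The following is a minimal sketch; the toy matrix `X`, the single-linkage choice, and the threshold values are assumptions made for illustration and do not come from the original page.

```matlab
% Hypothetical toy data: two well-separated groups of three points each.
X = [0 0; 0 1; 1 0; 10 10; 10 11; 11 10];

% Build the hierarchical tree. Z has m-1 = 5 rows and 3 columns.
Z = linkage(X,'single');

% Scalar cutoff (default 'inconsistent' criterion): T has one entry per observation.
T = cluster(Z,'cutoff',1.1);

% Vector cutoff: Tmulti has one column of assignments per threshold.
Tmulti = cluster(Z,'cutoff',[0.8 1.1 1.2]);

size(T)        % 6 rows, 1 column
size(Tmulti)   % 6 rows, 3 columns
```

Each column of `Tmulti` should match the result of calling `cluster` with the corresponding scalar threshold.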

`T = cluster(Z,'cutoff',c,'depth',d)` evaluates inconsistent values by looking to a depth `d` below each node. The default depth is `2`.
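To see how the depth affects the values being thresholded, it can help to inspect the output of `inconsistent` directly. This is a sketch under assumptions: the toy data, the linkage method, and the cutoff value are illustrative only.

```matlab
% Hypothetical toy data: two well-separated groups of three points each.
X = [0 0; 0 1; 1 0; 10 10; 10 11; 11 10];
Z = linkage(X,'single');

% Inconsistency statistics computed to depth 2 (the default) and depth 3.
% The fourth column of each result is the inconsistency coefficient.
Y2 = inconsistent(Z,2);
Y3 = inconsistent(Z,3);

% Cluster with the default depth and with an explicit depth.
Tdefault = cluster(Z,'cutoff',1.1);
Tdepth2  = cluster(Z,'cutoff',1.1,'depth',2);   % same as the default
Tdepth3  = cluster(Z,'cutoff',1.1,'depth',3);   % looks one level deeper
```

Because the default depth is `2`, `Tdefault` and `Tdepth2` are expected to be identical, while `Tdepth3` may or may not differ depending on the tree.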

`T = cluster(Z,'cutoff',c,'criterion',criterion)` uses the specified criterion for forming clusters, where `criterion` is `'inconsistent'` (default) or `'distance'`. The `'distance'` criterion uses the distance between the two subnodes merged at a node to measure node height. All leaves at or below a node with height less than `c` are grouped into a cluster.
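With the `'distance'` criterion, the cutoff is compared directly with the merge heights stored in the third column of `Z`. The sketch below assumes a toy data set and a cutoff of 5, both chosen for illustration: the within-group merges occur at height 1 and the final merge joins the two groups at a much larger height, so the cutoff falls between them.

```matlab
% Hypothetical toy data: two well-separated groups of three points each.
X = [0 0; 0 1; 1 0; 10 10; 10 11; 11 10];
Z = linkage(X,'single');

% Merge heights, sorted in nondecreasing order by linkage.
heights = Z(:,3);

% Cut the tree at height 5: only nodes with height below 5 form clusters.
T = cluster(Z,'cutoff',5,'criterion','distance');

numel(unique(T))   % number of clusters found
```

Here the first three observations share one label and the last three share another, because the only merge above the cutoff is the one that joins the two groups.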

`T = cluster(Z,'maxclust',n)` constructs a maximum of `n` clusters using the `'distance'` criterion. `cluster` finds the smallest height at which a horizontal cut through the tree leaves `n` or fewer clusters.

If `n` is a vector, `T` is a matrix of cluster assignments with one column per maximum value.
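The `'maxclust'` form can be sketched in the same way. The toy data and the choice of Ward linkage are assumptions for illustration. Note that `n` is an upper bound, so a column may contain fewer than `n` distinct labels when several merges share the same height.

```matlab
% Hypothetical toy data: two well-separated groups of three points each.
X = [0 0; 0 1; 1 0; 10 10; 10 11; 11 10];
Z = linkage(X,'ward','euclidean');

% Scalar n: at most two clusters.
T = cluster(Z,'maxclust',2);

% Vector n: one column of assignments per maximum value.
Tn = cluster(Z,'maxclust',[2 3 4]);

% Number of clusters actually produced in each column (each is <= its n).
counts = arrayfun(@(k) numel(unique(Tn(:,k))), 1:size(Tn,2));
```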

## Examples

Load the sample data.

```load fisheriris ```

Compute four clusters of the Fisher iris data using Ward linkage and ignoring species information.

```
Z = linkage(meas,'ward','euclidean');
c = cluster(Z,'maxclust',4);
```

See how the cluster assignments correspond to the three species.

```crosstab(c,species) ```
```
ans =

     0    25     1
     0    24    14
     0     1    35
    50     0     0
```

Display the first five rows of `Z`.

```firstfive = Z(1:5,:) ```
```
firstfive =

  102.0000  143.0000         0
    8.0000   40.0000    0.1000
    1.0000   18.0000    0.1000
   10.0000   35.0000    0.1000
  129.0000  133.0000    0.1000
```

Create a dendrogram plot of `Z`.

```dendrogram(Z) ```

Randomly generate sample data with 20,000 observations.

```
rng default; % For reproducibility
X = rand(20000,3);
```

Create a hierarchical cluster tree using Ward's linkage.

```Z = linkage(X,'ward','euclidean','savememory','on'); ```

If you set `savememory` to `'off'`, you can get an out-of-memory error if your machine does not have enough memory to hold the distance matrix.

Cluster data into four groups and plot the result.

```
c = cluster(Z,'maxclust',4);
scatter3(X(:,1),X(:,2),X(:,3),10,c)
```

## See Also

#### Introduced before R2006a
