T = clusterdata(X,cutoff)
T = clusterdata(X,Name,Value)
T = clusterdata(X,Name,Value) clusters with additional options specified by one or more Name,Value pair arguments.
The centroid and median methods can produce a cluster tree that is not monotonic. This occurs when the distance from the union of two clusters, r and s, to a third cluster is less than the distance between r and s. In this case, in a dendrogram drawn with the default orientation, the path from a leaf to the root node takes some downward steps. To avoid this, use another method. The following image shows a nonmonotonic cluster tree.

In this case, cluster 1 and cluster 3 are joined into a new cluster, while the distance between this new cluster and cluster 2 is less than the distance between cluster 1 and cluster 3. This leads to a nonmonotonic tree.
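The effect can be reproduced with a minimal sketch on three hypothetical points, chosen only for illustration (they are not the data behind the figure). The sketch calls linkage directly, because the merge heights are stored in the tree rather than in the cluster assignments.

```matlab
% Three illustrative points forming a nearly equilateral triangle
% (hypothetical data, not taken from the figure).
X = [0 0; 1 0; 0.5 0.9];

% Build the tree with centroid linkage on Euclidean distances.
Z = linkage(X,'centroid');

% Column 3 of Z holds the merge heights. Points 1 and 2 are the
% closest pair (distance 1), but their centroid (0.5, 0) lies only
% 0.9 from point 3, so the second merge is lower than the first.
heights = Z(:,3);
isMonotonic = all(diff(heights) >= 0);
fprintf('Tree is monotonic: %d\n', isMonotonic);
```

Replacing 'centroid' with a method such as 'average' gives nondecreasing heights for the same points, which is the remedy suggested above.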
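The output T is a vector of cluster indices, one for each row of X. To display the tree with dendrogram, compute inconsistency coefficients with inconsistent, or compute the cophenetic correlation coefficient with cophenet, build the tree explicitly with linkage and pass the linkage output to those functions; cluster then assigns points to clusters from the same tree.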
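clusterdata combines the distance, linkage, and clustering steps in one call. The sketch below spells out those steps with pdist, linkage, and cluster, so that the intermediate tree Z is available to functions such as dendrogram, inconsistent, and cophenet. It assumes that clusterdata defaults to Euclidean distance and single linkage, which is worth confirming for your release; the data and variable names are illustrative.

```matlab
% Illustrative data: three well-separated groups of three points.
X = [0 0; 0 1; 1 0; 10 10; 10 11; 11 10; 20 0; 20 1; 21 0];

% One-call form.
T1 = clusterdata(X,'maxclust',3);

% Step-by-step form (assumed defaults: Euclidean, single linkage).
Y = pdist(X,'euclidean');        % pairwise distances
Z = linkage(Y,'single');         % hierarchical cluster tree
T2 = cluster(Z,'maxclust',3);    % cut the tree into at most 3 clusters

% Compare the two partitions independently of cluster numbering.
samePartition = isequal(bsxfun(@eq,T1,T1'), bsxfun(@eq,T2,T2'));

% These functions take the tree Z as input.
c = cophenet(Z,Y);               % cophenetic correlation coefficient
I = inconsistent(Z);             % inconsistency coefficients
% dendrogram(Z)                  % uncomment to draw the tree
```

Comparing co-membership matrices rather than the label vectors keeps the check valid even if the two calls number the clusters differently.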
X
Matrix with two or more rows. Rows represent observations; columns represent categories or dimensions.
cutoff
When 0 < cutoff < 2, clusterdata forms clusters when inconsistent values are greater than cutoff (see inconsistent). When cutoff is an integer ≥ 2, clusterdata interprets cutoff as the maximum number of clusters to keep in the hierarchical tree generated by linkage.
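The two interpretations of cutoff can be compared side by side. The sketch below uses a small hypothetical data set and an arbitrary inconsistency threshold of 1.1; both are assumptions made for illustration.

```matlab
% Illustrative data: three well-separated groups of three points.
X = [0 0; 0 1; 1 0; 10 10; 10 11; 11 10; 20 0; 20 1; 21 0];

% Integer cutoff >= 2: treated as the maximum number of clusters.
Tmax = clusterdata(X,3);

% Cutoff between 0 and 2: treated as a threshold on the
% inconsistent values (1.1 is an arbitrary illustrative choice).
Tinc = clusterdata(X,1.1);

numMax = numel(unique(Tmax));
numInc = numel(unique(Tinc));
fprintf('cutoff = 3   -> %d clusters\n', numMax);
fprintf('cutoff = 1.1 -> %d clusters\n', numInc);
```

The number of clusters produced by the inconsistency threshold depends on the data, so it is printed rather than assumed.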
Specify optional comma-separated pairs of Name,Value arguments, where Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
'criterion'
Either 'inconsistent' or 'distance'.
'cutoff'
Cutoff for inconsistent or distance measure, a positive scalar. When 0 < cutoff < 2, clusterdata forms clusters when inconsistent values are greater than cutoff (see inconsistent). When cutoff is an integer ≥ 2, clusterdata interprets cutoff as the maximum number of clusters to keep in the hierarchical tree generated by linkage.
'depth'
Depth for computing inconsistent values, a positive integer.
'distance'
Any of the distance metric names allowed by pdist (follow the 'minkowski' option by the value of the exponent p).
'linkage'
Any of the linkage methods allowed by the linkage function. For details, see the definitions in the linkage function reference page.
'maxclust'
Maximum number of clusters to form, a positive integer.
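As an illustration of combining several of these pairs, the sketch below cuts an average-linkage tree at a fixed distance instead of using the inconsistency measure. The data set and the cutoff of 5 are hypothetical values chosen so that the three groups are clearly separated.

```matlab
% Illustrative data: three well-separated groups of three points.
X = [0 0; 0 1; 1 0; 10 10; 10 11; 11 10; 20 0; 20 1; 21 0];

% Average linkage on Euclidean distances; merges whose height is
% below the distance cutoff stay together in one cluster.
T = clusterdata(X,'distance','euclidean','linkage','average', ...
    'criterion','distance','cutoff',5);

numClusters = numel(unique(T));
fprintf('Distance cutoff of 5 gives %d clusters\n', numClusters);
```

Within each group the merge heights are close to 1, while the groups are more than 10 apart, so a cutoff of 5 separates the groups without splitting them.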
The example first creates a sample data set of random numbers. It then uses clusterdata to compute the distances between items in the data set and create a hierarchical cluster tree from the data set. Finally, the clusterdata function groups the items in the data set into three clusters. The example uses the find function to list all the items in cluster 2, and the scatter3 function to plot the data with each cluster shown in a different color.
X = [gallery('uniformdata',[10 3],12);...
gallery('uniformdata',[10 3],13)+1.2;...
gallery('uniformdata',[10 3],14)+2.5];
T = clusterdata(X,'maxclust',3);
find(T==2)
ans =
11
12
13
14
15
16
17
18
19
20
scatter3(X(:,1),X(:,2),X(:,3),100,T,'filled')

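Create a hierarchical cluster tree for a data set with 20,000 observations using Ward's linkage. If you set 'savememory' to 'off', you can get an out-of-memory error if your machine does not have enough memory to hold the distance matrix, which for 20,000 observations contains 20000*19999/2 (about 2 × 10^8) pairwise distances, or roughly 1.6 GB in double precision.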
X = rand(20000,3);
c = clusterdata(X,'linkage','ward','savememory','on',...
'maxclust',4);
scatter3(X(:,1),X(:,2),X(:,3),10,c)

cluster | inconsistent | kmeans | linkage | pdist
© 1984-2012 The MathWorks, Inc.