Statistics Toolbox™
T = clusterdata(X,cutoff)
T = clusterdata(X,param1,val1,param2,val2,...)
T = clusterdata(X,cutoff) uses the pdist, linkage, and cluster functions to construct clusters from data X. X is an m-by-n matrix, treated as m observations of n variables. cutoff is a threshold for cutting the hierarchical tree generated by linkage into clusters. When 0 < cutoff < 2, clusterdata forms clusters when inconsistent values are greater than cutoff (see inconsistent). When cutoff is an integer ≥ 2, clusterdata interprets cutoff as the maximum number of clusters to keep in the hierarchical tree generated by linkage. The output T is a vector of length m containing a cluster number for each observation.
When 0 < cutoff < 2, T = clusterdata(X,cutoff) is equivalent to:
Y = pdist(X,'euclid');
Z = linkage(Y,'single');
T = cluster(Z,'cutoff',cutoff);
When cutoff is an integer ≥ 2, T = clusterdata(X,cutoff) is equivalent to:
Y = pdist(X,'euclid');
Z = linkage(Y,'single');
T = cluster(Z,'maxclust',cutoff);
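Both equivalences stated above can be checked directly. The following is a minimal sketch, assuming Statistics Toolbox is installed. The sample data, the threshold 0.9, and the variable names (T1, T2, Ta, Tb, sameMaxclust, sameCutoff) are illustrative choices, and 'euclidean' is spelled out in place of the abbreviated 'euclid'.

```matlab
% Illustrative data: three groups of 10 points in 3-D, shifted apart.
X = [rand(10,3); rand(10,3)+1.2; rand(10,3)+2.5];

% Build the tree once with the defaults that clusterdata(X,cutoff) uses.
Y = pdist(X,'euclidean');
Z = linkage(Y,'single');

% Case 1: an integer cutoff >= 2 is a maximum number of clusters.
T1 = clusterdata(X,3);
T2 = cluster(Z,'maxclust',3);
sameMaxclust = isequal(T1,T2);

% Case 2: 0 < cutoff < 2 is a threshold on the inconsistent values.
Ta = clusterdata(X,0.9);
Tb = cluster(Z,'cutoff',0.9);
sameCutoff = isequal(Ta,Tb);
```

If the equivalences hold as described, both sameMaxclust and sameCutoff should be true for any input X.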
T = clusterdata(X,param1,val1,param2,val2,...) provides more control over the clustering through a set of parameter/value pairs. Valid parameters are:
| Parameter | Value |
| --- | --- |
| 'distance' | Any of the distance metric names allowed by pdist (follow the 'minkowski' option by the value of the exponent p) |
| 'linkage' | Any of the linkage methods allowed by the linkage function |
| 'cutoff' | Cutoff for inconsistent or distance measure |
| 'maxclust' | Maximum number of clusters to form |
| 'criterion' | Either 'inconsistent' or 'distance' |
| 'depth' | Depth for computing inconsistent values |
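As an illustration of the parameter/value syntax, the sketch below combines several of the parameters listed above. The data, the specific values (1.5, 0.9, depth 3), and the variable names are assumptions chosen for demonstration only; 'cityblock' and 'average' are taken to be among the names accepted by pdist and linkage respectively.

```matlab
% Illustrative data: three groups of 10 points in 3-D, shifted apart.
X = [rand(10,3); rand(10,3)+1.2; rand(10,3)+2.5];

% City block distance with average linkage, keeping at most 3 clusters.
T1 = clusterdata(X,'distance','cityblock','linkage','average','maxclust',3);

% Cut the tree by the distance criterion instead of inconsistent values.
T2 = clusterdata(X,'criterion','distance','cutoff',1.5);

% Inconsistent criterion, with inconsistent values computed to depth 3.
T3 = clusterdata(X,'cutoff',0.9,'depth',3);
```

Each call returns one cluster number per row of X; the number of clusters produced by the two 'cutoff' calls depends on the data.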
The example first creates a sample data set of random numbers. It then uses clusterdata to compute the distances between items in the data set and create a hierarchical cluster tree from the data set. Finally, the clusterdata function groups the items in the data set into three clusters. The example uses the find function to list all the items in cluster 2, and the scatter3 function to plot the data with each cluster shown in a different color.
rand('state',12);
X = [rand(10,3); rand(10,3)+1.2; rand(10,3)+2.5];
T = clusterdata(X,'maxclust',3);
find(T==2)
ans =
11
12
13
14
15
16
17
18
19
20
scatter3(X(:,1),X(:,2),X(:,3),100,T,'filled')
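
Beyond listing a single cluster with find, it can be useful to summarize every cluster at once. The following is a minimal sketch using accumarray from base MATLAB. The variable name counts is illustrative, and because the data are not seeded here, the exact membership may differ from the listing above.

```matlab
% Same construction as the example: three groups of 10 points in 3-D.
X = [rand(10,3); rand(10,3)+1.2; rand(10,3)+2.5];
T = clusterdata(X,'maxclust',3);

% counts(k) is the number of observations assigned to cluster k.
counts = accumarray(T,1);

% Print the size and the member rows of each cluster.
for k = 1:numel(counts)
    fprintf('Cluster %d: %d observations, rows %s\n', ...
        k, counts(k), mat2str(find(T==k)'));
end
```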

cluster, inconsistent, kmeans, linkage, pdist
© 1984-2008 The MathWorks, Inc.