Package: clustering.evaluation
Superclasses: clustering.evaluation.ClusterCriterion
Gap criterion clustering evaluation object
clustering.evaluation.GapEvaluation is an object consisting of sample data,
clustering data, and gap criterion values used to evaluate the optimal number
of clusters. Create a gap criterion clustering evaluation object using
evalclusters.
eva = evalclusters(x,clust,'Gap') creates a gap criterion clustering evaluation object.
eva = evalclusters(x,clust,'Gap',Name,Value) creates a gap criterion clustering evaluation object using additional options specified by one or more name-value pair arguments.
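
The syntax above can be sketched as follows. This is a minimal example rather than a definitive recipe: the fisheriris sample data set and the option names 'KList', 'B', and 'SearchMethod' are assumptions recalled from the evalclusters documentation, since this page does not list the available name-value pairs.

```matlab
% Sketch: evaluate k-means solutions for k = 1..6 with the gap criterion.
load fisheriris              % sample data set; provides the 150-by-4 matrix meas
rng('default')               % fix the seed so the generated reference data are reproducible

% 'KList', 'B', and 'SearchMethod' are assumed option names (see the note above).
eva = evalclusters(meas,'kmeans','Gap', ...
    'KList',1:6, ...                 % candidate numbers of clusters
    'B',50, ...                      % number of reference data sets
    'SearchMethod','globalMaxSE');   % rule used to pick the optimal k

disp(class(eva))             % class of the returned evaluation object
disp(eva.OptimalK)           % suggested number of clusters
```

A small 'B' keeps the run short; a larger value gives a steadier estimate of the reference dispersion at the cost of more clustering runs.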

Number of data sets generated from the reference distribution, stored as a positive integer value. 

Clustering algorithm used to cluster the input data, stored
as a valid clustering algorithm name or function handle. If the clustering
solutions are provided in the input, 

Name of the criterion used for clustering evaluation, stored as a valid criterion name. 

Criterion values corresponding to each proposed number of clusters
in 

Distance measure used for clustering data, stored as a valid distance measure name. 

Expectation of the natural logarithm of W based on the generated
reference data, stored as a vector of scalar values. W is
the within-cluster dispersion computed using the distance measurement 

List of the number of proposed clusters for which to compute criterion values, stored as a vector of positive integer values. 

Natural logarithm of W based on the input
data, stored as a vector of scalar values. W is
the within-cluster dispersion computed using the distance measurement 

Logical flag for excluded data, stored as a column vector of
logical values. If 

Number of observations in the data matrix 

Optimal number of clusters, stored as a positive integer value. 

Optimal clustering solution corresponding to 

Reference data generation method, stored as a valid reference distribution name. 

Standard error of the natural logarithm of W with
respect to the reference data for each number of clusters in 

Method for determining the optimal number of clusters, stored as a valid search method name. 

Standard deviation of the natural logarithm of W with
respect to the reference data for each number of clusters in 

Data used for clustering, stored as a matrix of numerical values. 
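
The stored quantities described above can be read back from the object and compared with the gap statistic of reference [1]. The sketch below makes several assumptions. The property names (InspectedK, CriterionValues, ExpectedLogW, LogW, StdLogW, SE, B, OptimalK, NumObservations) are recalled from the class documentation and do not appear in the text above. The relations gap = E[log W] - log W and SE = sd * sqrt(1 + 1/B) come from [1] and are not confirmed here to be exactly how the object computes its values, so the example prints them side by side for inspection.

```matlab
% Sketch: inspect the stored gap quantities (property names are assumptions).
load fisheriris
rng('default')
eva = evalclusters(meas,'kmeans','Gap','KList',1:5,'B',30);

k = eva.InspectedK(:);                                  % proposed numbers of clusters

% Gap statistic as defined in [1]: expected log(W) under the reference
% distribution minus log(W) of the input data (assumed relation).
gapFromDefinition = eva.ExpectedLogW(:) - eva.LogW(:);

% Standard error as defined in [1]: standard deviation inflated by
% sqrt(1 + 1/B) to account for simulation error (assumed relation).
seFromStd = eva.StdLogW(:) * sqrt(1 + 1/eva.B);

% Show stored and recomputed values next to each other.
disp(table(k, eva.CriterionValues(:), gapFromDefinition, eva.SE(:), seFromStd, ...
    'VariableNames', {'K','StoredGap','RecomputedGap','StoredSE','RecomputedSE'}))

fprintf('Optimal number of clusters: %d\n', eva.OptimalK);
```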
increaseB    Increase reference data sets
addK         Evaluate additional numbers of clusters
compact      Compact clustering evaluation object
plot         Plot clustering evaluation object criterion values
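
The methods listed above might be used as in the following sketch. The call forms addK(eva,klist), increaseB(eva,nref), compact(eva), and plot(eva) are written as recalled from the method reference pages and should be treated as assumptions, as should the 'KList' and 'B' options and the fisheriris sample data.

```matlab
% Sketch: extend, compact, and plot an existing gap evaluation object.
load fisheriris
rng('default')
eva = evalclusters(meas,'kmeans','Gap','KList',1:4,'B',30);

eva = addK(eva, 5:6);        % also evaluate 5 and 6 clusters
eva = increaseB(eva, 20);    % add 20 reference data sets to the existing 30

small = compact(eva);        % compact copy of the evaluation object

plot(eva)                    % criterion values against number of clusters
```

Both addK and increaseB return an updated object, so the result has to be assigned back for the change to take effect.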
[1] Tibshirani, R., G. Walther, and T. Hastie. "Estimating the number of clusters in a data set via the gap statistic." Journal of the Royal Statistical Society: Series B. Vol. 63, Part 2, 2001, pp. 411–423.