increaseB

Class: clustering.evaluation.GapEvaluation
Package: clustering.evaluation

Increase reference data sets

Syntax

eva_out = increaseB(eva,nref)

Description

eva_out = increaseB(eva,nref) returns a gap criterion clustering evaluation object eva_out that uses the same evaluation criteria as the input object eva and an additional number of reference data sets as specified by nref.

Input Arguments

eva — Clustering evaluation data
clustering evaluation object

Clustering evaluation data, specified as a gap criterion clustering evaluation object. Create the object using evalclusters with the 'gap' criterion.

nref — Number of additional reference data sets
positive integer value

Number of additional reference data sets, specified as a positive integer value.

Output Arguments

eva_out — Updated clustering evaluation data
clustering evaluation object

Updated clustering evaluation data, returned as a gap criterion clustering evaluation object. eva_out contains evaluation data obtained using the reference data sets from the input object eva plus a number of additional reference data sets as specified in nref.

In eva_out, the B property reflects the increased number of reference data sets used to compute the gap criterion values, and the CriterionValues property contains gap criterion values computed using the total number of reference data sets. The OptimalK and OptimalY properties might also change to reflect the optimal number of clusters and the optimal clustering solution as determined using the total number of reference data sets. The LogW, ExpectedLogW, StdLogW, and SE properties might change as well.
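
For orientation, the standard textbook formulation of the gap statistic shows where the number of reference data sets B enters the computation. This formulation is not taken from this page, and the correspondence between its terms and the LogW, ExpectedLogW, StdLogW, and SE properties is an assumption based on the property names.

```latex
% W_k      : within-cluster dispersion of the data for k clusters
% W*_{kb}  : the same quantity for the b-th of B reference data sets
\[
\mathrm{Gap}(k) = \frac{1}{B}\sum_{b=1}^{B} \log W^{*}_{kb} \;-\; \log W_k ,
\qquad
\mathrm{SE}(k) = \mathrm{sd}(k)\,\sqrt{1 + \frac{1}{B}}
\]
% sd(k) is the standard deviation of log W*_{kb} over the B reference data sets
```

Under this formulation, a larger B gives a more stable estimate of the expected reference dispersion, so the gap values and the selected number of clusters can shift when reference data sets are added.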

Examples

Evaluate Clustering Solutions Using Additional Reference Data

Create a gap criterion clustering evaluation object using evalclusters, then use increaseB to increase the number of reference data sets used to compute the gap criterion values.

Load the sample data.

load fisheriris;

The data contains length and width measurements from the sepals and petals of three species of iris flowers.

Cluster the flower measurement data using kmeans, and use the gap criterion to evaluate proposed solutions of one through five clusters. Use 50 reference data sets.

eva = evalclusters(meas,'kmeans','gap','klist',1:5,'B',50)
eva = 

  GapEvaluation with properties:

    NumObservations: 150
         InspectedK: [1 2 3 4 5]
    CriterionValues: [0.0848 0.5920 0.8750 1.0044 1.0462]
           OptimalK: 5

The clustering evaluation object eva contains data on each proposed clustering solution. The returned results indicate that the optimal number of clusters is five.

The value of the B property of eva shows 50 reference data sets.

eva.B
ans =

    50

Increase the number of reference data sets by 50, for a total of 100 sets.

eva = increaseB(eva,50)
eva = 

  GapEvaluation with properties:

    NumObservations: 150
         InspectedK: [1 2 3 4 5]
    CriterionValues: [0.0824 0.5899 0.8742 1.0044 1.0463]
           OptimalK: 4

The returned results now indicate that the optimal number of clusters is four.

The value of the B property of eva now shows 100 reference data sets.

eva.B
ans =

   100
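
To see how the added reference data sets change the results, one option is to save the values of interest before calling increaseB and compare them afterward. The following is a minimal sketch rather than part of the original example. The call to rng and the variable names gapBefore, seBefore, kBefore, and comparison are illustrative assumptions, and the numeric results depend on the random reference data and on the software release.

```matlab
% Sketch: compare gap criterion results before and after increaseB.
load fisheriris
rng('default')   % assumption: fix the seed so the run is repeatable

eva = evalclusters(meas,'kmeans','gap','KList',1:5,'B',50);

% Save the values computed with the first 50 reference data sets
gapBefore = eva.CriterionValues(:);
seBefore  = eva.SE(:);
kBefore   = eva.OptimalK;

% Add 50 more reference data sets, for a total of 100
eva = increaseB(eva,50);

% One row per inspected number of clusters
comparison = table(eva.InspectedK(:), gapBefore, eva.CriterionValues(:), ...
    seBefore, eva.SE(:), ...
    'VariableNames', {'K','GapB50','GapB100','SEB50','SEB100'});
disp(comparison)

fprintf('OptimalK: %d with B = 50, %d with B = %d\n', ...
    kBefore, eva.OptimalK, eva.B);
```

Saving the values first avoids relying on whether the original object is still available after the call, because the example above overwrites eva with the updated object.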
