Silhouette plot

`silhouette(X,clust)`

`s = silhouette(X,clust)`

`[s,h] = silhouette(X,clust)`

`[...] = silhouette(X,clust,metric)`

`[...] = silhouette(X,clust,distfun,p1,p2,...)`

`silhouette(X,clust)` plots cluster silhouettes for the *n*-by-*p* data matrix `X`, with clusters defined by `clust`. Rows of `X` correspond to points, and columns correspond to coordinates. `clust` can be a categorical variable, numeric vector, character matrix, or cell array of character vectors containing a cluster name for each point. `silhouette` treats `NaN` values or empty character vectors in `clust` as missing values, and ignores the corresponding rows of `X`. By default, `silhouette` uses the squared Euclidean distance between points in `X`.
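As a minimal sketch of the basic call, the following uses made-up data (two well-separated groups) and hypothetical cluster names `'a'` and `'b'`. One label is left empty to illustrate the missing-value handling described above.

```matlab
% Illustrative data only: two 2-D groups centred near (3,3) and (-3,-3).
rng(1);                                   % fix the seed so the run is repeatable
X = [randn(20,2) + 3; randn(20,2) - 3];   % 40-by-2: rows are points, columns are coordinates

% One cluster name per row of X, as a cell array of character vectors.
clust = [repmat({'a'},20,1); repmat({'b'},20,1)];

% An empty character vector marks a missing label, so this row of X is ignored.
clust{5} = '';

silhouette(X,clust)                       % plots, using squared Euclidean distance by default
```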

`s = silhouette(X,clust)` returns the silhouette values in the *n*-by-1 vector `s`, but does not plot the cluster silhouettes.

`[s,h] = silhouette(X,clust)` plots the silhouettes and returns the silhouette values in the *n*-by-1 vector `s` and the figure handle in `h`.
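The two output forms can be compared side by side. The sketch below uses made-up data and numeric labels. Averaging `s` is one common way to summarize a clustering, not something `silhouette` does itself.

```matlab
rng(1);
X = [randn(20,2) + 3; randn(20,2) - 3];   % illustrative 40-by-2 data
clust = [ones(20,1); 2*ones(20,1)];       % numeric cluster labels, one per row of X

s = silhouette(X,clust);                  % values only, no plot
meanS = mean(s);                          % average silhouette value as a single summary

[s2,h] = silhouette(X,clust);             % same values, plus a plot and its figure handle
```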

`[...] = silhouette(X,clust,metric)` plots the silhouettes using the inter-point distance function specified in `metric`. Choices for `metric` are:

Metric | Description |
---|---|
`'Euclidean'` | Euclidean distance |
`'sqEuclidean'` | Squared Euclidean distance (default) |
`'cityblock'` | Sum of absolute differences |
`'cosine'` | One minus the cosine of the included angle between points (treated as vectors) |
`'correlation'` | One minus the sample correlation between points (treated as sequences of values) |
`'Hamming'` | Percentage of coordinates that differ |
`'Jaccard'` | Percentage of nonzero coordinates that differ |
Vector | A numeric distance matrix in upper triangular vector form, such as is created by |

For more information on each metric, see Distance Metrics.
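A short sketch of selecting a metric by name, and of supplying precomputed distances in vector form, follows. The data are made up. It assumes that `pdist` returns pairwise distances in the upper triangular vector layout that `silhouette` expects, and it passes `X` alongside the vector even though only the distances should matter in that case.

```matlab
rng(1);
X = [randn(20,2) + 3; randn(20,2) - 3];    % illustrative 40-by-2 data
clust = [ones(20,1); 2*ones(20,1)];

sCity = silhouette(X,clust,'cityblock');   % sum of absolute differences
sEuc  = silhouette(X,clust,'Euclidean');   % Euclidean distance

% Precomputed distances: a vector with one entry per pair of points.
D = pdist(X,'cityblock');                  % length n*(n-1)/2
sVec = silhouette(X,clust,D);              % expected to match sCity
```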

`[...] = silhouette(X,clust,distfun,p1,p2,...)` accepts a function handle `distfun` to a metric of the form

`d = distfun(X0,X,p1,p2,...)`

where `X0` is a `1`-by-`p` point, `X` is an `n`-by-`p` matrix of points, and `p1,p2,...` are optional additional arguments. The function `distfun` returns an `n`-by-`1` vector `d` of distances between `X0` and each point (row) in `X`. The arguments `p1`, `p2`, `...` are passed directly to the function `distfun`.
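To illustrate the `distfun` form, here is a sketch with a hypothetical weighted city block metric. The name `weightedCityblock` and the weight vector `w` are inventions for this example. `w` plays the role of `p1` and is forwarded unchanged to the function.

```matlab
rng(1);
X = [randn(20,2) + 3; randn(20,2) - 3];    % illustrative 40-by-2 data
clust = [ones(20,1); 2*ones(20,1)];

% Hypothetical metric: X0 is 1-by-p, X is n-by-p, w is a 1-by-p weight vector.
% bsxfun expands X0 and w across the rows of X; the result is n-by-1.
weightedCityblock = @(X0,X,w) ...
    sum(bsxfun(@times, abs(bsxfun(@minus,X,X0)), w), 2);

w = [1 0.5];                               % extra argument p1, passed through to distfun
s = silhouette(X,clust,weightedCityblock,w);

% With unit weights the metric reduces to the built-in 'cityblock' choice.
sUnit = silhouette(X,clust,weightedCityblock,[1 1]);
```

Comparing `sUnit` with `silhouette(X,clust,'cityblock')` is a quick sanity check that a custom handle has the expected input and output shapes.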


See also: `dendrogram` | `evalclusters` | `kmeans` | `linkage` | `pdist`
