The silhouette value for each point is a measure of how similar that point is to points in its
own cluster, when compared to points in other clusters. The silhouette value $S_i$ for the $i$th point is defined as

$$S_i = \frac{b_i - a_i}{\max(a_i, b_i)}$$

where $a_i$ is the average distance from the $i$th point to the other points in the same cluster as $i$, and $b_i$ is the minimum average distance from the $i$th point to points in a different cluster, minimized over clusters.
The silhouette value ranges from -1 to 1. A high silhouette value indicates that $i$ is well matched to its own
cluster, and poorly matched to other clusters. If most points have a high silhouette
value, then the clustering solution is appropriate. If many points have a low or
negative silhouette value, then the clustering solution might have too many or too
few clusters. You can use silhouette values as a clustering evaluation criterion
with any distance metric.
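
The definition can be illustrated with a minimal Python sketch that uses only the standard library. It is not the implementation of any particular toolbox. The function name `silhouette_values`, the choice of Euclidean distance, and the convention of assigning 0 to a point that is alone in its cluster are assumptions made for this illustration.

```python
import math


def silhouette_values(points, labels):
    """Return one silhouette value per point (Euclidean distance assumed)."""
    clusters = set(labels)
    if len(clusters) < 2:
        raise ValueError("silhouette values need at least two clusters")

    values = []
    for i, p in enumerate(points):
        # Average distance from point i to the members of each cluster,
        # leaving point i itself out of its own cluster's average.
        mean_dist = {}
        for c in clusters:
            dists = [math.dist(p, q)
                     for j, q in enumerate(points)
                     if labels[j] == c and j != i]
            if dists:
                mean_dist[c] = sum(dists) / len(dists)

        own = labels[i]
        if own not in mean_dist:
            # Point i is alone in its cluster, so a_i is undefined.
            # Assumed convention for this sketch: report 0.
            values.append(0.0)
            continue

        a = mean_dist[own]                                       # a_i
        b = min(d for c, d in mean_dist.items() if c != own)     # b_i
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


# Two tight, well-separated groups of made-up points.
points = [(0, 0), (0, 1), (5, 5), (5, 6)]
good = silhouette_values(points, [0, 0, 1, 1])   # labels follow the groups
bad = silhouette_values(points, [0, 1, 0, 1])    # labels cut across the groups
```

With labels that follow the two groups, every value in `good` is close to 1. With labels that cut across the groups, the values in `bad` are negative, which is the signal described above that the clustering solution is a poor fit. Swapping `math.dist` for another distance function is the only change needed to evaluate the same criterion under a different metric.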