Main Content

Spectral clustering

partitions observations in the `idx`

= spectralcluster(`X`

,`k`

)*n*-by-*p* data matrix
`X`

into `k`

clusters using the spectral clustering
algorithm (see Algorithms).
`spectralcluster`

returns an *n*-by-1 vector
`idx`

containing cluster indices of each observation.

returns a vector of cluster indices for `idx`

= spectralcluster(`S`

,`k`

,`'Distance'`

,'precomputed')`S`

, the similarity matrix (or adjacency matrix) of a similarity graph. `S`

can be the output of `adjacency`

.

To use a similarity matrix as the first input, you must specify
`'Distance','precomputed'`

.

specifies additional options using one or more name-value pair arguments in addition to
the input arguments in previous syntaxes. For example, you can specify
`idx`

= spectralcluster(___,`Name,Value`

)`'SimilarityGraph','epsilon'`

to construct a similarity graph using the
radius search method.

`[`

also returns the eigenvectors `idx`

,`V`

] = spectralcluster(___)`V`

corresponding to the
`k`

smallest eigenvalues of the Laplacian
matrix.

Consider using spectral clustering when the clusters in your data do not naturally correspond to convex regions.

From the spectral clustering algorithm, you can estimate the number of clusters

`k`

as:For an example, see Estimate Number of Clusters and Perform Spectral Clustering.

Spectral clustering is a graph-based algorithm for clustering data points (or observations
in `X`

). The algorithm involves constructing a graph, finding its Laplacian matrix, and
using this matrix to find *k* eigenvectors to split the graph
*k* ways. By default, the algorithm for
`spectralcluster`

computes the normalized random-walk Laplacian matrix
using the method described by Shi-Malik [2].
`spectralcluster`

also supports the unnormalized Laplacian matrix and the
normalized symmetric Laplacian matrix which uses the Ng-Jordan-Weiss method [3].
`spectralcluster`

implements clustering as follows:

For each data point in

`X`

, define a local neighborhood using either the radius search method or nearest neighbor method, as specified by the`'SimilarityGraph'`

name-value pair argument (see Similarity Graph). Then, find the pairwise distances $$Dis{t}_{i,j}$$ for all points*i*and*j*in the neighborhood.Convert the distances to similarity measures using the kernel transformation $${S}_{i,j}=\mathrm{exp}\left(-{\left(\frac{Dis{t}_{i,j}}{\sigma}\right)}^{2}\right)$$. The matrix

*S*is the similarity matrix, and*σ*is the scale factor for the kernel, as specified using the`'KernelScale'`

name-value pair argument.Calculate the unnormalized Laplacian matrix

*L*, the normalized random-walk Laplacian matrix*L*, or the normalized symmetric Laplacian matrix_{rw}*L*, depending on the value of the_{s}`'LaplacianNormalization'`

name-value pair argument.Create a matrix $$V\in {\mathbb{R}}^{n\times k}$$ containing columns $${v}_{1},\dots ,{v}_{k}$$, where the columns are the

*k*eigenvectors that correspond to the*k*smallest eigenvalues of the Laplacian matrix. If using*L*, normalize each row of_{s}*V*to have unit length.Treating each row of

*V*as a point, cluster the*n*points using*k*-means clustering (default) or*k*-medoids clustering, as specified by the`'ClusterMethod'`

name-value pair argument.Assign the original points in

`X`

to the same clusters as their corresponding rows in*V*.

[2] Shi, J., and J. Malik.
“Normalized cuts and image segmentation.” *IEEE Transactions on
Pattern Analysis and Machine Intelligence*. Vol. 22, 2000, pp.
888–905.

[3] Ng, A.Y., M. Jordan, and Y. Weiss.
“On spectral clustering: Analysis and an algorithm.” In *Proceedings
of the Advances in Neural Information Processing Systems 14*. MIT Press, 2001,
pp. 849–856.