## Fuzzy Clustering

Clustering of numerical data forms the basis of many classification and system-modeling algorithms. The purpose of clustering is to identify natural groupings of data from a large data set to produce a concise representation of system behavior. Fuzzy Logic Toolbox™ tools allow you to find clusters in input-output training data using:

- Fuzzy c-means clustering — Each data point belongs to each cluster to a degree that is specified by a fuzzy membership grade.

- Subtractive clustering — A fast nonfuzzy algorithm for estimating the number of clusters in a data set.

### Fuzzy C-Means Clustering

*Fuzzy c-means* (FCM) is a data clustering technique where each
data point belongs to a cluster to a degree that is specified by a membership
grade.

The FCM algorithm starts with an initial guess for the cluster centers, which represent the mean location of each cluster. The initial guess for these cluster centers is most likely incorrect. Additionally, FCM assigns every data point a membership grade for each cluster. By iteratively updating the cluster centers and the membership grades for each data point, the algorithm iteratively moves the cluster centers to the optimal location within a data set. This iteration is based on minimizing an objective function that represents the distance from any given data point to a cluster center weighted by the data point membership grade.

To cluster data using FCM clustering, use the `fcm` function. To specify the clustering algorithm options, use an `fcmOptions` object. The `fcm` function outputs a list of cluster centers and cluster membership grades for each data point.

You can use cluster information to generate a fuzzy inference system that best models the data behavior using a minimum number of rules. The rules partition themselves according to the fuzzy qualities associated with each of the data clusters. To automatically generate this type of FIS, use the `genfis` function. You can generate an initial fuzzy system using clustering results from either FCM or subtractive clustering.

To cluster data, specify an array of data points with *N* rows, one row per data point *x_j*. The number of columns, *n*, is equal to the data dimensionality.

$${x}_{j}={\left[\begin{array}{cccc}{x}_{j1}& {x}_{j2}& \cdots & {x}_{jn}\end{array}\right]}^{\top},\text{\hspace{1em}}1\le j\le N$$

The FCM algorithm computes the cluster centers *c_i*. This array contains one row for each cluster center, and its number of columns matches the number of columns of the data, *n*.

$${c}_{i}={\left[\begin{array}{cccc}{c}_{i1}& {c}_{i2}& \cdots & {c}_{in}\end{array}\right]}^{\top},\text{\hspace{1em}}1\le i\le C$$

To specify the number of clusters *C*, use the `NumClusters` option.

The FCM algorithm minimizes the following objective function.

$${J}_{m}={\displaystyle \sum _{i=1}^{C}{\displaystyle \sum _{j=1}^{N}{\mu}_{ij}^{m}{D}_{ij}^{2}}}$$

Here:

- *m* is the fuzzy partition matrix exponent for controlling the degree of fuzzy overlap, with *m* > 1. Fuzzy overlap refers to how fuzzy the boundaries between clusters are, that is, the number of data points that have significant membership in more than one cluster. To specify the fuzzy partition matrix exponent, use the `Exponent` option.

- *D_ij* is the distance from the *j*th data point to the *i*th cluster center.

- *μ_ij* is the degree of membership of the *j*th data point in the *i*th cluster. For a given data point, the sum of the membership values for all clusters is one.
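As a concrete illustration (not toolbox code; the function name is my own), the objective function can be evaluated directly from a membership matrix and a distance matrix stored as *C*-by-*N* nested lists:

```python
def objective(mu, D, m):
    """J_m = sum over clusters i and data points j of mu_ij^m * D_ij^2."""
    C, N = len(mu), len(mu[0])
    return sum(mu[i][j] ** m * D[i][j] ** 2
               for i in range(C) for j in range(N))
```

With crisp memberships (each point fully in its nearest cluster, zero distance to its own center), the objective is zero; fuzzier memberships trade off distance against membership weight.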

The `fcm` function supports three types of FCM clustering. These methods differ in the distance metric used to compute *D_ij*. To select an FCM algorithm, set the `DistanceMetric` option.

| FCM Algorithm | `DistanceMetric` Value | Description |
| --- | --- | --- |
| Classical FCM | `"euclidean"` | Compute distances using the Euclidean distance norm, which assumes a spherical shape for all clusters. [1] |
| Gustafson-Kessel FCM | `"mahalanobis"` | Compute distances using a Mahalanobis distance norm, where cluster covariance is weighted by cluster membership values. This method is useful for detecting nonspherical clusters. [4] |
| Gath-Geva FCM | `"fmle"` | Compute distances using an exponential distance measure based on fuzzy maximum likelihood estimation. This method is useful for detecting nonspherical clusters in the presence of variable cluster densities and unequal numbers of data points in each cluster. [5] |

FCM clustering performs the following steps.

1. Set the initial cluster centers. By default, the initial cluster centers are random. You can specify initial cluster centers using the `ClusterCenters` option.

2. Randomly initialize the cluster membership values *μ_ij*.

3. Calculate the cluster centers.

   $${c}_{i}=\frac{{\displaystyle \sum _{j=1}^{N}{\mu}_{ij}^{m}{x}_{j}}}{{\displaystyle \sum _{j=1}^{N}{\mu}_{ij}^{m}}},\text{\hspace{1em}}1\le i\le C$$

4. Compute the distance metric for each data point based on the clustering algorithm.

5. Update the membership values for each data point.

   $${\mu}_{ij}=\frac{1}{{\displaystyle \sum _{k=1}^{C}{\left(\frac{{D}_{ij}}{{D}_{kj}}\right)}^{\frac{2}{m-1}}}},\text{\hspace{1em}}1\le i\le C,\text{\hspace{1em}}1\le j\le N$$

6. Calculate the objective function *J_m*.

7. Repeat steps 3–6 until one of the following conditions is met.

   - *J_m* improves by less than a specified minimum threshold, which you specify using the `MinImprovement` option.
   - The algorithm performs the specified maximum number of iterations, which you specify using the `MaxNumIteration` option.
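The iteration can be sketched in pure Python. This is an illustrative re-implementation with the Euclidean metric, not the toolbox code; the function name, argument names, and default values are my own:

```python
import math
import random

def fcm_sketch(data, C, m=2.0, max_iter=100, min_improvement=1e-5, seed=0):
    """Classical FCM on a list of data points (each a list of coordinates)."""
    rng = random.Random(seed)
    N, n = len(data), len(data[0])
    eps = 1e-12  # guards against division by zero for coincident points
    # Randomly initialize memberships, normalized so each point's grades sum to 1
    mu = [[rng.random() for _ in range(N)] for _ in range(C)]
    for j in range(N):
        s = sum(mu[i][j] for i in range(C))
        for i in range(C):
            mu[i][j] /= s
    prev_obj = float("inf")
    for _ in range(max_iter):
        # Centers are membership-weighted means of the data
        centers = []
        for i in range(C):
            w = [mu[i][j] ** m for j in range(N)]
            s = sum(w)
            centers.append([sum(w[j] * data[j][k] for j in range(N)) / s
                            for k in range(n)])
        # Euclidean distances D_ij from each point j to each center i
        D = [[max(math.dist(data[j], centers[i]), eps) for j in range(N)]
             for i in range(C)]
        # Membership update: mu_ij = 1 / sum_k (D_ij / D_kj)^(2/(m-1))
        for j in range(N):
            for i in range(C):
                mu[i][j] = 1.0 / sum((D[i][j] / D[k][j]) ** (2.0 / (m - 1.0))
                                     for k in range(C))
        # Objective J_m; stop when improvement falls below the threshold
        obj = sum(mu[i][j] ** m * D[i][j] ** 2
                  for i in range(C) for j in range(N))
        if abs(prev_obj - obj) < min_improvement:
            break
        prev_obj = obj
    return centers, mu
```

On two well-separated groups of points, the centers converge near the group means, and each point's membership grades across the clusters sum to one.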

#### Classical FCM Distance Calculation

The classical FCM algorithm computes the Euclidean distance from each data point to each cluster center, as shown in the following equation.

$${D}_{ij}=\sqrt{{\left({x}_{j}-{c}_{i}\right)}^{\top}\left({x}_{j}-{c}_{i}\right)},\text{\hspace{1em}}1\le i\le C,\text{\hspace{1em}}1\le j\le N$$
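For example, in plain Python (illustrative, not toolbox code):

```python
import math

def euclidean(xj, ci):
    """D_ij = sqrt((x_j - c_i)^T (x_j - c_i))."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xj, ci)))
```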

#### Gustafson-Kessel Distance Calculation

*Since R2023a*

The Gustafson-Kessel (GK) FCM algorithm first computes a covariance matrix *F_i* for each cluster center.

$$\begin{array}{l}{F}_{i}=\frac{{\displaystyle \sum _{j=1}^{N}{\mu}_{ij}^{m}\left({x}_{j}-{c}_{i}\right){\left({x}_{j}-{c}_{i}\right)}^{\top}}}{{S}_{i}},\text{\hspace{1em}}1\le i\le C\\ {S}_{i}={\displaystyle \sum _{j=1}^{N}{\mu}_{ij}^{m}}\end{array}$$

Then, it computes the Mahalanobis distance from each data point to each cluster using the covariance matrices.

$${D}_{ij}=\sqrt{{\left({x}_{j}-{c}_{i}\right)}^{\top}\left[\mathrm{det}{\left({F}_{i}\right)}^{1/n}{F}_{i}^{-1}\right]\left({x}_{j}-{c}_{i}\right)},\text{\hspace{1em}}1\le i\le C,\text{\hspace{1em}}1\le j\le N$$
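For the two-dimensional case, the GK distance can be written out explicitly using the closed-form 2x2 inverse; this is an illustrative sketch with my own function name, not toolbox code:

```python
import math

def gk_distance_2d(xj, ci, F):
    """GK distance in 2-D: sqrt(d^T [det(F)^(1/2) F^(-1)] d), d = x_j - c_i."""
    dx, dy = xj[0] - ci[0], xj[1] - ci[1]
    (a, b), (c, d) = F
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]  # closed-form 2x2 inverse
    scale = math.sqrt(det)  # det(F)^(1/n) with n = 2 dimensions
    q = scale * (dx * (inv[0][0] * dx + inv[0][1] * dy)
                 + dy * (inv[1][0] * dx + inv[1][1] * dy))
    return math.sqrt(q)
```

With an identity covariance matrix, the GK distance reduces to the Euclidean distance, which is a useful sanity check.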

#### Gath-Geva Distance Calculation

*Since R2023b*

The Gath-Geva (GG) FCM algorithm first computes a covariance matrix *F_i* for each cluster center.

$$\begin{array}{l}{F}_{i}=\frac{{\displaystyle \sum _{j=1}^{N}{\mu}_{ij}^{m}\left({x}_{j}-{c}_{i}\right){\left({x}_{j}-{c}_{i}\right)}^{\top}}}{{S}_{i}},\text{\hspace{1em}}1\le i\le C\\ {S}_{i}={\displaystyle \sum _{j=1}^{N}{\mu}_{ij}^{m}}\end{array}$$

Then, it computes the prior probability for selecting each cluster [6].

$${P}_{i}=\frac{{S}_{i}}{{\displaystyle \sum _{i=1}^{C}{S}_{i}}},\text{\hspace{1em}}1\le i\le C$$

Finally, it computes the distance from each data point to each cluster using an exponential distance measure.

$$\begin{array}{l}{D}_{ij}={A}_{i}\cdot \mathrm{exp}\left(0.5{\left({x}_{j}-{c}_{i}\right)}^{\top}{F}_{i}^{-1}\left({x}_{j}-{c}_{i}\right)\right),\text{\hspace{1em}}1\le i\le C,\text{\hspace{1em}}1\le j\le N\\ {A}_{i}=\frac{\sqrt{\mathrm{det}\left({F}_{i}\right)}}{{P}_{i}}\end{array}$$
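In one dimension, where the covariance matrix and its determinant reduce to a scalar variance, the GG distance becomes a short expression; this is an illustrative sketch with my own function name, not toolbox code:

```python
import math

def gg_distance_1d(xj, ci, F, P):
    """GG distance in 1-D: (sqrt(F) / P) * exp(0.5 * (x_j - c_i)^2 / F)."""
    A = math.sqrt(F) / P  # sqrt(det F_i) / P_i, scalar case
    return A * math.exp(0.5 * (xj - ci) ** 2 / F)
```

Note that unlike the Euclidean norm, the distance grows exponentially with the squared deviation, and a smaller cluster prior *P_i* inflates all distances to that cluster.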

### Subtractive Clustering

If you do not have a clear idea of how many clusters there should be for a given set of data, *subtractive clustering* is a fast, one-pass algorithm for estimating the number of clusters and the cluster centers [2]. To obtain the cluster estimates, use the `subclust` function, which finds clusters using the subtractive clustering method. You can use the cluster estimates to initialize iterative optimization-based clustering methods, such as FCM, and model identification methods, such as ANFIS.

Subtractive clustering assumes that each data point is a potential cluster center. Based on that assumption, the clustering algorithm includes the following steps.

1. Calculate the likelihood that each data point would define a cluster center, based on the density of surrounding data points.

2. Choose the data point with the highest potential to be the first cluster center.

3. Remove all data points near the first cluster center, based on a specified cluster influence range.

4. Choose the remaining point with the highest potential as the next cluster center.

5. Repeat steps 3 and 4 until all the data is within the influence range of a cluster center.
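The steps above can be sketched in pure Python as a simplified version of Chiu's method; the function name, parameter names, and default values here are my own and do not match the `subclust` interface:

```python
import math

def subclust_sketch(data, radius=1.0, reject_ratio=0.15, squash=1.25):
    """Simplified subtractive clustering on a list of points."""
    alpha = 4.0 / radius ** 2           # potential kernel width
    beta = 4.0 / (squash * radius) ** 2  # wider kernel for subtraction
    # Step 1: potential of each point from the density of its neighbors
    P = [sum(math.exp(-alpha * math.dist(x, y) ** 2) for y in data)
         for x in data]
    first = max(P)
    centers = []
    while True:
        # Steps 2 and 4: pick the remaining point with the highest potential
        k = max(range(len(P)), key=P.__getitem__)
        if P[k] < reject_ratio * first:
            break  # everything left is within a center's influence range
        centers.append(data[k])
        # Step 3: subtract the chosen center's influence from all potentials
        Pk, xk = P[k], data[k]
        P = [p - Pk * math.exp(-beta * math.dist(x, xk) ** 2)
             for p, x in zip(P, data)]
    return centers
```

Unlike FCM, the number of clusters is not specified up front: on two tight groups of points, the sketch discovers two centers, one per group.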

The subtractive clustering method is an extension of the mountain clustering method proposed in [3].

### References

[1] Bezdek, James C. *Pattern Recognition
with Fuzzy Objective Function Algorithms*. Boston, MA: Springer
US, 1981. https://doi.org/10.1007/978-1-4757-0450-1.

[2] Chiu, Stephen L. “Fuzzy Model Identification Based
on Cluster Estimation.” *Journal of Intelligent and
Fuzzy Systems* 2, no. 3 (1994): 267–78. https://doi.org/10.3233/IFS-1994-2306.

[3] Yager, Ronald R., and
Dimitar P. Filev. “Generation of Fuzzy Rules by Mountain Clustering.” *Journal of Intelligent and Fuzzy Systems* 2, no. 3
(1994): 209–19. https://doi.org/10.3233/IFS-1994-2306.

[4] Gustafson, Donald, and
William Kessel. “Fuzzy Clustering with a Fuzzy Covariance Matrix.” In *1978 IEEE Conference on Decision and Control Including the
17th Symposium on Adaptive Processes*, 761–66. San Diego, CA,
USA: IEEE, 1978. https://doi.org/10.1109/CDC.1978.268028.

[5] Gath, I., and A.B.
Geva. “Unsupervised Optimal Fuzzy Clustering.” *IEEE
Transactions on Pattern Analysis and Machine Intelligence* 11,
no. 7 (July 1989): 773–80. https://doi.org/10.1109/34.192473.

[6] Höppner, Frank, ed.
*Fuzzy Cluster Analysis: Methods for Classification,
Data Analysis, and Image Recognition*. Chichester; New York: J.
Wiley, 1999: 49–53.

## See Also

`fcm` | `fcmOptions` | `subclust` | `genfis`