
Fuzzy c-means clustering

To generate a fuzzy inference system using FCM clustering, use the `genfis` command. For example, suppose you cluster your data using the following syntax:

`[centers,U] = fcm(data,Nc,options);`

where the first `M` columns of `data` correspond to input variables, and the remaining columns correspond to output variables.

You can generate a fuzzy system using the same training data and FCM clustering configuration. To do so:

1. Configure clustering options.

   `opt = genfisOptions('FCMClustering');`
   `opt.NumClusters = Nc;`
   `opt.Exponent = options(1);`
   `opt.MaxNumIteration = options(2);`
   `opt.MinImprovement = options(3);`
   `opt.Verbose = options(4);`

2. Extract the input and output variable data.

   `inputData = data(:,1:M);`
   `outputData = data(:,M+1:end);`

3. Generate the FIS structure.

   `fis = genfis(inputData,outputData,opt);`

The fuzzy system, `fis`, contains one fuzzy rule for each cluster, and each input and output variable has one membership function per cluster. For more information, see `genfis` and `genfisOptions`.
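The steps above can be collected into one script. The following is a minimal sketch, not a reference workflow: the data set, the choice of `M = 2` and `Nc = 3`, and the values in the `options` vector are made up for illustration, and it assumes the `fcm(data,Nc,options)` calling form used in this section is supported in your release.

```matlab
% Hypothetical training data: two inputs (M = 2) and one output.
rng(0);                                  % reproducible random numbers
x = rand(100,2);
y = sin(2*pi*x(:,1)) + x(:,2);
data = [x y];

M  = 2;                                  % number of input columns
Nc = 3;                                  % number of clusters (illustrative choice)
options = [2 100 1e-5 0];                % exponent, max iterations, min improvement, verbose

% Cluster the combined input/output data.
[centers,U] = fcm(data,Nc,options);      % centers: Nc-by-3, U: Nc-by-100

% Carry the same clustering configuration over to genfis.
opt = genfisOptions('FCMClustering');
opt.NumClusters     = Nc;
opt.Exponent        = options(1);
opt.MaxNumIteration = options(2);
opt.MinImprovement  = options(3);
opt.Verbose         = options(4);

inputData  = data(:,1:M);
outputData = data(:,M+1:end);
fis = genfis(inputData,outputData,opt);
```

Because both `fcm` and `genfis` initialize the memberships randomly, the centers found by the two calls are not guaranteed to be identical unless the random number generator is reset to the same state before each call.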

Fuzzy c-means (FCM) is a clustering method that allows each data point to belong to multiple clusters with varying degrees of membership.
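To make the idea of graded membership concrete, the sketch below clusters a small, made-up data set and inspects the membership matrix returned by `fcm`. The data and the variable names `colSums`, `maxMembership`, and `hardLabel` are illustrative assumptions, and the two-argument call `fcm(data,Nc)` is assumed to be available in your release.

```matlab
rng(0);                                    % reproducible hypothetical data
data = [randn(20,2); randn(20,2) + 4];     % two loose groups of 2-D points
Nc = 2;

[centers,U] = fcm(data,Nc);                % U is Nc-by-40: one row per cluster

% Each column of U holds the memberships of one data point in every cluster.
colSums = sum(U,1);                        % each entry should be 1 (up to rounding)

% A crisp labeling can be recovered by taking the strongest membership,
% but points between the groups keep a noticeable share in both clusters.
[maxMembership,hardLabel] = max(U,[],1);
```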

FCM is based on the minimization of the following objective function

$${J}_{m}={\displaystyle \sum _{i=1}^{D}{\displaystyle \sum _{j=1}^{N}{\mu}_{ij}^{m}{\Vert {x}_{i}-{c}_{j}\Vert}^{2}}},$$

where

- *D* is the number of data points.
- *N* is the number of clusters.
- *m* is the fuzzy partition matrix exponent for controlling the degree of fuzzy overlap, with *m* > 1. Fuzzy overlap refers to how fuzzy the boundaries between clusters are, that is, the number of data points that have significant membership in more than one cluster.
- $x_i$ is the *i*th data point.
- $c_j$ is the center of the *j*th cluster.
- $\mu_{ij}$ is the degree of membership of $x_i$ in the *j*th cluster. For a given data point, $x_i$, the sum of the membership values for all clusters is one.
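As an illustration of the objective function, the following sketch evaluates it directly for a tiny hand-made example. All values are hypothetical, and the membership matrix `mu` is stored here as *D*-by-*N* (one row per data point) to match the index order of $\mu_{ij}$; this is the transpose of the `U` matrix that `fcm` returns. The subtraction `X - C(j,:)` relies on implicit expansion, which is assumed to be available.

```matlab
X  = [0 0; 0 1; 4 4; 5 4];                 % D = 4 data points, row i is x_i
C  = [0 0.5; 4.5 4];                       % N = 2 cluster centers, row j is c_j
mu = [0.9 0.1; 0.9 0.1; 0.1 0.9; 0.1 0.9]; % memberships, each row sums to 1
m  = 2;                                    % fuzzy partition matrix exponent

% Squared Euclidean distance from every point to every center (D-by-N).
dist2 = zeros(size(X,1),size(C,1));
for j = 1:size(C,1)
    dist2(:,j) = sum((X - C(j,:)).^2,2);
end

% Double sum over data points and clusters.
Jm = sum(sum((mu.^m).*dist2));             % approximately 2.12 for these values
```

Raising `m` shrinks every weight $\mu_{ij}^m$ (since each membership is at most one), which reduces the penalty for spreading membership across clusters and so favors fuzzier partitions.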

`fcm` performs the following steps during clustering:

1. Randomly initialize the cluster membership values, $\mu_{ij}$.

2. Calculate the cluster centers:

   $${c}_{j}=\frac{{\displaystyle \sum _{i=1}^{D}{\mu}_{ij}^{m}{x}_{i}}}{{\displaystyle \sum _{i=1}^{D}{\mu}_{ij}^{m}}}.$$

3. Update $\mu_{ij}$ according to the following:

   $${\mu}_{ij}=\frac{1}{{\displaystyle \sum _{k=1}^{N}{\left(\frac{\Vert {x}_{i}-{c}_{j}\Vert}{\Vert {x}_{i}-{c}_{k}\Vert}\right)}^{\frac{2}{m-1}}}}.$$

4. Calculate the objective function, $J_m$.

5. Repeat steps 2–4 until $J_m$ improves by less than a specified minimum threshold or until a specified maximum number of iterations is reached.
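The steps above can be sketched in a few lines of plain MATLAB. This is an illustrative reimplementation under stated assumptions, not the code that `fcm` itself runs: the data, the variable names, and the stopping values are made up; memberships are stored as *D*-by-*N* (the transpose of the `U` returned by `fcm`); and the `eps` floor on distances is an added guard against division by zero when a point coincides with a center.

```matlab
rng(1);                                    % reproducible hypothetical data
X = [randn(30,2); randn(30,2) + 5];        % two groups of 2-D points
N = 2;  m = 2;                             % clusters and exponent
maxIter = 100;  minImprove = 1e-5;         % stopping criteria
D = size(X,1);

% Step 1: random memberships, normalized so each row sums to 1.
mu = rand(D,N);
mu = mu./sum(mu,2);

J = zeros(maxIter,1);
for iter = 1:maxIter
    % Step 2: cluster centers as membership-weighted means.
    w = mu.^m;                             % D-by-N weights
    C = (w'*X)./sum(w,1)';                 % N-by-2 centers

    % Distances from every point to every center (D-by-N).
    dist = zeros(D,N);
    for j = 1:N
        dist(:,j) = sqrt(sum((X - C(j,:)).^2,2));
    end
    dist = max(dist,eps);                  % added guard, see lead-in

    % Step 3: membership update, written in the equivalent form
    % mu_ij = d_ij^(-2/(m-1)) / sum_k d_ik^(-2/(m-1)).
    p  = dist.^(-2/(m-1));
    mu = p./sum(p,2);

    % Step 4: objective function for the current centers and memberships.
    J(iter) = sum(sum((mu.^m).*dist.^2));

    % Step 5: stop when the improvement falls below the threshold.
    if iter > 1 && abs(J(iter) - J(iter-1)) < minImprove
        break
    end
end
J = J(1:iter);                             % keep only the iterations that ran
```

Plotting `J` against the iteration index is a quick way to check convergence; with the update order used here, each pass can only lower the objective or leave it unchanged.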

[1] Bezdek, J.C., *Pattern Recognition with Fuzzy Objective Function
Algorithms*, Plenum Press, New York, 1981.