Fuzzy c-means clustering
Find 2 clusters using fuzzy c-means clustering.
[centers,U] = fcm(fcmdata,2);
Iteration count = 1, obj. fcn = 8.970479
Iteration count = 2, obj. fcn = 7.197402
Iteration count = 3, obj. fcn = 6.325579
Iteration count = 4, obj. fcn = 4.586142
Iteration count = 5, obj. fcn = 3.893114
Iteration count = 6, obj. fcn = 3.810804
Iteration count = 7, obj. fcn = 3.799801
Iteration count = 8, obj. fcn = 3.797862
Iteration count = 9, obj. fcn = 3.797508
Iteration count = 10, obj. fcn = 3.797444
Iteration count = 11, obj. fcn = 3.797432
Iteration count = 12, obj. fcn = 3.797430
Classify each data point into the cluster with the largest membership value.
maxU = max(U);
index1 = find(U(1,:) == maxU);
index2 = find(U(2,:) == maxU);
Plot the clustered data and cluster centers.
plot(fcmdata(index1,1),fcmdata(index1,2),'ob')
hold on
plot(fcmdata(index2,1),fcmdata(index2,2),'or')
plot(centers(1,1),centers(1,2),'xb','MarkerSize',15,'LineWidth',3)
plot(centers(2,1),centers(2,2),'xr','MarkerSize',15,'LineWidth',3)
hold off
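The classification step above can be tried on a small hand-made partition matrix without running fcm. The following is a minimal sketch: the values in U are hypothetical, and the second output of max is shown as an alternative to the find-based approach, not as the method the toolbox itself uses.

```matlab
% Toy partition matrix (hypothetical values): 2 clusters x 4 data points.
% Each column sums to one, as for the U returned by fcm.
U = [0.9 0.2 0.6 0.3;
     0.1 0.8 0.4 0.7];

% Approach used above: compare each row against the column-wise maximum.
maxU = max(U);
index1 = find(U(1,:) == maxU);
index2 = find(U(2,:) == maxU);

% Alternative: the second output of max gives the winning cluster directly.
[~,label] = max(U,[],1);

disp(index1)
disp(index2)
disp(label)
```

With the find-based approach, a data point whose two membership values are exactly equal appears in both index1 and index2, whereas max assigns it to the first cluster only.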
Create a random data set.
data = rand(100,2);
Specify a large fuzzy partition matrix exponent to increase the amount of fuzzy overlap between the clusters.
options = [3.0 NaN NaN 0];
Cluster the data.
[centers,U] = fcm(data,2,options);
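To see why a larger exponent increases overlap, it can help to evaluate the membership of a single point for two exponent values. This is a rough sketch that assumes the standard fuzzy c-means membership update, in which the membership in cluster j is 1 divided by the sum over k of (d_j/d_k)^(2/(m-1)). The distances in d and the helper name membership are hypothetical.

```matlab
% Distances from one data point to two cluster centers (hypothetical values).
d = [1 2];

% Membership of the point in each cluster for exponent m, assuming the
% standard fuzzy c-means membership update. Entry (j,k) of d.'./d is d_j/d_k.
membership = @(m) 1 ./ sum((d.' ./ d).^(2/(m-1)), 2).';

mu2 = membership(2);   % default exponent
mu3 = membership(3);   % larger exponent, as in the options vector above

fprintf('m = 2: %.4f %.4f\n', mu2);
fprintf('m = 3: %.4f %.4f\n', mu3);
```

For these distances the memberships move from 0.8 and 0.2 at m = 2 toward 0.6667 and 0.3333 at m = 3, so the point is shared more evenly between the clusters.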
Load the clustering data.
Set the clustering termination conditions such that the optimization stops when either of the following occurs:
The number of iterations reaches a maximum of 25.
The objective function improves by less than 0.001 between two consecutive iterations.
options = [NaN 25 0.001 0];
The first option is NaN, which sets the fuzzy partition matrix exponent to its default value of 2. Setting the fourth option to 0 suppresses the objective function display.
Cluster the data.
[centers,U,objFun] = fcm(clusterdemo,3,options);
View the objective function vector to determine which termination condition stopped the clustering.
objFun =
   54.7257
   42.9867
   42.8554
   42.1857
   39.0857
   31.6814
   28.5736
   27.1806
   20.7359
   15.7147
   15.4353
   15.4306
   15.4305
The optimization stopped because the objective function improved by less than 0.001 between the final two iterations.
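The same conclusion can be reached programmatically from the objective function vector. The sketch below reuses the values and termination settings from this example; the variable names hitMaxIter and hitMinImprove are illustrative only.

```matlab
% Objective function values reported in the example above.
objFun = [54.7257 42.9867 42.8554 42.1857 39.0857 31.6814 28.5736 ...
          27.1806 20.7359 15.7147 15.4353 15.4306 15.4305];

maxIter = 25;        % second element of options
minImprove = 0.001;  % third element of options

numIter = numel(objFun);
lastImprovement = abs(objFun(end-1) - objFun(end));

% Which termination condition was met?
hitMaxIter = numIter >= maxIter;
hitMinImprove = lastImprovement < minImprove;

fprintf('Iterations run: %d\n', numIter);
fprintf('Final improvement: %.4f\n', lastImprovement);
```

Here only 13 of the 25 allowed iterations were used and the final improvement is about 0.0001, so hitMinImprove is true and hitMaxIter is false.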
data— Data set to be clustered
Data set to be clustered, specified as a matrix with Nd rows, where Nd is the number of data points. The number of columns in data is equal to the data dimensionality.
Nc— Number of clusters
Number of clusters, specified as an integer greater than 1.
options— Clustering options
Clustering options, specified as a vector with the following elements:
options(1): Exponent for the fuzzy partition matrix, specified as a scalar greater than 1. The default value is 2. Larger values increase the amount of fuzzy overlap between clusters.
If your data set is wide with a lot of overlap between potential clusters, then the calculated cluster centers might be very close to each other. In this case, each data point has approximately the same degree of membership in all clusters. To improve your clustering results, decrease this value, which limits the amount of fuzzy overlap during clustering.
For an example of fuzzy overlap adjustment, see Adjust Fuzzy Overlap in Fuzzy C-Means Clustering.
options(2): Maximum number of iterations, specified as a positive integer.
options(3): Minimum improvement in objective function between two consecutive iterations, specified as a positive scalar.
options(4): Information display toggle indicating whether to display the objective function value after each iteration, specified as 0 (do not display) or 1 (display).
If any element of options is NaN, the default value for that option is used.
The clustering process stops when the maximum number of iterations is reached or when the objective function improvement between two consecutive iterations is less than the specified minimum.
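The NaN fallback can be pictured as a simple substitution against a vector of defaults. The sketch below only illustrates that behavior and is not the toolbox code. Only the exponent default of 2 is stated in this text; the other three default values are assumptions made for the example.

```matlab
% Assumed default values for illustration; only the exponent default of 2
% is stated in the text above.
defaults = [2 100 1e-5 1];

% User-supplied options: keep the default exponent, cap iterations at 25,
% stop below an improvement of 0.001, and suppress the display.
options = [NaN 25 0.001 0];

% Replace every NaN entry with the corresponding default.
useDefault = isnan(options);
options(useDefault) = defaults(useDefault);

disp(options)
```

After the substitution, options holds 2, 25, 0.001, and 0, which matches the settings described in the termination example.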
centers— Cluster centers
Final cluster centers, returned as a matrix with Nc rows containing the coordinates of each cluster center. The number of columns in centers is equal to the dimensionality of the data being clustered.
U— Fuzzy partition matrix
Fuzzy partition matrix, returned as a matrix with Nc rows and Nd columns. Element U(i,j) indicates the degree of membership of the jth data point in the ith cluster. For a given data point, the sum of the membership values for all clusters is one.
objFunc— Objective function values
Objective function values for each iteration, returned as a vector.
Fuzzy c-means (FCM) is a clustering method that allows each data point to belong to multiple clusters with varying degrees of membership.
FCM is based on the minimization of the following objective function:

$$J_m = \sum_{i=1}^{D} \sum_{j=1}^{N} \mu_{ij}^{m} \left\| x_i - c_j \right\|^2$$

where:
D is the number of data points.
N is the number of clusters.
m is the fuzzy partition matrix exponent for controlling the degree of fuzzy overlap, with m > 1. Fuzzy overlap refers to how fuzzy the boundaries between clusters are, that is, the number of data points that have significant membership in more than one cluster.
xi is the ith data point.
cj is the center of the jth cluster.
μij is the degree of membership of xi in the jth cluster. For a given data point, xi, the sum of the membership values for all clusters is one.
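As a worked illustration of these definitions, the objective can be evaluated directly for a tiny data set. This sketch assumes the standard FCM objective, the sum over all points i and clusters j of μij^m times the squared Euclidean distance between xi and cj. The data, centers, and membership values are hypothetical. The matrix mu is stored with one row per data point to match the μij notation, which is the transpose of the U matrix returned by fcm.

```matlab
% Toy data (hypothetical): D = 4 points in 2 dimensions.
X = [0 0; 0 1; 4 0; 4 1];
% Two cluster centers (N = 2).
C = [0 0.5; 4 0.5];
% Membership values, D-by-N, so that mu(i,j) matches the notation above.
mu = [0.9 0.1; 0.9 0.1; 0.1 0.9; 0.1 0.9];
m = 2;

% Squared Euclidean distance from every point to every center (D-by-N).
dist2 = zeros(size(X,1), size(C,1));
for j = 1:size(C,1)
    dist2(:,j) = sum((X - C(j,:)).^2, 2);
end

% Objective function value for this configuration.
Jm = sum(sum((mu.^m) .* dist2));
fprintf('Jm = %.4f\n', Jm);
```

Each point contributes 0.81 × 0.25 + 0.01 × 16.25 = 0.365, so the total is 1.46.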
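fcm performs the following steps during clustering:
1. Randomly initialize the cluster membership values, μij.
2. Calculate the cluster centers:

$$c_j = \frac{\sum_{i=1}^{D} \mu_{ij}^{m} x_i}{\sum_{i=1}^{D} \mu_{ij}^{m}}$$

3. Update μij according to the following:

$$\mu_{ij} = \frac{1}{\sum_{k=1}^{N} \left( \frac{\left\| x_i - c_j \right\|}{\left\| x_i - c_k \right\|} \right)^{\frac{2}{m-1}}}$$

4. Calculate the objective function, Jm.
5. Repeat steps 2–4 until Jm improves by less than a specified minimum threshold or until a specified maximum number of iterations is reached.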
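The steps above can be sketched in plain MATLAB without the toolbox. This is an illustrative implementation under stated assumptions, not the code that fcm runs: it assumes the standard fuzzy c-means update formulas for the centers and memberships, uses hypothetical two-blob data, and stores memberships with one row per data point (the transpose of the U returned by fcm). It relies on implicit expansion, so it needs R2016b or later.

```matlab
% Minimal fuzzy c-means sketch (not the toolbox implementation).
rng(0);                               % reproducible random initialization
X = [randn(50,2); randn(50,2) + 4];   % hypothetical data: two blobs
N = 2;  m = 2;  maxIter = 100;  minImprove = 1e-5;
D = size(X,1);

% Step 1: random memberships, normalized so each row sums to one.
mu = rand(D,N);
mu = mu ./ sum(mu,2);

J = zeros(maxIter,1);
for iter = 1:maxIter
    % Step 2: cluster centers as membership-weighted means (N-by-2).
    w = mu.^m;
    C = (w' * X) ./ sum(w,1)';

    % Squared distances from each point to each center (D-by-N).
    dist2 = zeros(D,N);
    for j = 1:N
        dist2(:,j) = sum((X - C(j,:)).^2, 2);
    end
    dist2 = max(dist2, eps);          % guard against division by zero

    % Step 3: membership update, written in terms of squared distances.
    invd = dist2.^(-1/(m-1));
    mu = invd ./ sum(invd,2);

    % Step 4: objective function.
    J(iter) = sum(sum((mu.^m) .* dist2));

    % Step 5: stop when the improvement is small enough.
    if iter > 1 && abs(J(iter-1) - J(iter)) < minImprove
        break
    end
end
J = J(1:iter);                        % keep only the iterations that ran
```

After the loop, C holds the estimated centers, mu the memberships, and J the objective value at each iteration, which plays the same role as the objective function vector returned by fcm.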
Bezdek, J.C., Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press, New York, 1981.