Skip to Main Content Skip to Search
Home |   Select Country  Choose Country  |  Contact Us  |  Cart Store 
Create Account | Log In
Products & Services Solutions Academia Support User Community Company
spacer spacer spacer spacer spacer spacer

 

Fuzzy Logic Toolbox 2.2.10

Fuzzy C-Means Clustering for Iris Data

This demo illustrates how to use Fuzzy C-Means clustering for Iris dataset.

Contents

Load Data

The dataset is obtained from the data file 'iris.dat'. This dataset was collected by botanist Anderson and contains random samples of flowers belonging to three species of iris flowers setosa , versicolor , and virginica. For each of the species, 50 observations for sepal length, sepal width, petal length, and petal width are recorded.

The dataset is partitioned into three groups named setosa , versicolor , and virginica. This is shown in the following code snippet.

load iris.dat
setosa = iris((iris(:,5)==1),:);        % data for setosa
versicolor = iris((iris(:,5)==2),:);    % data for versicolor
virginica = iris((iris(:,5)==3),:);     % data for virginica
obsv_n = size(iris, 1);                 % total number of observations

Plot Data in 2-D

The data to be clustered is 4-dimensional data and represents sepal length, sepal width, petal length, and petal width. From each of the three groups(setosa, versicolor and virginica), two characteristics (for example, sepal length vs. sepal width) of the flowers are plotted in a 2-dimensional plot. This is done using the following code snippet.

Characteristics = {'sepal length','sepal width','petal length','petal w
idth'};
pairs = [1 2; 1 3; 1 4; 2 3; 2 4; 3 4];
h = figure;
for j = 1:6,
    x = pairs(j, 1);
    y = pairs(j, 2);
    subplot(2,3,j);
    plot([setosa(:,x) versicolor(:,x) virginica(:,x)],...
         [setosa(:,y) versicolor(:,y) virginica(:,y)], '.');
    xlabel(Characteristics{x});
    ylabel(Characteristics{y});
end

Setup Parameters

Next, the parameters required for Fuzzy C-Means clustering such as number of clusters, exponent for the partition matrix, maximum number of iterations and minimum improvement are defined and set. This are shown in the following code snippet.

cluster_n = 3;                          % Number of clusters
expo = 2.0;                             % Exponent for U
max_iter = 100;                         % Max. iteration
min_impro = 1e-6;                       % Min. improvement

Compute Clusters

Fuzzy C-Means clustering is an iterative process. First, the initial fuzzy partition matrix is generated and the initial fuzzy cluster centers are calculated. In each step of the iteration, the cluster centers and the membership grade point are updated and the objective function is minimized to find the best location for the clusters. The process stops when the maximum number of iterations is reached, or when the objective function improvement between two consecutive iterations is less than the minimum amount of improvement specified. This is shown in the following code snippet.

% initialize fuzzy partition
U = initfcm(cluster_n, obsv_n);
% plot the data if the figure window is closed
if ishghandle(h)
    figure(h);
else
    for j = 1:6,
        x = pairs(j, 1);
        y = pairs(j, 2);
        subplot(2,3,j);
        plot([setosa(:,x) versicolor(:,x) virginica(:,x)],...
             [setosa(:,y) versicolor(:,y) virginica(:,y)], '.');
        xlabel(Characteristics{x});
        ylabel(Characteristics{y});
    end
end
% iteration
for i = 1:max_iter,
    [U, center, obj] = stepfcm(iris, U, cluster_n, expo);
    fprintf('Iteration count = %d, obj. fcn = %f\n', i, obj);
    % refresh centers
    if i>1 && (abs(obj - lastobj) < min_impro)
        for j = 1:6,
            subplot(2,3,j);
            for k = 1:cluster_n,
                text(center(k, pairs(j,1)), center(k,pairs(j,2)), int2str(k)
, 'FontWeight', 'bold');
            end
        end
        break;
    elseif i==1
        for j = 1:6,
            subplot(2,3,j);
            for k = 1:cluster_n,
                text(center(k, pairs(j,1)), center(k,pairs(j,2)), int2str(k)
, 'color', [0.5 0.5 0.5]);
            end
        end
    end
    lastobj = obj;
end
Iteration count = 1, obj. fcn = 28971.732229
Iteration count = 2, obj. fcn = 21428.913712
Iteration count = 3, obj. fcn = 15907.394597
Iteration count = 4, obj. fcn = 9263.010879
Iteration count = 5, obj. fcn = 7062.630646
Iteration count = 6, obj. fcn = 6475.211882
Iteration count = 7, obj. fcn = 6249.471562
Iteration count = 8, obj. fcn = 6146.023339
Iteration count = 9, obj. fcn = 6097.858175
Iteration count = 10, obj. fcn = 6075.873149
Iteration count = 11, obj. fcn = 6066.095010
Iteration count = 12, obj. fcn = 6061.840414
Iteration count = 13, obj. fcn = 6060.018628
Iteration count = 14, obj. fcn = 6059.247071
Iteration count = 15, obj. fcn = 6058.922679
Iteration count = 16, obj. fcn = 6058.786940
Iteration count = 17, obj. fcn = 6058.730317
Iteration count = 18, obj. fcn = 6058.706745
Iteration count = 19, obj. fcn = 6058.696944
Iteration count = 20, obj. fcn = 6058.692872
Iteration count = 21, obj. fcn = 6058.691181
Iteration count = 22, obj. fcn = 6058.690480
Iteration count = 23, obj. fcn = 6058.690189
Iteration count = 24, obj. fcn = 6058.690068
Iteration count = 25, obj. fcn = 6058.690018
Iteration count = 26, obj. fcn = 6058.689997
Iteration count = 27, obj. fcn = 6058.689988
Iteration count = 28, obj. fcn = 6058.689985
Iteration count = 29, obj. fcn = 6058.689983
Iteration count = 30, obj. fcn = 6058.689983

The figure shows the initial and final fuzzy cluster centers. The bold numbers represent the final fuzzy cluster centers obtained by updating them iteratively.

Contact sales
Free technical kit
Trial software
E-mail this page

Get Pricing and
Licensing Options