Be the first to rate this file! 28 Downloads (last 30 days) File Size: 7.25 KB File ID: #38107
image thumbnail

Rapid Centroid Estimation

by

 

10 Sep 2012 (Updated )

An efficient particle swarm approach for rapid optimization of cluster centroids.

| Watch this File

File Information
Description

IDX = RCE(X, K) partitions the points in the N-by-P data matrix X into K clusters. This partition minimizes the sum, over all clusters, of the within-cluster sums of point-to-cluster-centroid distances.
  
Rows of X correspond to features, columns correspond to variables. By default, RCE uses Euclidean distances.

RCE treats NaNs as missing data, and ignores any rows of X that contain NaNs.

[IDX, C] = RCE(X, K) returns the K cluster centroid locations in the D-by-K matrix C.

[IDX, C, s] = RCE(X, K) returns the search history s of the algorithm

[ ... ] = RCE(..., 'PARAM1',val1, 'PARAM2',val2, ...) specifies optional parameter name/value pairs to control the iterative algorithm used by RCE. Parameters are:

'Distance' - Distance measure, in D-dimensional space, that RCE should minimize with respect to.

Other Parameters:
'P' - Set the P value for minkowski distance, default is 2
'subsprob' - Set the substitution probability (for RCE+), default is 0.00

'swarm' - Set the number of groups in the swarm. default is 1 (normal RCE)
'MaxStagnation' - Set the maximum iteration to stagnate before particle reset (for RCE^r), default is 0
'fun' - defines the minimizing fitness function. The RCE abstract fitness function should obey the syntax of fit = function_name(distance_matrix,labels) the default fitness function used in RCE is the sum of intracluster distances normalized by data volume.
'Display' - Level of display output. Choices are 'off', (the default) and 'on'.
'MaxIter' - Maximum number of iterations allowed. Default is 100.
'dxmax' - Set the maximum dx

Example:

X = [randn(2,50)+ones(2,50), randn(2,50)-ones(2,50)];
figure;
[cidx, ctrs, s] = RCE(X, 2, 'Distance','sqeuclidean', ...
                    'Display','on', ...
                    'swarm',3, ...
                    'subsprob',0.02, ...
'dxmax',0.05, ...
'maxiter',500);
disp('cluster centroids');
cidx
disp('cluster centroids');
ctrs
figure;
silhouette(X',cidx,'sqEuclidean');

RCE is proposed as a derivate of PSC algorithm with radically reduced time complexity. The quality of the results produced are similar to PSC. The advantage of RCE is RCE has lesser time taken per iteration and faster convergence [1-4].

RCE operates inside a globally interdependent neighborhood [1-2]. In the neighborhood, a slight change in the position of a particle triggers a global reaction involving every other particle in the swarm. This reaction naturally occurs to preserve the gravitational equilibrium in the neighborhood. A particle in RCE is a centroid prototype. An RCE group encodes a possible solution to the clustering problem.

Recent updates in RCE includes:
- the new lighter white noise update scheme
- substitution,
- particle reset,
- swarm

The new white noise update scheme reduces the frequency and volume of the random number generated per iteration [4]. Compared to the previously proposed update scheme [1-2], the number of randoms generated in white noise update scheme is independent of the number of particles and the number of groups.

Substitution strategy allows RCE particle to distrupt an achieved equilibrium position by moving towards other particles. [3]

Particle reset reinitialize the x and dx of every particle in the group when stagnation counter exceeds MaxStagnation [4]

Swarm strategy allows multiple groups of RCE to work in paralel. In the swarm strategy each group will share its knowledge to other RCE group. [3]

An increase of computational complexity is apparent in the current swarm strategy. One should consider the computational load that may result from adding more groups in the swarm.

References:
sqseuclidean
[1] M. Yuwono, S.W. Su, B. Moulton, H. Nguyen, "Method for increasing the computational speed of an unsupervised learning approach for data clustering", in Proc 2012 IEEE Congress on Evolutionary Computation, 2012, pp.2957-2964.
[2] M. Yuwono, S.W. Su, B. Moulton, H. Nguyen, "Fast unsupervised learning method for rapid estimation of cluster centroids", in Proc 2012 IEEE Congress on Evolutionary Computation, 2012, pp.889-896.
[3] M. Yuwono, S.W. Su, B. Moulton, & H. Nguyen, "Optimization Strategies for Rapid Centroid Estimation", in Proc 34rd Annual International Conference of the IEEE EMBS, San Diego, 2012, pp.6212-6215.
[4] M. Yuwono, S.W. Su, B. Moulton, & H. Nguyen, "Rapid Centroid Estimation and its Variants", unpublished

Required Products Statistics Toolbox
MATLAB release MATLAB 7.10 (R2010a)
Tags for This File   Please login to tag files.
Please login to add a comment or rating.
Updates
11 Sep 2012

Updated Description

12 Sep 2012

Description updated

05 Nov 2012

Major changes and bug fixes

Contact us