Linkages other than Ward in evalclusters

Hi,
Is there any way to change the linkage to another method (e.g., average, complete, single, centroid) in evalclusters?
Right now it says that the linkage is set to Ward by default.
Please advise.

Answers (1)

Hamoon on 14 Sep 2015
Why don't you use linkage or clusterdata?

4 Comments

Tigran on 14 Sep 2015
Edited: Tigran on 14 Sep 2015
I do use linkage for the clustering itself.
However, after clustering the data I need to evaluate/validate the resulting clusters. In other words, I would like to find the optimal number of clusters for each linkage used. Internal validation criteria (e.g., Calinski-Harabasz, Silhouette, or Davies-Bouldin) take care of this. But evalclusters sets the linkage to Ward by default (if Euclidean distance is used as the metric), and I was wondering if this can be modified.
Here is an excerpt from the description: 'If Clust is 'linkage', and Distance is either 'sqEuclidean' or 'Euclidean', then the clustering algorithm uses Euclidean distance and Ward linkage.'
Yes, you can change it. Define a function handle that calls clusterdata and pass it to evalclusters. For example:
myfunc = @(x,k) clusterdata(x,'linkage','average','maxclust',k);
eva = evalclusters(x,myfunc,'CalinskiHarabasz',...
    'klist',1:6);
In myfunc, x is the input data and k is the number of clusters. evalclusters then evaluates the clustering that myfunc produces (here, hierarchical clustering with average linkage) for each k in 'klist'.
You can change the linkage options in myfunc. For example, you can write this:
myfunc = @(x,k) clusterdata(x,'linkage','weighted','maxclust',k);
Here is a complete example:
load fisheriris;
myfunc = @(x,k) clusterdata(x,'linkage','weighted','maxclust',k);
eva = evalclusters(meas,myfunc,'CalinskiHarabasz',...
    'klist',1:6);
To find out which options you have for myfunc when you want to use linkage, check the documentation for linkage and clusterdata.
Is that clear enough?
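To compare several linkages in one go, the same function-handle idea can be wrapped in a loop. This is only a minimal sketch, assuming the Statistics and Machine Learning Toolbox is installed; the variable names linkMethods and clustfun are placeholders chosen for illustration, and the list of methods is just a sample:

```matlab
load fisheriris                      % provides meas (150-by-4 numeric matrix)

% Linkage methods to compare (illustrative selection)
linkMethods = {'single','complete','average','weighted'};
klist = 2:6;                         % candidate numbers of clusters

for i = 1:numel(linkMethods)
    m = linkMethods{i};
    % Handle with the (data, k) signature that evalclusters expects
    clustfun = @(x,k) clusterdata(x,'linkage',m,'maxclust',k);
    eva = evalclusters(meas,clustfun,'CalinskiHarabasz','klist',klist);
    fprintf('%-9s optimal k = %d\n', m, eva.OptimalK);
end
```

The loop starts 'klist' at 2 because the Calinski-Harabasz criterion is not defined for a single cluster. Swapping 'CalinskiHarabasz' for 'silhouette' or 'DaviesBouldin' should work the same way.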
Hello, can't we estimate the number of clusters k before doing the clustering? This is unclear to me, especially why 'maxclust' is used with k clusters without knowing in advance the best clustering method and number of clusters. Could you please explain in more detail?
Chris on 14 Feb 2017
Edited: Chris on 14 Feb 2017
I think 'maxclust' is the maximum number of clusters that clusterdata forms in a single call, and evalclusters calls the function once for each value in 'klist', passing that value as k. Please correct me if I'm wrong.
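For what it's worth, here is a small sketch of how 'maxclust' and 'klist' relate, again assuming the Statistics and Machine Learning Toolbox and using the placeholder name clustfun. A single call to the handle asks clusterdata for at most k clusters; evalclusters then tries each value in 'klist' and reports which one scores best, so the number of clusters does not need to be known beforehand:

```matlab
load fisheriris                      % provides meas

clustfun = @(x,k) clusterdata(x,'linkage','average','maxclust',k);

% One direct call: request at most 3 clusters
idx = clustfun(meas,3);
disp(numel(unique(idx)))             % number of clusters actually formed

% evalclusters evaluates the handle for each k in 'klist'
eva = evalclusters(meas,clustfun,'CalinskiHarabasz','klist',2:6);
disp(eva.InspectedK)                 % the k values that were evaluated
disp(eva.CriterionValues)            % criterion value for each k
disp(eva.OptimalK)                   % k with the best criterion value
```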


Asked: on 14 Sep 2015
Edited: on 14 Feb 2017
