Hierarchical clustering with only merging neighbouring clusters

5 views (last 30 days)
Hi
I want to use a agglomerative hierarchical clustering tree (with Ward's algorithm) in Matlab, i.e. the built in function. But I want that only neighbouring clusters can be merged. How can I achieve this? Or is there perhaps an external toolbox which can do that?

Answers (3)

Image Analyst
Image Analyst on 10 May 2015
It sounds like from the nature of the algorithm that non-neighboring clusters will not be the next pair of clusters to merge. Why do you think it might merge a far distant, non-neighbor cluster into the target cluster when there is a different immediate neighbor cluster that is closer?
  2 Comments
Sepp
Sepp on 10 May 2015
Hmm, yes but in a paper I read the following: "In order to take into account the spatial information, we also add connectivity constraints in the hierarchical clustering algorithm, so that only neighboring clusters can be merged together." Previously it was written that "two clusters are merged if the resulting cluster minimizes the sum of sqared differences of the fMRI signal within all clusters".
The fMRI signal of clusters far apart can be more similar than of neighboring clusters.
Do you now how I can achieve this?
Image Analyst
Image Analyst on 10 May 2015
Sounds like they're somehow clustering first based on just gray level or texture or something, which could possibly get regions no matter where they appear in the image because the criteria has nothing to do with location but only to do with intensity. Then they want to further merge items only if they're spatially close together. It sounds like you have a paper that tells you how to do it so I suggest you follow its directions.

Sign in to comment.


Alfonso Nieto-Castanon
Alfonso Nieto-Castanon on 10 May 2015
I would suggest to request the code from the authors (Thirion et al.? if so, you can start here https://github.com/bthirion/frontiers_2014).
Hierarchical clustering can be stated as an iterative procedure, where you start with each datapoint in a separate cluster, and in each step you find which two clusters best to merge (among all possible pairs between clusters) based on some criterion (in this case trying to keep the similarity of the fMRI signals within each cluster as high as possible). Their modification is simply to limit this search to only spatially-neighboring clusters, instead of searching across all possible cluster pairs, so that the resulting clusters are always spatially contiguous.
  2 Comments
Sepp
Sepp on 13 May 2015
Yes exactly, it is the paper "Which fMRI clustering gives good brain parcellations?" from Thirion et al. I have looked at the code you linked by it is python and I never used python before. They also somehow defined a graph for feeding it into the Ward's algorithm which is not possible in Matlab.
How would you do this in Matlab to limit the search to only spatially-neighboring clusters?
Image Analyst
Image Analyst on 13 May 2015
I would make an attempt to translate the Python code to MATLAB code. Everyone has translated code from one language to another before. Eventually you'll figure it out.

Sign in to comment.


JUNTAO XU
JUNTAO XU on 29 Aug 2019
Edited: JUNTAO XU on 29 Aug 2019
Here I have a naive idea to realise this structuredh hierarchical clustering.
Above all, let me propose a quick demo of the idea
function structured_hac()
T = 1 : 0.2 : 10 * pi;
X = exp(-T / 20) .* cos(T);
Y = exp(-T / 20) .* sin(T);
figure(1);
scatter(X,Y,'filled');
C=[X;Y]';
ANS1=clusterdata(C,'Linkage','ward','Savememory','on','Maxclust',4);
figure(2);
scatter(X,Y,[],ANS1,'filled');
Y1=squareform(pdist(C)); %generate the distance
[m,n]=size(Y1);
Y2=10000*ones(m,n); %core of the idea is this reassginment
for i=1:1:m
for j=1:1:n
if i==j
Y2(i,j)=Y1(i,j);
elseif j==i+1
Y2(i,j)=Y1(i,j);
elseif j==i-1
Y2(i,j)=Y1(i,j);
end
end
end
Z1=linkage(Y1,'ward');
ANS2=cluster(Z1,'Maxclust',4);
figure(3);
scatter(X,Y,[],ANS2,'filled');
figure(4);
dendrogram(Z1);
Z2=linkage(Y2,'ward');
ANS3=cluster(Z2,'Maxclust',4);
figure(5);
scatter(X,Y,[],ANS3,'filled');
figure(6);
dendrogram(Z2);
end
the core of the idea is to firstly use the the pdist() function to generate the distance between all the data in dataset C ,this yield us variable Y1 as in the demo code.
And then we assign those values in Y1, which does not responding to a distance of a pair of neiber data ponit ,to a very big value, such that in the clustering stage, the culster algorithm would not cluster dataset that are not neiber. This reassginment yield us Y2.
with this naive idea,we can got a pretty good result.
result from original HAC withot reassignment.
03.jpg
our result:
02.jpg
from the hierarchical tree we can also see, we have used the minimum modification to achieve the goal——to merge only neighbouring clusters

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!