How can I label my clusters (k-means) ?

I work on diffusion MRI data, and I am trying to do a connectivity-based classification based on --omatrix2 using k-means. FSL (the diffusion MRI software) provides the following script to run the clustering algorithm in MATLAB:
% Load Matrix2
x=load('fdt_matrix2.dot');
M=full(spconvert(x));
% Calculate cross-correlation
CC = 1+corrcoef(M');
% Do kmeans with k clusters
idx = kmeans(CC,k); % k is the number of clusters
% Load coordinate information to save results
addpath([getenv('FSLDIR') '/etc/matlab']);
[mask,~,scales] = read_avw('fdt_paths');
mask = 0*mask;
coord = load('coords_for_fdt_matrix2')+1;
ind = sub2ind(size(mask),coord(:,1),coord(:,2),coord(:,3));
[~,~,j] = unique(idx);
mask(ind) = j;
save_avw(mask,'clusters','i',scales);
!fslcpgeom fdt_paths clusters
I would like to run this clustering algorithm for each subject, and then average over all subjects.
In order to do that, I need to label the clusters consistently - for example, the cluster with the smallest sum of distances to its centroid should always be labelled 1, and the one with the largest sum of distances to its centroid should always be labelled 2.
I tried the following code:
for idx = kmeans(CC,2,'Display', 'final')
if 'sumd' < 'final'
idx == 1
else
idx == 2
end
end
But of course it doesn't work...
Error using d_matrix2_cluster (line 9)
Error using lt
Matrix dimensions must agree.
When I replace 'final' with the output value shown in my command window (e.g. 2050), it runs. So I thought of using evalc(), but I cannot get that to work either.
I am very new to MATLAB, and a bit stuck - I would be very grateful to have the community's help on this. Thank you!
Link to the fsl script: https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FDT/UserGuide

2 Comments

You forgot to attach 'fdt_matrix2.dot'.
FK on 29 Jun 2019
Edited: FK on 29 Jun 2019
Indeed, sorry ! You should be able to see it now.


 Accepted Answer

Exactly what is this supposed to do:
for idx = kmeans(CC,2,'Display', 'final')
Looks like nonsense to me. kmeans() returns a class number (label) for each data point in the CC set of data. Those are ALREADY the labels, so what is the for loop for? Not only that, but why are you testing idx in the loop? You don't even assign the result of the idx==1 test; it will be a vector of true or false as long as the number of rows in CC. Makes absolutely no sense to me.

3 Comments

k-means indeed labels the clusters "1" and "2" - but I am not sure what those labels represent.
In order to be able to average over all subjects, I need consistent cluster labelling, where "1" always refers to the cluster with - for example - the smallest sum of distances to the centroid (but maybe that is already the case?), and not just some random label allocation.
I am absolutely not bound to this loop - it is the only thing I could think of to get consistent clustering across all subjects. But if you know of an alternative way to label clusters consistently, that would of course be great!
kmeans() assigns labels arbitrarily each time you run it, so a cluster with certain attributes might not have the same label the next time you run it.
If you want consistent labels, you'll have to compute the attribute you want, like mean distance from the cluster centroid, and then relabel the labels you initially got from kmeans. So if you have initialIndexes from kmeans(), you'll need an output label vector like finalIndexes or whatever.
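The relabelling described above could be sketched like this in MATLAB. The third output of kmeans, sumd, is a k-by-1 vector whose j-th entry is the within-cluster sum of point-to-centroid distances for cluster j, which gives the association between labels and their sums. The variable names (initialIndexes, finalIndexes) follow the comment above and are illustrative, not from the FSL script:

% Hedged sketch: relabel clusters so that label 1 is always the cluster
% with the smallest within-cluster sum of distances, label 2 the next, etc.
[initialIndexes, C, sumd] = kmeans(CC, 2);  % CC as in the FSL script
[~, order] = sort(sumd);           % order(1) is the tightest cluster's old label
relabel = zeros(size(order));
relabel(order) = 1:numel(order);   % map old label -> rank-based new label
finalIndexes = relabel(initialIndexes);  % consistent labels across runs/subjects

finalIndexes can then replace idx in the rest of the FSL script (the unique/mask steps) before saving the cluster image.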
Thanks a lot for your answer.
I get how to do the first part (i.e. compute sumd for each cluster), but I am stuck on the second part, as I do not have any association between the index value (1 or 2) and the corresponding sumd value. For example, I do not know whether the high sumd value corresponds to cluster 1 or 2.
I am not sure how to go around this issue - may I please ask you how you would do the relabelling?


More Answers (0)

Asked: FK on 27 Jun 2019
Commented: FK on 1 Jul 2019
