The number of rows in X must match the length of CLUST

I'm following the "Create silhouette plot" example, but using fcm for the clustering.
I did the following:
I=imread('image.png');
I2=im2double(I);
I3=I2(:);
cidx = fcm(I3,5);
But, when I run the following command:
silhouette(I3,cidx)
I get the following error:
Error using silhouette (line 82)
The number of rows in X must match the length of CLUST.
How can I fix that?
Thanks.

Answers (3)

Wayne King on 31 Dec 2013
Edited: Wayne King on 31 Dec 2013
rng default;
X = [randn(10,2)+2*ones(10,2); randn(10,2)-2*ones(10,2)];
[Cntr,U] = fcm(X,2);
Cluster centers are
Cntr
-1.6498 -2.4173
2.7346 2.7327
Now look at the cluster membership for the first ten rows of X
U(:,1:10)
Clearly, X(1,:) belongs to cluster 2 because 0.8603 is much larger than 0.1397, and X(2,:) even more so because 0.9293 is close to 1. So you would create an idx vector with a 2 for the first row of X and a 2 for the second row of X, and continue in that way.
If you look at:
U(:,11)
0.9454
0.0546
that means that X(11,:) belongs to cluster 1.
As the documentation explains, the closer the value is to 0, the less an observation is like that cluster; the closer to 1, the more it is like it.
You have five clusters, so U will have five rows and one column per data point. For each column, take the row with the largest value as the cluster. Of course, an observation might not clearly belong to any one cluster.
Ultimately, you want to use the information in U to create a vector with the same number of rows as your input, where each element is 1, 2, 3, 4, or 5 and indicates cluster membership.
To form the cluster membership vector, you can do something like:
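X = [randn(10,2)+2*ones(10,2); randn(10,2)-2*ones(10,2)];
[Cntr,U] = fcm(X,2);       % U is 2-by-20: one row per cluster, one column per point
[~,cidx] = max(U,[],1);    % row index of the largest membership in each column
cidx = cidx';              % column vector with one cluster label per row of X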
Thank you for accepting my answer if I have helped you.
  2 Comments
med-sweng on 31 Dec 2013
For instance, my input has the following number of rows: 155160
Do you mean going through the whole membership matrix and taking the maximum value for each data point, which will then represent the cluster? Should I keep the maximum value itself, or assign the corresponding cluster number (i.e., 1, 2, 3, 4, 5)?
Thanks.
Wayne King on 31 Dec 2013
I've shown you exactly how to do it for 2 clusters above, so just change the 2 to a 5. But keep in mind that taking the maximum doesn't mean that these elements are going to be clear members of one cluster or another.
For example:
0.12 0.14 0.13 0.11 0.10
0.14 is the maximum so that would be cluster 2, but there's not a clear winner.
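One possible way to flag such unclear cases is to compare the largest membership with the runner-up. The following is only a sketch: the matrix U and the 0.1 threshold are made up for illustration (they are not fcm output), and the right threshold would depend on the data.

```matlab
% Hypothetical 5-by-3 membership matrix (one column per data point,
% each column sums to 1). Values are invented for illustration.
U = [0.90 0.19 0.05;
     0.03 0.22 0.05;
     0.03 0.21 0.80;
     0.02 0.19 0.05;
     0.02 0.19 0.05];

[~,cidx] = max(U,[],1);                 % hard assignment: best cluster per point
Usorted  = sort(U,1,'descend');         % memberships of each point, largest first
margin   = Usorted(1,:) - Usorted(2,:); % gap between best and second-best cluster

threshold = 0.1;                        % assumed cutoff, not from the thread
ambiguous = margin < threshold;         % true where there is no clear winner
```

Here the second column is flagged because its two largest memberships (0.22 and 0.21) are nearly tied, while the first and third columns have a clear winner.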



Wayne King on 31 Dec 2013
Edited: Wayne King on 31 Dec 2013
You have to tell us what the lengths of I3 and cidx are.
The number of rows in I3 has to match the number of rows in cidx.
The first output of fcm() (the one you are using here) is not going to give a length that is equal to the length of the input. That just gives you the cluster centers.
You would have to come up with a vector of cluster memberships for each row in I3 based on the output of fcm() -- you could use the optional output U to assign a cluster membership to each row of I3.
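As a rough sketch of that suggestion, assuming the Fuzzy Logic Toolbox (for fcm) and the Statistics Toolbox (for silhouette) are installed and the fcm(data,Nc) call form used in this thread is available: the small random matrix below is a made-up stand-in for the image in the question, so the sizes differ from the real data.

```matlab
% Stand-in for im2double(imread('image.png')); the pixel values are made up.
rng default;
I2 = rand(20,30);
I3 = I2(:);                  % 600-by-1 column of pixel values

[centers,U] = fcm(I3,5);     % centers: 5-by-1 cluster centers, U: 5-by-600 memberships
disp(size(centers));         % 5 rows, which is why silhouette rejected this output
disp(size(U));               % one column per row of I3

[~,cidx] = max(U,[],1);      % cluster with the largest membership for each pixel
cidx = cidx';                % 600-by-1, one label per row of I3

silhouette(I3,cidx)          % lengths now match, so the plot is produced
```

On the full 155160-pixel vector, silhouette will likely be slow because it relies on distances between points, so it may be worth trying a random subset of pixels first.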
  1 Comment
med-sweng on 31 Dec 2013
Edited: med-sweng on 31 Dec 2013
Thanks for your reply. The numbers of rows are as follows:
I3 = 155160
cidx = 5
What should I do in this case? Can I "pad" cidx with empty rows to solve the issue?



Wayne King on 31 Dec 2013
No, did you read my answer? Use the optional output U from fcm(), then assign cluster membership based on the grade of membership. From the help for fcm():
"The membership function matrix U contains the grade of membership of each DATA point in each cluster. The values 0 and 1 indicate no membership and full membership respectively. Grades between 0 and 1 indicate that the data point has partial membership in a cluster."
  1 Comment
med-sweng on 31 Dec 2013
Yes, I checked the fcm documentation, but it is still not clear to me how to do what you suggest. I would appreciate it if you could clarify that. Thanks.

