How to manually set K-means centroids when classifying an image

Hello World (wasn't that what the books told you to print way back when you started doing HTML?...)
I am exploring the kmeans function in MATLAB to classify an RGB image into three classes. I would like to force kmeans to use specific centroid locations. As far as I can tell from the documentation, I should use the 'start' option, but I cannot figure out how to set it correctly. In the images, I want to separate blue sky from water and land. Let's say that I find the sky to have an average RGB value of [120,130,190], water [110,150,150] and land [120,140,120]. Could any of you give an example of how to force kmeans to use these centroids? Thank you in advance for any input!

Accepted Answer

Shashank Prasanna on 27 Mar 2014
If your data matrix X is n-by-p and you want to cluster the data into 3 clusters, then each centroid is 1-by-p. You can stack the centroids for the 3 clusters into a single 3-by-p matrix and pass it to kmeans as the starting centroids:
C = [120,130,190;110,150,150;120,140,120];
I am assuming here that your matrix X is n-by-3.
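A minimal sketch of how this might look end to end, assuming each pixel of the image becomes one row of X. The synthetic image, its size and the noise level are made up for illustration; only the centroid values come from the question. kmeans requires the Statistics and Machine Learning Toolbox.

```matlab
% Starting centroids from the question: sky, water, land (one row each)
C = [120,130,190; 110,150,150; 120,140,120];

% Hypothetical stand-in for a real RGB image: random class per pixel plus noise
rng(0);                                   % make the noise reproducible
labelsTrue = randi(3, 40, 60);            % assumed "true" class of each pixel
img = zeros(40, 60, 3);
for ch = 1:3
    vals = C(:, ch);                      % channel value of each class
    img(:, :, ch) = vals(labelsTrue) + 3*randn(40, 60);
end
img = uint8(img);

% Reshape to n-by-3 (one row per pixel) and convert to double for kmeans
X = double(reshape(img, [], 3));

% C only seeds the iteration; kmeans is still free to move the centroids
[idx, Cfinal] = kmeans(X, 3, 'Start', C);

% Put the cluster labels back into image shape
labelImg = reshape(idx, size(img, 1), size(img, 2));
```

Because the rows of C are given in the order sky, water, land, cluster 1 starts at the sky color, cluster 2 at water and cluster 3 at land. Comparing Cfinal with C shows how far kmeans moved each centroid.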
This is explained in the documentation for kmeans.

More Answers (2)

Tom Lane on 29 Mar 2014
If your goal is to specify the centroids in advance, and not just have kmeans start with them and adjust them as things go along, then I think you don't want to use kmeans at all. Just use pdist2, find the closest centroid for each point, and classify into the cluster defined by the closest centroid.
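A minimal sketch of that nearest-centroid classification, assuming X holds one pixel per row. The four pixel values are hypothetical; the centroids are the ones from the question. pdist2 requires the Statistics and Machine Learning Toolbox.

```matlab
% Fixed centroids from the question: sky, water, land
C = [120,130,190; 110,150,150; 120,140,120];

% Hypothetical pixels, one RGB triple per row
X = [118 131 188;
     111 149 152;
     121 139 119;
     119 141 123];

D = pdist2(X, C);            % n-by-3 matrix of Euclidean distances to each centroid
[~, idx] = min(D, [], 2);    % per row, the index of the closest centroid
```

Here idx(i) is 1, 2 or 3 for sky, water or land. The centroids never move, so the same color always gets the same class from image to image.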
  2 Comments
Andreas Westergaard on 29 Mar 2014
Hi Tom. Thanks a lot for your input. I get your point. The reason I wanted to set initial centroids was to let the classification discover when one of the classes is not present in a given image, and thus have the algorithm define one class fewer.
Image Analyst on 29 Mar 2014
That is the main reason that automatic thresholds are not always robust. If you have to find something that can range anywhere from 0% to 100% of an image, then thresholds that are picked automatically, or clustering that forces you to pick a certain number of clusters, are not robust. They will fail if the image does not contain the proper number of pixels belonging to those classes. For most or all of my color classification applications I use fixed values to determine the class. I use a training set to determine where the classes will be, and once I decide on them, they are fixed for all images. That way I can get area fractions for all color classes no matter whether a class is absent, covers 100% of the image, or falls somewhere in between. If your image had only one real cluster and you told kmeans to find 4, it would still return 4, chopping that single cluster into 4 pieces. If the 3 other "real" colors were actually present, it would find all 4 accurately; in the first case, though, it reports 4 clusters where there should be only one.
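A rough sketch of the fixed-class idea, assuming the class centers were decided beforehand. The centroids from the question stand in for values derived from a training set, and the pixel values are hypothetical, chosen so that no water is present.

```matlab
% Fixed class centers (sky, water, land), assumed to come from a training set
C = [120,130,190; 110,150,150; 120,140,120];

% Hypothetical image pixels: two sky-like, two land-like, no water
X = [120 130 190;
     121 129 191;
     120 140 120;
     119 141 121];

% Assign each pixel to its nearest fixed center
[~, idx] = min(pdist2(X, C), [], 2);

% Area fraction per class; the [3 1] size keeps a zero entry for absent classes
fractions = accumarray(idx, 1, [3 1]) / numel(idx);
```

Because the number of classes is fixed by C rather than estimated from the image, an absent class simply shows up with an area fraction of 0 instead of being split off from another class.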



Image Analyst on 27 Mar 2014
Please mark the Answer as accepted if that's what you were looking for. Thanks.
  3 Comments
Image Analyst on 27 Mar 2014
Why don't you just manually segment these things? kmeans is appropriate if you always have the same number of color classes but they move around in color space from image to image. If you have known classes, for example you know you'll always have clouds, sky, water, sand, and grass, then it's best to define those regions in color space and segment according to them. What are you going to do if you have 5 classes like the ones I listed, and you tell it there are only 3 classes (sky, water, land)? It will fail.
Perhaps you'd like to use this approach (I haven't tried it):
Andreas Westergaard on 29 Mar 2014
Edited: Andreas Westergaard on 29 Mar 2014
Hi "Image Analyst", I tried the segmentation you suggested and it looks promising. I will pursue it a bit more. My initial idea of using kmeans was because I need to process images under different light conditions. Thank you again for your valuable input. By the way, I tried to accept your answer as well, but apparently I am only allowed to accept one answer. I gave a vote instead...

