Help in K-means Clustering for large data set with 13 attributes

Hi Mo Chen, I need some help with k-means. I have huge data sets with 13 attributes, such as:
Time
Heading Of Ship
Ship Speed
GPS Position - Degree Of Longitude
GPS Position - Degree Of Latitude
Roll
Roll Rate
Yaw Rate
Pitch
Pitch Rate
Heave
Vertical Velocity
I am confused about how to plot the results. I need to find some modes in the given data. What should I put on the X-axis and Y-axis? Any suggestions will be highly appreciated. I have written my own k-means algorithm by following a tutorial, tested it against the example given in MATLAB, and it works fine.
The example used by MATLAB in this link has only petal length and petal width, so it is easy to run k-means. But in my case there are 13 attributes, so I am confused.
Thank you in advance.

Answers (1)

Walter Roberson on 17 Jun 2015
When you have more attributes than will fit on one plot, you take all possible subsets of a given size and plot each of those subsets. Yes, that does mean 12*13/2 = 13*6 = 78 plots if you plot two per subset.
If you plot 3 per subset and try to avoid plotting any two together that have already been plotted together, then you can do it with 27 plots, of which 5 contain a pair that was used at least once before. The subsets are
[1 2 3; 1 4 5; 1 6 7; 1 8 9; 1 10 11; 1 12 13; 2 4 6; 2 5 7; 2 8 10; 2 9 11; 2 12 13; 3 4 7; 3 5 6; 3 8 11; 3 9 10; 3 12 13; 4 8 12; 4 9 13; 5 8 13; 5 9 12; 5 10 11; 6 8 9; 6 10 12; 6 11 13; 7 8 9; 7 10 13; 7 11 12]
This is probably more difficult to read, as it can require the user to rotate 3-dimensional plots to see how pairs of variables relate to each other. It is also more difficult for the user to find the plot with the pair of variables they want.
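As a rough illustration of the counting above, here is a small sketch that enumerates the attribute pairs and checks which pairs a list of triples actually covers. It is written in Python rather than MATLAB purely for illustration; the names `TRIPLES` and `pair_coverage` are made up for this example, and the attributes are simply numbered 1 to 13 in the order they would appear as data columns.

```python
from itertools import combinations

# The 27 triples listed above, with attributes numbered 1..13.
TRIPLES = [
    (1, 2, 3), (1, 4, 5), (1, 6, 7), (1, 8, 9), (1, 10, 11), (1, 12, 13),
    (2, 4, 6), (2, 5, 7), (2, 8, 10), (2, 9, 11), (2, 12, 13),
    (3, 4, 7), (3, 5, 6), (3, 8, 11), (3, 9, 10), (3, 12, 13),
    (4, 8, 12), (4, 9, 13), (5, 8, 13), (5, 9, 12), (5, 10, 11),
    (6, 8, 9), (6, 10, 12), (6, 11, 13), (7, 8, 9), (7, 10, 13), (7, 11, 12),
]


def pair_coverage(triples, n_attributes=13):
    """Return (missing, repeated) attribute pairs for a list of triples.

    missing  -- pairs that never appear together in any triple
    repeated -- pairs that appear together in more than one triple
    """
    all_pairs = set(combinations(range(1, n_attributes + 1), 2))
    counts = {}
    for triple in triples:
        # Each 3-D plot shows three pairs of attributes at once.
        for pair in combinations(sorted(triple), 2):
            counts[pair] = counts.get(pair, 0) + 1
    missing = sorted(all_pairs - set(counts))
    repeated = sorted(p for p, c in counts.items() if c > 1)
    return missing, repeated


# Number of 2-D plots needed if every pair gets its own plot.
n_pair_plots = len(list(combinations(range(1, 14), 2)))
print(n_pair_plots)  # 78

missing, repeated = pair_coverage(TRIPLES)
print("never plotted together:", missing)
print("plotted together more than once:", repeated)
```

For the list as posted, this check reports that attributes (4, 10) and (4, 11) are never plotted together, so one further plot, for example of attributes 4, 10 and 11, would be needed to cover every pair.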
  1 Comment
Bikram Kawan on 20 Jun 2015
I understand. I have been trying to plot different combinations, and the results are very difficult to study. When I run the built-in kmeans function of MATLAB from this link,
the function groups my data into the number of clusters I defined. Is this a good method, or do I need to plot only 2 attributes at a time?
