PCA Analysis for clustering

arun on 1 May 2012
Hello,
I have a dataset with 5 columns and 7500 rows. I need to find the minimum number of principal components needed to partition the data into the best number of clusters. I used the princomp command to calculate the eigenvalues of the principal components, but I cannot work out how many principal components are needed for the partition. Which parameter should I use? Please answer as soon as possible. Thanks.
Arun
  1 Comment
Geoff on 1 May 2012
My understanding of principal components is that they show you the most significant orthogonal axes within your data. To me, that is a different thing from clustering. My approach to clustering is to run k-means with several different values of k, and then devise a suitable metric (the hard part) to measure how well the data is clustered for each value of k. I then choose the clustering (and k) that best satisfies the metric. There are probably more scientific ways to go about it.
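
As a rough illustration of the two steps discussed in this thread, the sketch below first uses the PCA eigenvalues to pick a number of components, then runs k-means for several values of k and scores each run. Everything specific in it is an assumption made for the example, not something stated in the thread: the synthetic matrix X standing in for the 7500-by-5 dataset, the 95% cumulative-variance cutoff, the range of k, and the mean silhouette value as the metric. It uses pca from the Statistics Toolbox; the older princomp from the question returns the same first three outputs.

```matlab
% Illustrative sketch only. X is synthetic stand-in data (assumption),
% not the poster's dataset: three shifted Gaussian blobs, 7500-by-5.
rng(1);                                        % reproducible random numbers
X = [randn(2500,5); randn(2500,5) + 4; randn(2500,5) - 4];

% PCA on standardized data. latent holds the eigenvalues, i.e. the
% variance captured by each principal component, in descending order.
% With the older function: [coeff, score, latent] = princomp(zscore(X));
[coeff, score, latent] = pca(zscore(X));

% Cumulative fraction of variance explained by the first n components.
explained = cumsum(latent) / sum(latent);

% Smallest number of components reaching the cutoff (0.95 is an assumption).
nComp = find(explained >= 0.95, 1);

% Run k-means on the retained component scores for several k and score
% each clustering by its mean silhouette value (one possible metric).
Z = score(:, 1:nComp);
kValues = 2:6;                                 % candidate k values (assumption)
meanSil = zeros(size(kValues));
for i = 1:numel(kValues)
    idx = kmeans(Z, kValues(i), 'Replicates', 5);   % best of 5 random starts
    meanSil(i) = mean(silhouette(Z, idx));          % values lie in [-1, 1]
end

% Pick the k with the highest mean silhouette.
[~, best] = max(meanSil);
bestK = kValues(best);
fprintf('Components kept: %d, chosen k: %d\n', nComp, bestK);
```

Note that a variance cutoff only says how much of the spread the kept components capture. It does not guarantee that the cluster structure lies along those components, so it may be worth repeating the k-means step with a different nComp and comparing the scores.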


Answers (0)
