kmeans clustering of matrices
Hi All,
I have a 12-by-190 cell array. Each cell contains a complex matrix of size n-by-550 (assuming each row is an observation of 550 variables; the number of observations n varies from cell to cell, but the variables are the same for every matrix). I need to classify these matrices using kmeans, and I am trying to cluster the large combined matrix (i.e., 12*190*n*550) rather than working with each matrix separately.
Any idea how I can do that? Any method better than kmeans to cluster these data? Any input would be appreciated.
11 Comments
the cyclist
on 4 Jun 2021
I think you'll need to give us a lot more context, for us to be able to help. K-means clustering is typically (always?) used on a number of observations ("n"), each of which has a number of features/variables ("p").
So, I would understand how to apply K-means to one of your n-by-550 arrays, because you have n observations and p=550 features. (I guess I would handle the complex components as two separate features, so maybe n-by-1100?)
But I have no idea what to do with the 12-by-190 cell. What do those 12 and 190 represent? Do you just want to make one large (but 2-dimensional) array, by concatenating each individual matrix? ...
[M{1,1}; M{1,2}; ...
such that you have a (sum of all the individual n values from the 12-by-190 smaller matrices)-by-550 matrix, with lots and lots of observations, but still 550 features?
You seem to want to cluster matrices, not observations ... but I don't really know what that means.
Please give us more context and detail.
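For what it's worth, the "one large 2-dimensional array" reading above could be sketched roughly as follows. This is only an illustration under assumptions: the cell array `M` here is a small random stand-in (2-by-3 instead of 12-by-190), and the names `X`, `Xri`, `nRows`, and `cellIdx` are made up for the example. Splitting real and imaginary parts into separate columns is the n-by-1100 idea mentioned above, not something the original poster confirmed.

```matlab
% Toy stand-in for the 12-by-190 cell array (sizes shrunk; data are random).
rng(0);
M = cell(2, 3);
for i = 1:numel(M)
    n = randi([3 6]);                               % n varies from cell to cell
    M{i} = complex(randn(n, 550), randn(n, 550));
end

% Stack all matrices vertically: (sum of all n)-by-550.
% M{:} expands in column-major order; row order does not matter for clustering.
X = vertcat(M{:});

% Treat real and imaginary parts as separate features: (sum of all n)-by-1100.
Xri = [real(X), imag(X)];

% Record which cell each row came from, so rows can be traced back later.
nRows   = cellfun(@(A) size(A, 1), M(:));
cellIdx = repelem((1:numel(M))', nRows);
```

With `cellIdx` in hand, any per-row result can be mapped back to the matrix it came from, which matters if the end goal is a statement about matrices rather than about individual observations.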
the cyclist
on 4 Jun 2021
Wish you had mentioned the labels earlier. :-)
OK, so each matrix is the result of an experiment. And each experiment results in n measurements of 550 features. (The value of n can vary for each experiment.) Each experiment also results in a label.
Then, given a new matrix (with unknown label), you want to assign the correct label.
The major stumbling block (at least in my mind) here is that your measured variables are features of the observations, not of the matrices. If you want to predict the label of an unseen matrix, you need features of the matrices. Presumably you can build features of the matrices from the features of the observations, but I'm not sure how that would work. (Specifically, I don't see how k-means helps.)
I think I would try to simplify this, to really sort out the specifics of how to do this. For example:
- imagine you have the same n for all matrices (and imagine it is small, like 5)
- instead of 550 features, suppose you only have 3
- instead of 12x190 matrices, just fix that number to something like 10
- instead of 11 labels, maybe just 2 or 3
Then think through what you really mean by "some matrices are more similar to each other, and therefore should have the same label". That thinking might help you see the proper mathematical method for getting there.
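A toy version of the simplified setup above might look something like the sketch below. Everything in it is an assumption made for illustration: the data are synthetic, the two hidden groups are separated artificially, and summarizing each matrix by its column means is just one possible way to turn features of the observations into features of the matrices. It is not necessarily the right summary for the real data.

```matlab
% Toy setup: 10 matrices, n = 5 rows each, 3 features, 2 hidden groups.
rng(1);
numMat = 10; n = 5; p = 3;
trueGroup = [ones(5, 1); 2 * ones(5, 1)];
M = cell(numMat, 1);
for i = 1:numMat
    offset = 5 * (trueGroup(i) - 1);    % group 2 is shifted by 5 in every feature
    M{i} = randn(n, p) + offset;
end

% One feature vector per matrix: the column means (an assumed summary).
F = cell2mat(cellfun(@(A) mean(A, 1), M, 'UniformOutput', false));   % 10-by-3

% Cluster the matrices, not the individual observations.
% kmeans requires Statistics and Machine Learning Toolbox.
idx = kmeans(F, 2, 'Replicates', 5);
```

Here each row of `F` describes one matrix, so `idx` assigns a cluster to each matrix. Whether column means (or any other summary) capture what "similar matrices" means is exactly the question to think through before scaling up.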
the cyclist
on 4 Jun 2021
Edited: the cyclist
on 4 Jun 2021
Sorry, but this makes no sense to me.
Suppose your experiment is on humans, and instead of 550 features, you have just 3 features: Height, Hair color, and Eye color. You do the experiment with this person, and assign label 4.
You are saying that "4" is assigned to height, hair color, and eye color.
Now, you add a new feature: Body temperature. And you want a different label for some reason?
I don't follow your logic.
Image Analyst
on 5 Jun 2021
And I'm still hung up on the data being complex numbers: "Each cell contains a complex matrix". So is each of the n-by-550 numbers complex? What do the real parts represent? What do the imaginary parts represent? I don't know that I've heard of kmeans being applied to complex numbers, though maybe it could be.
Aside from that, why did you choose kmeans as your classification algorithm? Did you try the Classification Learner app and learn from that that kmeans was the most accurate? If so, it exports the code for you.
Walter Roberson
on 5 Jun 2021
kmeans() does reject complex matrices.
Image Analyst
on 5 Jun 2021
OK, so you're just going to consider the real part of the complex numbers. So, how many clusters do you believe there to be? What did you put in for k (if you put in anything)? Do you think there are 3 clusters? 6? 100? Or no idea?
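If the answer is "no idea", one common way to get a rough suggestion for k is to compare several candidate values with a cluster-validity criterion. The sketch below uses `evalclusters` with the silhouette criterion on synthetic real-valued data; the data, the candidate range 2:6, and the variable names are assumptions for illustration only, and the suggested k is a heuristic rather than a ground truth.

```matlab
% Synthetic real-valued data with three well-separated groups (illustration only).
rng(2);
X = [randn(50, 2); randn(50, 2) + 8; randn(50, 2) + [8, -8]];

% Compare k = 2..6 using the silhouette criterion.
% evalclusters and kmeans require Statistics and Machine Learning Toolbox.
eva   = evalclusters(X, 'kmeans', 'silhouette', 'KList', 2:6);
kBest = eva.OptimalK;     % the candidate k with the best criterion value
```

On real data the criterion values in `eva.CriterionValues` are worth plotting against k, since a flat curve suggests there is no clear cluster structure at all.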
Susan
on 5 Jun 2021