How to build a matrix that organizes samples according to distance?

Hello,
I want to create a matrix that organizes the input samples (51 features * 652 samples) according to the distance between them, so that samples that are close to each other appear consecutively in the matrix. Note that these samples are vectors of extracted features (not binary). Any ideas on how I can do that?
  2 Comments
Matt J on 21 Jul 2018
Edited: Matt J on 21 Jul 2018
Generally speaking, that will be an impossible task. Suppose you had a 2D feature space with sample points that form a circle. The only way to accomplish what you want for the majority of the columns is to scan clockwise or counter-clockwise around the circle perimeter and re-order columns so that neighbors encountered along the circle perimeter form neighboring columns in the sorted matrix. But then the first and last column would end up maximally far apart in the matrix, even though they are neighbors according to 2D Euclidean distance.
It gets even more complicated in higher dimensions. Suppose your points are spread out uniformly over the surface of a sphere. How could you order them as you've described?
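The circle example can be checked numerically. The following is only a minimal sketch under stated assumptions: the point count n and the evenly spaced unit-circle layout are hypothetical choices made for illustration, not part of the original question.

```matlab
% Hypothetical illustration: n points evenly spaced on the unit circle,
% stored one sample per column and already ordered counter-clockwise.
n = 12;                              % assumed number of samples
theta = (0:n-1) * 2*pi/n;            % angles in increasing order
X = [cos(theta); sin(theta)];        % 2-by-n matrix of samples

% Euclidean distance between each pair of adjacent columns
stepDist = sqrt(sum(diff(X, 1, 2).^2, 1));

% Euclidean distance between the first and the last column
wrapDist = norm(X(:,1) - X(:,end));

fprintf('largest distance between adjacent columns: %.4f\n', max(stepDist));
fprintf('distance between first and last column:    %.4f\n', wrapDist);
```

Under these assumptions the two printed values agree: the first and last columns are as close in feature space as any adjacent pair, yet they sit n-1 columns apart in the matrix.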
Image Analyst on 21 Jul 2018
I agree with Matt (see his answer below, by the way). Let's say you had only 2 features and 652 measurements (observations), and you made a scatterplot of the normalized values of those 652 (x,y) points. It might look like a shotgun blast. Now let's say you used pdist2() to compute the distance from every single point to every other point. You would then have 652*651 distances (each pair of points counted twice, once in each direction). Which of those points in the shotgun blast would be at the top of your sorted list, and which would be at the bottom? You have two points that are closest to each other, and two points that are farthest apart from each other (and these are not necessarily different: the same point could be in both pairs). And how would you choose the order of the points in between?
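To make the counting concrete, here is a rough sketch under stated assumptions. The random points P are hypothetical stand-in data, and the distances are computed with base MATLAB (implicit expansion, R2016b or later) rather than pdist2(), so that no toolbox is needed; pdist2(P, P) from the Statistics and Machine Learning Toolbox should produce the same matrix.

```matlab
% Hypothetical stand-in for the "shotgun blast": 652 random (x,y) points,
% one point per row.
n = 652;
P = rand(n, 2);

% Full n-by-n Euclidean distance matrix, base MATLAB only
dx = P(:,1) - P(:,1).';
dy = P(:,2) - P(:,2).';
D  = sqrt(dx.^2 + dy.^2);

% Count the distances between distinct points
offDiag    = ~eye(n);                % true everywhere except the diagonal
numOrdered = nnz(offDiag);           % ordered pairs (i,j) with i ~= j
numUnique  = numOrdered / 2;         % unordered pairs

% Closest pair: mask the zero diagonal so a point is not matched to itself
Dmasked = D;
Dmasked(~offDiag) = Inf;
[dMin, kMin] = min(Dmasked(:));
[iMin, jMin] = ind2sub([n n], kMin);

% Farthest pair
[dMax, kMax] = max(D(:));
[iMax, jMax] = ind2sub([n n], kMax);

fprintf('%d ordered pairs, %d unique pairs\n', numOrdered, numUnique);
fprintf('closest pair:  (%d,%d)\n', iMin, jMin);
fprintf('farthest pair: (%d,%d)\n', iMax, jMax);
```

The distance matrix identifies a closest pair and a farthest pair, but nothing in it dictates a single linear order for the remaining points.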

Accepted Answer

Matt J on 21 Jul 2018
Edited: Matt J on 21 Jul 2018
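You can use the attached M-file (interdists) to generate a 652x652 inter-column distance matrix. Then you can sort your data (somehow) based on that. For example,
% interdists is the attached M-file; G is the inter-column distance matrix
G=interdists(yourMatrix,'noself');
% sortrows orders the rows of G lexicographically; idx is the resulting permutation
[~,idx]=sortrows(G);
% reorder the columns (samples) of the data with that permutation
sortedMatrix=yourMatrix(:,idx);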
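
For readers without the attached file, the same three steps can be approximated with base MATLAB. This is only a sketch under stated assumptions, not the attached interdists implementation: the stand-in data, the Euclidean metric, and the zero diagonal are all assumptions, and the effect of the 'noself' option is not reproduced here. It relies on implicit expansion (R2016b or later).

```matlab
% Hypothetical stand-in data: 51 features x 20 samples, one sample per column
yourMatrix = rand(51, 20);

% Inter-column Euclidean distances using
% ||a-b||^2 = ||a||^2 + ||b||^2 - 2*a'*b
sq = sum(yourMatrix.^2, 1);                      % 1-by-n squared norms
D2 = sq.' + sq - 2*(yourMatrix.' * yourMatrix);  % n-by-n squared distances
D  = sqrt(max(D2, 0));                           % clamp round-off negatives
D(1:size(D,1)+1:end) = 0;                        % exact zeros on the diagonal (assumption)

% Same sorting step as in the answer, applied to D instead of G
[~, idx] = sortrows(D);
sortedMatrix = yourMatrix(:, idx);
```

With a zero diagonal, sortrows compares the first column of D first, so this particular choice orders the samples by their distance to the first sample (later columns only break ties). That is one arbitrary way to fill in the "somehow"; as the comments above point out, it does not guarantee that every pair of adjacent columns is close.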
