use a custom distance with the kmeans

Question

0 votes

Hello everyone.

I'm not very good with MATLAB, so I am asking for help. For a university project I need to group users who are furthest away from each other within a rectangular area. I am using kmeans and I see two possibilities: the first is to write a custom distance function, but I have read that for that I should use kmedoids; the second is to pass it a custom distance matrix.

At the moment I am following the second path, but I do not understand how to do it. Here is the code I have so far.

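N = 10;                                % number of users
x = rand(N,1)*5;                       % x coordinates inside a 5-by-2.5 rectangle
y = rand(N,1)*2.5;                     % y coordinates
figure
scatter(x,y)
M = [x,y];                             % N-by-2 matrix of points
num_medoids = 1;
eucli_dis = pdist(M);                  % pairwise Euclidean distances (vector form)
eucli_dis = squareform(eucli_dis);     % N-by-N symmetric matrix, zeros on the diagonal
inv_eucli_dis = 1./eucli_dis;          % inverse distances; the diagonal becomes Inf
for ii = 1:N
    text(M(ii,1),M(ii,2),num2str(ii)); % label each point with its index
end
gscatter(M(:,1),M(:,2))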

6 Comments

Walter Roberson on 13 Jan 2022

What is the code for your custom distance function?

What is your code for your call to kmedoids ?

Emanuele Gandolfi on 14 Jan 2022

Edited: Walter Roberson on 14 Jan 2022

The code for the custom distance function:

function inv_eucli_dis = customeucli(ZI,ZJ)
% Distance function in the form kmedoids expects:
% ZI is a single point (1-by-n), ZJ is a set of points (m-by-n),
% and the output is an m-by-1 column of distances from ZI to each row of ZJ.
eucli_dis = 1 + sqrt(sum(bsxfun(@minus,ZJ,ZI).^2,2)); % 1 + Euclidean distance
inv_eucli_dis = 1./eucli_dis; % element-wise inverse distance
end

The code for kmedoids:

[idx,C,sumd,D] = kmedoids(M,num_medoids,'Distance',@customeucli);

Answer 1

Image Analyst on 14 Jan 2022

0 votes

Here is code that randomly lays down points, then draws a thin black line between each pair of points that are far away from each other and a thicker red line between each pair that are close together:

N = 10;
x = rand(N,1)*5;
y = rand(N,1)*2.5;
plot(x, y, 'b.', 'MarkerSize', 30);
grid on;
xy = [x(:), y(:)];
distances = pdist2(xy, xy);
% Zero out lower triangle because it's a repeat of the upper triangle
distances = triu(distances);
nonZeroIndexes = distances > 0;
medianDistance = median(distances(nonZeroIndexes));
thresholdValue = medianDistance/2; % Whatever you want.
% Find pairs that are far apart.
[rows, columns] = find(distances > thresholdValue);
hold on;
% Plot pairs that are far apart.
for k = 1 : length(rows)
    index1 = columns(k);
    index2 = rows(k);
    xp = [x(index1), x(index2)];
    yp = [y(index1), y(index2)];
    plot(xp, yp, 'k-', 'LineWidth', 1, 'MarkerSize', 30);
end
% Find pairs that are close together.
[rows2, columns2] = find((distances > 0) & (distances <= thresholdValue));
hold on;
% Plot pairs that are close together.
for k = 1 : length(rows2)
    index1 = columns2(k);
    index2 = rows2(k);
    xp = [x(index1), x(index2)];
    yp = [y(index1), y(index2)];
    plot(xp, yp, 'r-', 'LineWidth', 2, 'MarkerSize', 30);
end
title('Black lines are far away, red lines are close')

7 Comments

Emanuele Gandolfi on 15 Jan 2022

Yes, I've seen them. However, I wanted to know for sure if it could also be solved with kmeans by using a custom distance function or by passing it an inverse distance matrix. That's all.

Walter Roberson on 15 Jan 2022

@Emanuele Gandolfi

I wanted to know for sure if it could also be solved with kmeans by using a custom distance function or by passing it an inverse distance matrix.

No, it cannot be solved that way "for sure". Unless there are exactly two choices in two dimensions, the voting paradox shows that no algorithm can reliably generate optimal outcomes for all nodes.
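
A side note on the distance-matrix route: as far as I know, kmeans and kmedoids both take the raw data points rather than a precomputed distance matrix. One possible workaround is hierarchical clustering, where linkage accepts a pdist-style dissimilarity vector. The sketch below is a swapped-in technique, not kmeans; the 1./(1+d) transform, the 'average' linkage method, and k = 2 are assumptions made only for illustration.

```matlab
rng(0);                                % fixed seed, only so the run is repeatable
N = 10;
M = [rand(N,1)*5, rand(N,1)*2.5];      % N users in a 5-by-2.5 rectangle
k = 2;                                 % hypothetical number of groups

d    = pdist(M);                       % pairwise Euclidean distances, pdist vector form
invd = 1 ./ (1 + d);                   % assumed transform: far apart -> small dissimilarity
Z    = linkage(invd, 'average');       % linkage takes a pdist-style vector directly
idx  = cluster(Z, 'maxclust', k);      % cut the tree into at most k groups
```

With this transform, points that are far apart get a small dissimilarity and so tend to land in the same group, which is what the question asks for. Whether such groups are meaningful for the project is a separate judgment call.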

Answer 2

Image Analyst on 13 Jan 2022

0 votes

Not sure why you think there are clusters. I'd just use pdist2() and then threshold to find points that are farther apart than some distance. Something like:

xy = [x(:), y(:)];
distances = pdist2(xy, xy);
thresholdValue = .3; % Whatever you want.
[rows, columns] = find(distances > thresholdValue);
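
Because pdist2(xy, xy) returns a symmetric matrix, the find call above reports every far-apart pair twice, once as (i,j) and once as (j,i). A minimal sketch of one way to list each pair once, and to pick out the single farthest pair, is given below. The random test points and the variable names farMask, pairs, and maxD are made up for the example.

```matlab
rng(1);                                   % fixed seed, only so the run is repeatable
x = rand(10,1)*5;
y = rand(10,1)*2.5;
xy = [x(:), y(:)];
distances = pdist2(xy, xy);
thresholdValue = 0.3;                     % whatever you want

% Keep only the strict upper triangle so each pair (i,j) with i < j appears once
farMask = triu(distances > thresholdValue, 1);
[rows, columns] = find(farMask);
pairs = [rows, columns];                  % one row per far-apart pair

% The single farthest pair of points
[maxD, linIdx] = max(distances(:));
[i, j] = ind2sub(size(distances), linIdx);
```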

Answer 3

Image Analyst on 14 Jan 2022

Edited: Image Analyst on 14 Jan 2022

0 votes

dbscan_demo.m

You might also consider SVM. It tries to find a dividing line between two groups such that the gap between the two groups is widest so the two groups are farthest apart. See

https://en.wikipedia.org/wiki/Support-vector_machine

or use dbscan (demo attached):

https://en.wikipedia.org/wiki/DBSCAN

It tries to find all points that can be connected with a distance less than what you specify.

A point that has at least a certain number of neighbors within that distance is called a "core point". A point can also be part of the cluster without being a core point, as long as it lies within that distance of a core point. So for example, a dumbbell shape could have core points in the ends, connected points in the middle, and the whole thing would be one single cluster. An isolated point that is not closer to any other point than your specified distance is not a core point (like point N in the diagram described below).

In this diagram, minPts = 4. Point A and the other red points are core points, because the area surrounding these points in an ε radius contains at least 4 points (including the point itself). Because they are all reachable from one another, they form a single cluster. Points B and C are not core points, but are reachable from A (via other core points) and thus belong to the cluster as well. Point N is a noise point that is neither a core point nor directly reachable. You can consider it as essentially being in a cluster all by itself.
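
The description above can be tried out with the dbscan function from the Statistics and Machine Learning Toolbox. The following is only a sketch on made-up data: the two blobs, the far-away point, and epsilon = 0.5 are assumptions, while minpts = 4 mirrors the minPts value in the description.

```matlab
rng(2);                                   % fixed seed, only so the run is repeatable
% Two tight blobs plus one isolated point far from both
X = [randn(30,2)*0.2; randn(30,2)*0.2 + 3; 10 10];
epsilon = 0.5;                            % hypothetical neighborhood radius
minpts  = 4;                              % minimum neighbors (including the point) for a core point

idx = dbscan(X, epsilon, minpts);         % cluster label per row; -1 marks noise
numClusters = numel(unique(idx(idx > 0)));
isNoise = (idx == -1);
```

The isolated point at (10, 10) has no neighbors within epsilon, so it comes back labeled -1, which is the "point N" case. As far as I recall, dbscan can also be given a precomputed pairwise distance matrix via the 'Distance','precomputed' option, which may be relevant to the distance-matrix approach in the question; check the documentation for your release.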
