How to apply nearest neighbor search using the rangesearch function when X and Y data have different ranges/scales

Hi, I would like to find nearest neighbors with the rangesearch function, using Euclidean distance, for data whose X and Y columns have very different ranges/scales.
For instance, X ranges from 1.5 to 5.5, while Y ranges from 40 to 100.
This is my data:
fData = [ 3.6 79; 1.8 54; 3.333 74 ;2.283 62; 4.533 85; 2.883 55; 4.7 88 ;3.6 85 ;1.95 51 ;4.35 85 ;1.833 54; 3.917 84; 4.2 78; 1.75 47; 4.7 83; 2.167 52; 1.75 62; 4.8 84; 1.6 52; 4.25 79; 1.8 51; 1.75 47; 3.45 78; 3.067 69; 4.533 74; 3.6 83; 1.967 55; 4.083 76; 3.85 78; 4.433 79; 4.3 73; 4.467 77; 3.367 66; 4.033 80; 3.833 74; 2.017 52; 1.867 48; 4.833 80; 1.833 59; 4.783 90 ]
If I apply min-max or z-score normalization and pass the normalized data to the rangesearch function, how do I get back my original data to plot it in a graph?
If I ignore the difference in range, the Euclidean distance is dominated by Y and I get inaccurate nearest data points. I would appreciate it if anyone could give me some ideas.

2 Comments

You have not defined which distance function you want to use.
Hi Walter Roberson, I want to use the Euclidean distance. Thanks.


 Accepted Answer

rangesearch returns indices. You can use them to index the original data.
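As a minimal sketch of this idea (not the asker's exact script): search in z-score space, then reuse the returned row indices on the matching subset of the original data, so the plot stays in original units. The names qIdx, isTrain, zTrain, fTrain, zQuery and fQuery are illustrative, the query rows are fixed rather than random, and only the first 10 rows of the posted data are used. It assumes the Statistics and Machine Learning Toolbox (zscore, rangesearch).

```matlab
% First 10 rows of the data from the question
fData = [3.6 79; 1.8 54; 3.333 74; 2.283 62; 4.533 85; ...
         2.883 55; 4.7 88; 3.6 85; 1.95 51; 4.35 85];
n = size(fData, 1);
r = 0.5;                                 % search radius in z-score units

[zData, mu, sd] = zscore(fData);         % column-wise z-scores

qIdx = [1 2];                            % query rows (fixed here for repeatability)
isTrain = ~ismember(1:n, qIdx);

zTrain = zData(isTrain, :);   fTrain = fData(isTrain, :);
zQuery = zData(qIdx, :);      fQuery = fData(qIdx, :);

% Search in normalized space; idx{j} holds row indices into zTrain
idx = rangesearch(zTrain, zQuery, r, 'Distance', 'euclidean');

% Plot everything in original units
figure; hold on;
plot(fData(:,1), fData(:,2), '.k');
plot(fQuery(:,1), fQuery(:,2), '*r');
for j = 1:numel(idx)
    nb = fTrain(idx{j}, :);              % same row indices, original units
    plot(nb(:,1), nb(:,2), 'ob');
    % A circle of radius r in z-space is an ellipse in original units
    rectangle('Position', [fQuery(j,:) - r*sd, 2*r*sd], 'Curvature', [1 1]);
end
xlabel('Eruption time (min)');
ylabel('Waiting time to next eruption (min)');
```

The key point is that zTrain and fTrain are built with the same logical mask, so row k of one corresponds to row k of the other. Indexing the full fData with idx{j} would select the wrong rows once the query points have been removed.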

4 Comments

That is indeed true. But when I plot the result, I cannot plot it in the original X and Y units: the data shown on the X and Y axes are the normalized data.
This is my code:
fData = [ 3.6 79; 1.8 54; 3.333 74 ;2.283 62; 4.533 85; 2.883 55; 4.7 88 ;3.6 85 ;1.95 51 ;4.35 85 ;1.833 54; 3.917 84; 4.2 78; 1.75 47; 4.7 83; 2.167 52; 1.75 62; 4.8 84; 1.6 52; 4.25 79; 1.8 51; 1.75 47; 3.45 78; 3.067 69; 4.533 74; 3.6 83; 1.967 55; 4.083 76; 3.85 78; 4.433 79; 4.3 73; 4.467 77; 3.367 66; 4.033 80; 3.833 74; 2.017 52; 1.867 48; 4.833 80; 1.833 59; 4.783 90 ]
[n,dim]=size(fData);
r = 0.5;
[zData, mean_array, sd_array] = zscore_rescale(fData)
idx = randsample(n,2)
X = zData(~ismember(1:n,idx),:); % Training data
Y = zData(idx,:)
figure('Name','Result');
plot(zData(:,1),zData(:,2),'.k');
hold on;
plot(Y(:,1),Y(:,2),'*r');
xlabel('Eruption time (min)') % x-axis label
ylabel('Waiting time to next eruption (min)') % y-axis label
title('Faithful data: Eruptions of Old Faithful')
%[idrx1, dist1] = rangesearch(X,Y,r,'Distance','mahalanobis');
[idrx1, dist1] = rangesearch(X,Y,r,'Distance','euclidean');
disp('idrx1')
disp(idrx1)
disp('X([idrx1{j}],1)')
disp(X([idrx1{1}],:))
disp(X([idrx1{2}],:))
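% idrx1 indexes rows of X (the training subset), so index the matching subset of fData
fTrain = fData(~ismember(1:n,idx),:);
disp('fTrain([idrx1{j}],:)')
disp(fTrain([idrx1{1}],:))
disp(fTrain([idrx1{2}],:))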
for j = 1:size(Y,1) % one iteration per query point
if ~isempty(idrx1{j})
plot(X([idrx1{j}],1),X([idrx1{j}],2),'*b','MarkerSize',5,'MarkerFaceColor','b','DisplayName','Imputed Data');
end
end
for j = 1:size(Y,1)
c = Y(j,:);
pos = [c-r (2*r) (2*r)]; % bounding box of the radius-r circle around the query point
rectangle('Position',pos,'Curvature',[1 1])
end
This is the code for scale function:
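function [zscoredData, mean_array, sd_array] = zscore_rescale(data, flag, mean_array, sd_array, dim)
% Compute the column-wise mean and standard deviation when they are not supplied.
% (flag and dim are not used in this version.)
if nargin < 3 || isempty(mean_array), mean_array = mean(data, 1); end
if nargin < 4 || isempty(sd_array), sd_array = std(data, 0, 1); end
tmp = bsxfun(@minus, data, mean_array);
zscoredData = bsxfun(@rdivide, tmp, sd_array);
end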
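Since the scaling function returns the mean and standard deviation of each column, the z-score transform can also be inverted directly. The following is a small sketch under that assumption, using the first five rows of the posted data; the variable names recovered and maxErr are illustrative.

```matlab
% First 5 rows of the data from the question
fData = [3.6 79; 1.8 54; 3.333 74; 2.283 62; 4.533 85];

% Forward transform: z = (x - mean) ./ sd, column by column
mean_array = mean(fData, 1);
sd_array   = std(fData, 0, 1);
zData = bsxfun(@rdivide, bsxfun(@minus, fData, mean_array), sd_array);

% Inverse transform: x = z .* sd + mean
recovered = bsxfun(@plus, bsxfun(@times, zData, sd_array), mean_array);

% Largest round-trip error; should be at floating-point noise level
maxErr = max(abs(recovered(:) - fData(:)));
```

The same inverse applies to any point expressed in z-score units, for example a query point, as long as the mean_array and sd_array used are the ones from the forward transform.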
When you do
plot(X([idrx1{j}],1),X([idrx1{j}],2),'*b','MarkerSize',5,'MarkerFaceColor','b','DisplayName','Imputed Data');
it is not obvious to me why the second argument is X rather than Y?
Hi Walter, sorry for the late response. In my code, Y holds the target / center data points.


Asked: amj on 18 Mar 2018

Commented: amj on 19 Mar 2018
