How to classify using cosine distance

Hello! I am a beginner in MATLAB.
I have a dataset consisting of 90 samples (10 labels x 9 samples each).
How can I build a classifier based on cosine distance, Euclidean distance, etc.?

2 Comments

Can you show an example of your dataset? For example, attach a small dataset and describe your expected output.
Hello.
I attached the file. The dataset is 120 x 2353 (column 2353 is the label, 0~6).
I want to compute the cosine or Euclidean distance between rows and classify based on the result.
Thank you!


 Accepted Answer

If you want to classify a new vector using the Euclidean or cosine distance between the rows of your matrix and the new vector, then try this:
data = readmatrix('geo01_KTH.csv');
predictors = data(:, 1:end-1);
labels = data(:, end);
predictors = normalize(predictors, 2, 'range'); % normalize each row to be in range 0-1
x = rand(1, 2352); % generate a random vector
euclidean_dist = pdist2(predictors, x, 'euclidean');
cosine_dist = pdist2(predictors, x, 'cosine');
[~, euclidean_index] = min(euclidean_dist);
[~, cosine_index] = min(cosine_dist);
euclidean_prediction = labels(euclidean_index);
cosine_prediction = labels(cosine_index);
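For reference, the 'cosine' option of pdist2 is one minus the cosine of the angle between the two vectors, so a smaller value means the vectors point in more similar directions. A minimal sketch that checks this by hand, assuming the Statistics and Machine Learning Toolbox is installed (the vectors a and b are made-up values, not taken from the attached file):

```matlab
% Two small example vectors (hypothetical values, for illustration only)
a = [1 2 3];
b = [2 4 6.5];

% Cosine distance computed by hand: 1 - cos(angle between a and b)
d_manual = 1 - dot(a, b) / (norm(a) * norm(b));

% Same quantity from pdist2 (Statistics and Machine Learning Toolbox)
d_pdist2 = pdist2(a, b, 'cosine');

fprintf('manual: %.6f, pdist2: %.6f\n', d_manual, d_pdist2);
```

Note that multiplying either vector by a positive constant leaves the cosine distance unchanged, while the Euclidean distance does change.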

11 Comments

Thank you so much!!!
Can I just use the rows of my matrix using 5 fold cross-validation?
And I want to calculate accuracy of classification.
I am not sure how you would calculate accuracy or do cross-validation with this method. You would need separate training and test datasets and a trained model. Since no model has been trained in this case, I am not aware of how to calculate these things.
Thank you for the reply.!
I mean, I want to divide my matrix into 5 folds (for example, fold 1: test, folds 2~5: train) and calculate the classification accuracy.
If I use several new vectors, how can I modify your code?
You can divide the matrix into two parts according to your requirement. For example,
data = readmatrix('geo01_KTH.csv');
predictors = data(:, 1:end-1);
labels = data(:, end);
predictors = normalize(predictors, 2, 'range');
predictors_train = predictors(1:90, :); % rows 1 to 90
predictors_test = predictors(91:120, :); % rows 91 to 120
labels_train = labels(1:90, :); % rows 1 to 90
labels_test = labels(91:120, :); % rows 91 to 120
Then you can use a for loop to check several new vectors against the training or testing matrices.
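One way the split above could be turned into an accuracy figure is sketched below. It is only an illustration: the random predictors and labels are stand-ins for the contents of the CSV file, and pdist2 is called once on the whole test matrix instead of inside a loop. If the real file is sorted by label, taking the first 90 rows for training may leave some classes out of the training set, so shuffling the rows first may be needed.

```matlab
% Synthetic stand-in for the real data (assumption: 120 rows, labels 0 to 6)
rng(0);
predictors = rand(120, 20);
labels = randi([0 6], 120, 1);

% Same split as in the thread: rows 1 to 90 train, rows 91 to 120 test
predictors_train = predictors(1:90, :);
predictors_test  = predictors(91:120, :);
labels_train = labels(1:90);
labels_test  = labels(91:120);

% dist(i, j) is the distance from training row i to test row j (90 x 30)
dist = pdist2(predictors_train, predictors_test, 'euclidean');

% For each test row (column of dist), find the nearest training row
[~, nearest] = min(dist, [], 1);

% Predicted label = label of the nearest training row (30 x 1)
predictions = labels_train(nearest(:));

% Accuracy = fraction of test rows whose prediction matches the true label
accuracy = mean(predictions == labels_test);
fprintf('Hold-out accuracy: %.2f\n', accuracy);
```

Replacing 'euclidean' with 'cosine' gives the cosine-distance version of the same nearest-neighbour rule.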
Hello.
Could you give me an idea of how to use 5-fold cross-validation and calculate the accuracy?
Hello.
I want to know how to use a for loop to calculate the accuracy.
Can I write the for loop like this?
for i = 1:80
euclidean_dist = pdist2(predictors_train, x(i,:), 'euclidean');
[~, euclidean_index] = min(euclidean_dist);
euclidean_prediction = labels(euclidean_index);
end
Kong, yes, you can write the for loop like this to compute the prediction for each row of x, but I am not sure how to do cross-validation.
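For the 5-fold question, one possible approach is sketched here. It assumes cvpartition from the Statistics and Machine Learning Toolbox is available, and it uses random data in place of the CSV file, so the printed numbers mean nothing by themselves. Each fold takes its turn as the test set while the other four folds act as the reference rows for the nearest-neighbour lookup.

```matlab
% Synthetic stand-in for the real data (assumption: 120 rows, labels 0 to 6)
rng(1);
predictors = rand(120, 20);
labels = randi([0 6], 120, 1);

k = 5;
c = cvpartition(labels, 'KFold', k);   % folds stratified by label
fold_accuracy = zeros(k, 1);

for i = 1:k
    train_idx = training(c, i);        % logical index of training rows
    test_idx  = test(c, i);            % logical index of test rows
    labels_train = labels(train_idx);

    % Distance from every training row to every test row in this fold
    dist = pdist2(predictors(train_idx, :), predictors(test_idx, :), 'cosine');

    % Nearest training row for each test row, then look up its label
    [~, nearest] = min(dist, [], 1);
    predictions = labels_train(nearest(:));

    fold_accuracy(i) = mean(predictions == labels(test_idx));
end

mean_accuracy = mean(fold_accuracy);
fprintf('Mean 5-fold accuracy: %.2f\n', mean_accuracy);
```

The per-fold values in fold_accuracy show how much the result varies between splits; their mean is the usual single number to report.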
Thank you.
I got this error. Could you explain how to fix it?
What is the size of predictors_train and x?
I am sorry, I was mistaken.
predictors_train: 80 x 2856, predictors_test: 10 x 2856.
When I modified the code as below, I got this value.
How can I compare these predictions with the real labels to calculate accuracy?
for i = 1:10
euclidean_dist{i} = pdist2(predictors_train, predictors_test(i,:), 'euclidean');
[~, euclidean_index{i}] = min(euclidean_dist{i});
euclidean_prediction{i} = labels(euclidean_index{i});
end
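A possible way to compare the loop's output with the true labels is sketched below. The cell array and labels_test values are invented for illustration. Note also that indexing labels with a row index into predictors_train, as the loop does, is only correct when the training rows are the first rows of the full matrix; indexing the training labels directly avoids that assumption.

```matlab
% Hypothetical stand-ins for the loop's output and the true test labels
euclidean_prediction = {3, 0, 5, 5, 1, 2, 6, 4, 0, 3};
labels_test = [3; 0; 5; 4; 1; 2; 6; 4; 1; 3];

% Convert the cell array of scalar predictions into a 10 x 1 numeric vector
predicted = cell2mat(euclidean_prediction(:));

% Element-wise comparison: true where the prediction matches the real label
is_correct = predicted == labels_test;

% Accuracy = number of correct predictions / number of test rows
accuracy = mean(is_correct);
fprintf('Accuracy: %.1f%% (%d of %d correct)\n', ...
    100 * accuracy, sum(is_correct), numel(labels_test));
```

Storing the predictions in a preallocated numeric vector inside the loop, for example prediction(i) = ..., would make the cell2mat step unnecessary.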


Asked: 13 Mar 2020
Commented: 17 Mar 2020
