# How to create a confusion matrix.

sreelekshmi ms on 9 Mar 2020
Commented: sreelekshmi ms on 10 Mar 2020
How do I create a confusion matrix for a clustering result? How can I get the predicted class and the actual class for it? And how can I get the TP, TN, FP and FN values from it? I am confused, please help me.
##### 2 Comments
sreelekshmi ms on 9 Mar 2020
I also used confusionmat(). How can I get the TP, TN, FP and FN values in MATLAB?

Benjamin Großmann on 9 Mar 2020
Let's use the CIFAR-10 demo included in MATLAB for your question:
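clearvars
close all
clc
% Load the demo labels (assumed here: the Cifar10Labels.mat sample file used in the MATLAB confusion-matrix examples)
load('Cifar10Labels.mat','trueLabels','predictedLabels');
[m,order] = confusionmat(trueLabels,predictedLabels);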
figure
cm = confusionchart(m,order);
Please look at the confusion chart and consider the following for one particular class c:
• The diagonal element of class c is the number of TP
• Everything inside the predicted class column c except the diagonal element is falsely predicted as class c --> FP
• Everything inside the true class row c except the diagonal element is of class c but not predicted as c --> FN
• Everything outside row c and column c is neither of class c nor predicted as c --> TN
Now we can easily put this into code by summing up the row, the column and all elements, and subtracting. Let's pick a class, e.g. c=2 "automobile", and calculate TP, FP, FN, TN for that class:
c = 2;
TP = cm.NormalizedValues(c,c) % true class is c and predicted as c
FP = sum(cm.NormalizedValues(:,c))-TP % predicted as c, true class is not c
FN = sum(cm.NormalizedValues(c,:))-TP % true class is c, not predicted as c
TN = sum(cm.NormalizedValues(:))-TP-FP-FN % true class is not c, not predicted as c
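The same row, column and diagonal bookkeeping can be done for all classes at once. The following is only a minimal sketch: the 3-by-3 matrix m is a made-up example (rows = true class, columns = predicted class), not the CIFAR-10 result, and in practice m would come from confusionmat.

```matlab
% Hypothetical confusion matrix, for illustration only
% (rows = true class, columns = predicted class)
m = [5 1 0;
     2 6 1;
     0 2 8];

TP = diag(m);                     % correctly predicted, one entry per class
FP = sum(m,1)' - TP;              % column sums minus the diagonal
FN = sum(m,2)  - TP;              % row sums minus the diagonal
TN = sum(m(:)) - TP - FP - FN;    % everything outside row c and column c

% One row per class, columns: TP FP FN TN
disp([TP FP FN TN])
```

Each class is treated as "class c versus the rest", so TP + FP + FN + TN equals the total number of samples for every class.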
sreelekshmi ms on 10 Mar 2020
Sir, I tried this on a different data set and got an error like:
" Index in position 1 exceeds array bounds (must not exceed 2).
Error in shpa (line 68)
TP = cm.NormalizedValues(c,c) ; "
clc;
clear;
minpts=6;
epsilon=2;
[idx, corepts] = dbscan(data,epsilon,minpts);
nearEnough = 0.02;
x = data(:,1);
y = data(:,2);
indexesToKeep = false(1, length(x));
for k = 1 : length(x)
distances = sqrt((x(k) - x).^2 + (y(k) - y).^2);
if sum(distances > nearEnough) >= 5
indexesToKeep(k) = true;
end
end
x = x(indexesToKeep);
y = y(indexesToKeep);
P=[x y];
dist2 = (data(:,1) - P(:,1).').^2 + (data(:,2) - P(:,2).').^2;
[~,id] = mink(distances,20,1);
clusters = data(id);
maximum_num_clusters = 7;
id= cluster(Z, 'Maxclust', maximum_num_clusters);
figure()
dendrogram(Z)
uni=length(Z);
outl=rmoutliers(Z);
I=data(:,4);
X= -ones(748,1);
B=[-ones(length(I) - length(id),1);id];
Si=[id;X];
[m,order] = confusionmat(I,idx);
figure
cm = confusionchart(m,order);
c = 3;
TP = cm.NormalizedValues(c,c) ;
FP = sum(cm.NormalizedValues(:,c))-TP ;
FN = sum(cm.NormalizedValues(c,:))-TP ;
TN = sum(diag(cm.NormalizedValues))-TP;
A=(TP+TN)/(TP+TN+FP+FN)*100;
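The error message says the confusion matrix has only 2 rows, so c = 3 points outside it. One way to avoid this, sketched here under assumptions, is to look the class up in order (the second output of confusionmat) instead of hard-coding an index. The label vectors below are made up for illustration and are not the data from this thread; confusionmat requires the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical labels, for illustration only
trueLabels      = [1 1 1 2 2 2 2 1]';
predictedLabels = [1 1 2 2 2 2 1 1]';

[m, order] = confusionmat(trueLabels, predictedLabels);

for label = [2 3]
    c = find(order == label);     % row/column index of this label, if present
    if isempty(c)
        fprintf('Label %d does not occur; m is only %d-by-%d.\n', ...
                label, size(m,1), size(m,2));
        continue
    end
    TP = m(c,c);
    FP = sum(m(:,c)) - TP;
    FN = sum(m(c,:)) - TP;
    TN = sum(m(:)) - TP - FP - FN;
    fprintf('Label %d: TP=%d FP=%d FN=%d TN=%d\n', label, TP, FP, FN, TN);
end
```

Note that cluster indices from a clustering routine are arbitrary numbers and need not match the true class labels, so it is worth checking order and size(m) before indexing into the matrix.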