Calculating accuracy with permuted classes

I'm writing a clustering script that assigns each data point a class from 1 to 3. I also have the true classes of the data, which are also 1 to 3. The class labels are meaningless in themselves, and the numbering of the output classes depends on the initialization, which is random. So, for example, I might get an output class vector:
3333111222
when the true class labels are
1111222333
I want to calculate the accuracy of my program, and repeat this many times (so I can't just look and immediately see what the permutation is). So the accuracy of the above example would be 100%. Or this example below would be 90%:
Output: 2221333111
True labels: 1111222333
Any ideas how to do this? Thanks!

Answers (1)

In the case where the accuracy is 100%, notice:
>> [U,~,uidx] = unique('3333111222','stable')
U =
'312'
uidx =
1
1
1
1
2
2
2
3
3
3
>> [U2,~,uidx2] = unique('1111222333','stable')
U2 =
'123'
uidx2 =
1
1
1
1
2
2
2
3
3
3
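Now you can compare uidx and uidx2 element by element for equality.
This does not require 100% accuracy, but it does require that the labels first appear in the same order in both vectors. For example, '3133111222' would not be a problem, but '2333111222' would be: its first element is the misassigned point, so every label after it is renumbered against the wrong class.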
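As a minimal sketch of that comparison, the snippet below relabels each vector by order of first appearance and takes the fraction of matching positions as the accuracy. The variable names (trueLabels, outputs, acc) are made up for illustration, and the three test vectors are the ones mentioned above.

```matlab
% Sketch: accuracy via first-appearance relabeling (hypothetical variable names).
trueLabels = '1111222333';
outputs    = {'3333111222', '3133111222', '2333111222'};

% Relabel the true classes by order of first appearance.
[~, ~, trueIdx] = unique(trueLabels, 'stable');

acc = zeros(numel(outputs), 1);
for k = 1:numel(outputs)
    % Relabel the clustering output the same way.
    [~, ~, outIdx] = unique(outputs{k}, 'stable');
    % Fraction of positions where the relabeled vectors agree.
    acc(k) = mean(outIdx == trueIdx);
end
disp(acc)
```

For these inputs acc comes out as 1.0, 0.9 and 0.1. The last value shows the failure mode: '2333111222' is 90% correct under the best relabeling, but the first-appearance order is thrown off by its first element.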
I will keep thinking about better solutions.

1 Comment

Thanks, this is helpful. What do you think about this approach? Say c is the output classes from clustering and C is the true classes:
conf=confusionmat(C,c)
tot=sum(sum(conf))
correct=sum(max(conf,[],2))
accuracy(i,1)=correct/tot
This works, I think, as long as I get more right than wrong in each class.
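One way to guard against the case where two true classes pick the same output label is to search over all one-to-one relabelings and keep the best. The sketch below is only an illustration under assumptions: labels are integers 1..K, the number of clusters equals the number of classes, and the count matrix is built with accumarray (for such labels it should match confusionmat(C,c), without needing the toolbox). The names K, P, bestAccuracy and rowMaxAccuracy are hypothetical.

```matlab
% Sketch: best accuracy over all relabelings of the output classes.
C = [1 1 1 1 2 2 2 3 3 3];   % true classes
c = [2 2 2 1 3 3 3 1 1 1];   % clustering output (the 90% example)

K = 3;                        % assumed: same number of classes and clusters
% conf(i,j) = number of points with true class i that received output label j.
conf = accumarray([C(:) c(:)], 1, [K K]);

P = perms(1:K);               % row r maps true class i to output label P(r,i)
best = 0;
for r = 1:size(P, 1)
    idx  = sub2ind([K K], 1:K, P(r, :));
    best = max(best, sum(conf(idx)));   % points counted correct under this mapping
end
bestAccuracy   = best / numel(C);

% The row-maximum shortcut from the comment, for comparison.
rowMaxAccuracy = sum(max(conf, [], 2)) / numel(C);
```

Here both values are 0.9. In general the row-maximum value can only be greater than or equal to the permutation-based one, and the two agree whenever each true class picks a different output label. The loop visits K! permutations, which is fine for 3 classes but grows quickly with K.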

Asked: 26 Nov 2017
Commented: 26 Nov 2017