Identifying multiple reject class threshold values for classification in Matlab

Hi all,
I have a total of 11 classes [A,B,C,D,E,F,G,H,I,J,K], of which 7 are used in training, i.e. [A,B,C,D,E,F,G]. The remaining 4, i.e. [H,I,J,K], are not used in training, so they are treated as reject classes when they appear in testing.
Now I need to find threshold values in Matlab that will help me identify the reject classes. I have studied the perfcurve function, but I don't know how to determine the threshold values for all 4 reject classes.
Please guide me on how to do this.

Answers (1)

If you trained with 7 classes, then your data will be classified as one of those 7 classes, or as an unknown class if you allow that. Since you didn't train with the other 4 classes, the model you trained knows nothing whatsoever about those 4 "reject" classes, including any thresholds for them. If you have any additional information about the 4 classes, it would be up to you to define attributes for them, like thresholds or whatever. Your model cannot do that.
For example, let's say you threshold on children's ages and say 6 year olds are in grade (class) 1, 7 year olds are in grade 2, ... and 12 year olds are in grade 7. Now you also have grades 8, 10, 11, and 12, and you ask which class a 14 year old belongs to. There's no way to know because you haven't specified that.
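To make the age example concrete, here is a minimal sketch in base MATLAB. The variable names and the use of NaN to mean "unknown" are assumptions made for illustration only; the point is that anything outside the trained ages can only be reported as unknown, not assigned to a specific untrained grade.

```matlab
% Hypothetical illustration of the age/grade example with an explicit
% "unknown" outcome for ages the rule was never defined for.
trainedAges   = 6:12;          % ages covered by the rule
trainedGrades = 1:7;           % grade assigned to each covered age
queryAges     = [6 9 12 14];   % 14 was never covered

grades = nan(size(queryAges)); % NaN marks "unknown / rejected"
[isKnown, idx] = ismember(queryAges, trainedAges);
grades(isKnown) = trainedGrades(idx(isKnown));

% grades is now [1 4 7 NaN]: the 14 year old is reported as unknown
disp(grades)
```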

2 Comments

Thanks for your reply. Your example helped me understand the concept.
As in your example, if somebody asks about the category of a 14 year old child, the system should say that it doesn't know. That sample belongs to the rejected region.
My current system computes a similarity measure between the tested instance and each of the classes used in training. So at the moment it always reports that the tested sample belongs to one of the training classes, namely the one with the highest similarity measure.
Now I want to improve it by implementing a reject class threshold, so that when the similarity measure falls below the threshold value, the system reports that the tested sample belongs to the reject class.
So, can you please guide me on how to identify threshold values for the reject classes [H,I,J,K]?
How can I do this in Matlab code? Please let me know.
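The rule described above could be sketched roughly as follows in base MATLAB. Everything here is an assumption for illustration: the similarity values are made up, the variable names are hypothetical, and the threshold is picked by a simple sweep that maximizes balanced accuracy on a small validation set containing samples from both the trained classes and the untrained ones. Because [H,I,J,K] were never modeled, this gives a single accept/reject threshold on the best similarity rather than one threshold per reject class.

```matlab
% Assumed validation data: the maximum similarity obtained for samples from
% trained classes (A..G) and for samples from untrained classes (H..K).
simKnown  = [0.91 0.85 0.78 0.88 0.66 0.93];
simReject = [0.25 0.40 0.31 0.55 0.20];

% Sweep candidate thresholds and keep the one with the best balanced accuracy.
candidates = sort([simKnown simReject]);
bestAcc = -inf;
for t = candidates
    acceptRate = mean(simKnown >= t);     % trained-class samples accepted
    rejectRate = mean(simReject < t);     % untrained-class samples rejected
    acc = (acceptRate + rejectRate) / 2;  % balanced accuracy
    if acc > bestAcc
        bestAcc = acc;
        threshold = t;
    end
end

% Hypothetical test similarities: one row per sample, one column per class A..G.
classNames = {'A','B','C','D','E','F','G'};
S = [0.91 0.10 0.05 0.02 0.08 0.03 0.04;
     0.20 0.15 0.22 0.18 0.25 0.21 0.19;
     0.05 0.07 0.88 0.10 0.02 0.06 0.09];

[bestSim, bestIdx] = max(S, [], 2);        % best similarity per sample
labels = classNames(bestIdx);              % provisional label = most similar class
labels = labels(:);                        % force a column cell array
labels(bestSim < threshold) = {'reject'};  % too dissimilar to every trained class
```

If the Statistics and Machine Learning Toolbox is available, perfcurve applied to the same validation scores returns the candidate thresholds as its third output, which is one alternative to the manual sweep.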
It all depends on how you're doing the classification. If you made a model for linear discriminant analysis using fitcdiscr() and are applying it yourself "manually", then you can say that if the generalized distance is more than some amount, call it unknown. But if you're using kmeans(), knnsearch() or fitcknn(), then I think it forces your unknown to be in one of the classes no matter how far away it is, unless there is an option I'm not aware of. I think it would be good to play around with the Classification Learner app on the Apps tab of the tool ribbon.
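A minimal sketch of the fitcdiscr() idea, assuming the "generalized distance" is taken to be the squared Mahalanobis distance returned by the model's mahal method, and using two synthetic classes in place of the seven real ones. The data, the class names, and the cutoff value are all assumptions chosen for illustration. It requires the Statistics and Machine Learning Toolbox.

```matlab
% Synthetic stand-in for the training data: two well-separated known classes.
rng(0);                                    % reproducible random numbers
Xtrain = [randn(50,2); randn(50,2) + 6];
Ytrain = [repmat({'A'},50,1); repmat({'B'},50,1)];

Mdl = fitcdiscr(Xtrain, Ytrain);           % linear discriminant model

Xtest  = [0 0; 6 6; 30 -30];               % last point is far from both classes
labels = predict(Mdl, Xtest);              % always answers 'A' or 'B'

d2    = mahal(Mdl, Xtest);                 % squared Mahalanobis distance to each class mean
minD2 = min(d2, [], 2);                    % distance to the nearest class

maxD2 = 13.8;                              % assumed cutoff, roughly chi2inv(0.999, 2)
labels(minD2 > maxD2) = {'unknown'};       % too far from every trained class
```

With this data the third test point ends up labeled 'unknown', while the first two keep the labels 'A' and 'B'. In practice the cutoff would have to be tuned on held-out data rather than taken from this example.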

Asked: on 20 Nov 2016

Commented: on 20 Nov 2016
