This function computes the Receiver Operating Characteristic (ROC) curve, which plots sensitivity against 1 - specificity for two classes of data (class_1 and class_2).
The function also returns the associated quantitative parameters: threshold position, distance to the optimum point, sensitivity, specificity, accuracy, area under the curve (AROC), positive and negative predictive values (PPV, NPV), false negative and false positive rates (FNR, FPR), false discovery rate (FDR), false omission rate (FOR), F1 score, Matthews correlation coefficient (MCC), informedness (BM) and markedness, as well as the number of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).
Example of use:
class_1 = 0.5*randn(100,1);
class_2 = 0.5+0.5*randn(100,1);
roc_curve(class_1, class_2);
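To make the listed quantities concrete, here is a minimal sketch of how they can be derived from the four confusion counts at a single threshold. It is not taken from roc_curve itself. It assumes that class_2 is the positive class, that a sample is predicted positive when its value exceeds the threshold T, and that the "optimum point" is the top-left corner of the ROC plot (FPR = 0, sensitivity = 1). The data values, T, and the names MK and dist_opt are illustrative only.

```matlab
% Illustrative data (assumption: class_1 = negatives, class_2 = positives)
class_1 = [-0.4; -0.1; 0.2; 0.3];
class_2 = [ 0.1;  0.4; 0.6; 0.9];
T = 0.25;                          % hypothetical threshold

% Confusion counts (assumption: predicted positive when value > T)
TP = sum(class_2 >  T);            % positives above the threshold
FN = sum(class_2 <= T);            % positives missed
TN = sum(class_1 <= T);            % negatives below the threshold
FP = sum(class_1 >  T);            % negatives wrongly flagged

% Standard definitions of the derived metrics
sensitivity = TP / (TP + FN);
specificity = TN / (TN + FP);
accuracy    = (TP + TN) / (TP + TN + FP + FN);
PPV = TP / (TP + FP);
NPV = TN / (TN + FN);
FNR = 1 - sensitivity;
FPR = 1 - specificity;
FDR = 1 - PPV;
FOR = 1 - NPV;
F1  = 2*TP / (2*TP + FP + FN);
MCC = (TP*TN - FP*FN) / sqrt((TP+FP)*(TP+FN)*(TN+FP)*(TN+FN));
BM  = sensitivity + specificity - 1;   % informedness
MK  = PPV + NPV - 1;                   % markedness

% Distance to the assumed optimum point (FPR = 0, sensitivity = 1)
dist_opt = sqrt(FPR^2 + (1 - sensitivity)^2);
```

A ROC curve repeats this calculation for many values of T and plots sensitivity against FPR.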
Víctor Martínez-Cagigal (2021). ROC Curve (https://www.mathworks.com/matlabcentral/fileexchange/52442-roc-curve), MATLAB Central File Exchange. Retrieved .
Great code, thanks Víctor Martínez-Cagigal
Hi Victor. Excellent code. Saved my life. Thanks a bunch.
Víctor, you saved my day!! I mean it.
Hi @Knut,
First of all, if one class has values that are always above 0 and the other class always has values below 0, then the ROC curve will be perfect (reaching AROC = 1), simply because you can discriminate between the 2 classes perfectly by putting a threshold at T = 0.
With regard to how I compute the thresholds: I take all the values of both classes and merge them. Then I sort them and take the difference between consecutive points. Finally, I add half of each difference to the corresponding point. This gives a threshold midway between every pair of adjacent samples, so there is exactly one threshold between each pair of neighbouring samples.
If you have more questions, please write me an e-mail. And please, if you find this code useful, give me a rating; it helps me a lot. Regards,
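The threshold construction described in the reply above can be sketched as follows. This is a reading of that description, not the actual source of roc_curve. The data values are illustrative, and whether the real function also adds thresholds below the minimum or above the maximum is not stated, so none are added here.

```matlab
% Illustrative data (values are assumptions, not from the original post)
class_1 = [0.2; -0.4; 0.3];
class_2 = [0.6;  0.1];

% 1) Merge both classes and sort the values
s = sort([class_1(:); class_2(:)]);

% 2) Differences between consecutive sorted values
d = diff(s);

% 3) Add half of each difference to the lower value of the pair,
%    which yields the midpoint between adjacent samples
thresholds = s(1:end-1) + d/2;
```

With N samples in total this produces N-1 candidate thresholds, each falling between two neighbouring samples. When the two classes are perfectly separated, one of these midpoints lies between the highest value of the lower class and the lowest value of the higher class, and at that threshold both sensitivity and specificity equal 1.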
Hi Victor
Great function!
I don't understand how each threshold value is calculated.
Threshold is:
(difference between all values divided by 2) + all values
Why is that? Could you point me to something to read up on?
The two distributions I am testing have a pre-specified cut-off at zero, so all cases above zero are positives and all cases below zero are negatives.
The way the thresholds are computed in the function produces some false negatives, although there really are none.
Amazing work, thank you!!
Excellent! Thank you!
Thank you so much for your reply.
Perfect, thank you!
Hi sharar and Keshav,
Results do not change, just your interpretation of them. This is a basic concept of ROC Curves. You can visualize it by inputting:
class_1 = 0.5*randn(100,1);
class_2 = 0.5+0.5*randn(100,1);
figure(1);
roc_curve(class_1, class_2);
figure(2);
roc_curve(class_2, class_1);
The AUC of the first attempt is 0.7407, while the AUC of the second attempt is 0.2593. Are the results different? No, one is just the complement of the other (note that 0.2593 = 1 - 0.7407). That is because you are swapping the positive and negative classes. As a recommendation, if any AUC is less than 0.5, that is a sign that you should swap the classes.
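The complement relationship can be checked on fixed data with a small sketch. It does not use roc_curve. Instead it estimates the AUC with the rank-based (Mann-Whitney) formulation, that is, the fraction of (negative, positive) pairs in which the positive sample has the larger value, with ties counted as one half. The helper name auc_pairs and the data values are hypothetical, and the comparison relies on implicit expansion (MATLAB R2016b or later).

```matlab
% Hypothetical helper (not part of roc_curve): pairwise AUC estimate.
% neg and pos are vectors; ties contribute 0.5.
auc_pairs = @(neg, pos) mean(mean( ...
    (pos(:) > neg(:).') + 0.5*(pos(:) == neg(:).') ));

% Illustrative data (values are assumptions)
class_1 = [-0.4; -0.1; 0.2; 0.3];
class_2 = [ 0.1;  0.4; 0.6; 0.9];

auc_12 = auc_pairs(class_1, class_2);   % class_2 treated as positive
auc_21 = auc_pairs(class_2, class_1);   % classes swapped

fprintf('AUC = %.4f, swapped AUC = %.4f, sum = %.4f\n', ...
        auc_12, auc_21, auc_12 + auc_21);
```

Every pair that counts in favour of one ordering counts against the other, so the two estimates always add up to 1 regardless of the data.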
Hi Victor,
Thank you for your reply. Like Keshav, I still have not understood why the AUC differs when the classes are swapped.
Hi Victor,
Why do I get different results when I swap class_1 and class_2?
Hi @sharar, the script will still work if the classes are not equal. I solved that issue in the last version.
Thank you so much for the code. Just one question: what if the sizes of the two classes are not equal?
Excellent code ! Thank you very much !
THANK YOU
thanks
I adore you
Can you please update this to three classes? or multi class?
nice,tnx
Hi Victor,
I really thank you for this amazing post! thank you!