DeLong's test for AUC

This function computes the AUC of each of two models being compared, along with the DeLong p-value for the difference between their ROC curves.
Updated 7 Sep 2024

The AUC is a reliable indicator of how good a classifier is, whereas accuracy, sensitivity, specificity, etc. may lead to biased conclusions when the class distribution is significantly imbalanced. DeLong's test determines whether there is a statistically significant difference between the areas under the ROC curves (AUCs) of two models being compared. The test is particularly useful when comparing correlated ROC curves, such as those of nested models evaluated on the same test set.
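For readers who want the quantities behind the test, the statistic can be sketched as follows. The notation is chosen here for illustration (m positive and n negative test samples, with X_i^r and Y_j^r the scores that model r assigns to the i-th positive and j-th negative sample) and follows the usual textbook presentation of DeLong's test rather than this submission's code. The two-sided p-value in the last line is an assumption about how the test is typically reported.

```latex
\begin{align*}
\psi(x, y) &= \begin{cases} 1 & x > y \\ \tfrac{1}{2} & x = y \\ 0 & x < y \end{cases},
&
\hat{\theta}^{r} &= \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} \psi\!\left(X_i^{r}, Y_j^{r}\right), \quad r = 1, 2 \\[4pt]
V_{10}^{r}(X_i) &= \frac{1}{n} \sum_{j=1}^{n} \psi\!\left(X_i^{r}, Y_j^{r}\right),
&
V_{01}^{r}(Y_j) &= \frac{1}{m} \sum_{i=1}^{m} \psi\!\left(X_i^{r}, Y_j^{r}\right) \\[4pt]
\left[S_{10}\right]_{rs} &= \frac{1}{m-1} \sum_{i=1}^{m} \left(V_{10}^{r}(X_i) - \hat{\theta}^{r}\right)\left(V_{10}^{s}(X_i) - \hat{\theta}^{s}\right),
&
\left[S_{01}\right]_{rs} &= \frac{1}{n-1} \sum_{j=1}^{n} \left(V_{01}^{r}(Y_j) - \hat{\theta}^{r}\right)\left(V_{01}^{s}(Y_j) - \hat{\theta}^{s}\right) \\[4pt]
S &= \frac{1}{m} S_{10} + \frac{1}{n} S_{01},
&
z &= \frac{\hat{\theta}^{1} - \hat{\theta}^{2}}{\sqrt{S_{11} + S_{22} - 2 S_{12}}}, \qquad
p = 2\left(1 - \Phi\!\left(|z|\right)\right)
\end{align*}
```

The off-diagonal term S_12 is what accounts for the correlation between the two ROC curves when both models are scored on the same test samples.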
NOTE: The implementation is largely unvectorized. It is intended primarily for educational purposes and to give the MATLAB community a quick and easy-to-use implementation of DeLong's test. Read more about DeLong's test in the original paper, which can be accessed here:
You may also want to read more about it here:
Lastly, note that this MATLAB code is a direct implementation of the procedure described in the glassboxmedicine.com article "Comparing AUCs of Machine Learning Models with DeLong's Test" by Rachel Draelos, MD, PhD. Read the article here: https://glassboxmedicine.com/2020/02/04/comparing-aucs-of-machine-learning-models-with-delongs-test/
FUNCTION DETAILS:
[auc,p] = delong(a,b,y);
  • a - Vector containing the predicted probability from the first model for each sample in the test set.
  • b - Vector containing the predicted probability from the second model for each sample in the test set. The variables 'a' and 'b' are interchangeable. By convention, however, 'a' is the probability vector from a model trained and evaluated on L features, while 'b' comes from a model trained on L-k features, i.e. a reduced feature space.
  • y - Vector encoding the true class label of each test sample. Classes should be encoded as 1 for the positive class and 0 for the negative class.
  • auc - Vector containing the AUCs of the two models being compared, arranged as [AUC_model1, AUC_model2].
  • p - p-value of DeLong's test.
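Since the submitted implementation is described as unvectorized, the following is a minimal vectorized sketch of the same interface. It is not the File Exchange code: the name delong_sketch is hypothetical, the two-sided p-value is an assumption, and the comparison of positives against negatives relies on implicit expansion, which assumes MATLAB R2016b or later. It uses erfc for the normal tail so that no toolbox is needed.

```matlab
function [auc, p] = delong_sketch(a, b, y)
% DELONG_SKETCH  Hypothetical vectorized sketch of DeLong's test.
%   a, b : score vectors of model 1 and model 2 (same length as y)
%   y    : true labels, 1 = positive class, 0 = negative class
a = a(:);  b = b(:);  y = logical(y(:));
scores = [a, b];
m = sum(y);    % number of positive samples
n = sum(~y);   % number of negative samples
if m < 2 || n < 2
    error('delong_sketch:tooFewSamples', ...
          'At least two samples per class are needed to estimate the variance.');
end

V10 = zeros(m, 2);   % structural components over positives
V01 = zeros(n, 2);   % structural components over negatives
for r = 1:2
    pos = scores(y, r);    % m-by-1
    neg = scores(~y, r);   % n-by-1
    % m-by-n kernel: 1 if pos > neg, 0.5 on ties, 0 otherwise
    psi = (pos > neg.') + 0.5 * (pos == neg.');
    V10(:, r) = mean(psi, 2);
    V01(:, r) = mean(psi, 1).';
end

auc = mean(V10, 1);                 % [AUC_model1, AUC_model2]
S = cov(V10) / m + cov(V01) / n;    % 2-by-2 covariance of the two AUCs
z = (auc(1) - auc(2)) / sqrt(S(1,1) + S(2,2) - 2 * S(1,2));
p = erfc(abs(z) / sqrt(2));         % two-sided: 2 * (1 - Phi(|z|))
end
```

Saved as delong_sketch.m, it can be called in the same way as the function above, e.g. [auc,p] = delong_sketch(a,b,y). If both models separate the classes perfectly, the estimated variance is zero and z evaluates to 0/0, so this sketch returns p = NaN in that case.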
EXAMPLES:
1. Compute the p-value for the probability vectors a and b produced by two models, given the true class label vector y.
a = [0.2 0.3 0.6 0.9 0.8]; % output probability vector of model 1
b = [0.1 0.15 0.8 0.7 0.4]; % output probability vector of model 2
y = [0 0 1 1 1]; % true class labels
[~,p] = delong(a,b,y);
2. Compute the p-value and AUCs of two SVM models trained on the fisheriris dataset.
% Load the fisheriris dataset
load fisheriris
% We use only two of the species: setosa and versicolor.
% Each class has 50 samples.
X = meas(1:100,:);
% Encode the labels so that 0 = setosa and 1 = versicolor
Y = [zeros(50,1);ones(50,1)];
% Creation of the training set
n = randperm(100,70); % randomly select 70 samples for training
X_tr = X(n,:);
Y_tr = Y(n,:);
% Creation of the test set
X_ts = X;
X_ts(n,:) = [];
Y_ts = Y;
Y_ts(n,:) = [];
% Model 1: training and evaluation using all 4 features
model1 = fitcsvm(X_tr,Y_tr);
% The second output of predict holds the SVM scores; column 2 is the
% score for the positive class (label 1)
[~,prob] = predict(model1,X_ts);
a = logsig(prob(:,2)); % squash the scores into (0,1)
% Model 2: training and evaluation using only the first feature
model2 = fitcsvm(X_tr(:,1),Y_tr);
[~,prob] = predict(model2,X_ts(:,1));
b = logsig(prob(:,2));
% Compute the AUCs and the DeLong p-value
[auc, p] = delong(a,b,Y_ts)
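Once auc and p are available, the result still has to be read against a significance level. The helper below is a small sketch of one way to do that. The name report_delong and the default level of 0.05 are assumptions made for illustration; neither comes from the submission. It also guards against a NaN p-value, which may occur when the variance estimate is zero (for example when both models separate the classes perfectly).

```matlab
function msg = report_delong(auc, p, alpha)
% REPORT_DELONG  Hypothetical helper that summarizes a DeLong test result.
%   auc   : [AUC_model1, AUC_model2]
%   p     : p-value of the test
%   alpha : significance level (assumed default: 0.05)
if nargin < 3
    alpha = 0.05;
end
if isnan(p)
    msg = 'p is NaN: the test is undefined for these inputs.';
elseif p < alpha
    msg = sprintf('AUCs %.3f and %.3f differ significantly (p = %.4f).', ...
                  auc(1), auc(2), p);
else
    msg = sprintf('No significant difference between AUCs %.3f and %.3f (p = %.4f).', ...
                  auc(1), auc(2), p);
end
end
```

For the example above, disp(report_delong(auc, p)) prints a one-line summary. Because the train/test split is random, the outcome can change from run to run unless the random seed is fixed beforehand, e.g. with rng.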

Cite As

Rock Tomas (2025). DeLong's test for AUC (https://www.mathworks.com/matlabcentral/fileexchange/172309-delong-s-test-for-auc), MATLAB Central File Exchange. Retrieved .

MATLAB Release Compatibility
Created with R2024a
Compatible with any release
Platform Compatibility
Windows macOS Linux

Version Published Release Notes
1.0.2

Updated the description.

1.0.1