Compare Lifetime PD Models Using Cross-Validation

This example shows how to compare two lifetime PD models using cross-validation.

Load Data

Load the portfolio data, which includes loan and macro information. This is a simulated data set used for illustration purposes.

load RetailCreditPanelData.mat
data = join(data,dataMacro);
disp(head(data))
    ID    ScoreGroup    YOB    Default    Year     GDP     Market
    __    __________    ___    _______    ____    _____    ______

    1      Low Risk      1        0       1997     2.72      7.61
    1      Low Risk      2        0       1998     3.57     26.24
    1      Low Risk      3        0       1999     2.86      18.1
    1      Low Risk      4        0       2000     2.43      3.19
    1      Low Risk      5        0       2001     1.26    -10.51
    1      Low Risk      6        0       2002    -0.59    -22.95
    1      Low Risk      7        0       2003     0.63      2.78
    1      Low Risk      8        0       2004     1.85      9.48

Cross-Validation

Because the data is panel data, there are multiple rows for each customer. You set up the cross-validation partitions over the customer IDs, not over the rows of the data set. This way, each customer belongs to either the training set or the test set of a fold, and the rows for one customer are never split between training and testing.

uniqueIDs = unique(data.ID);
nIDs = numel(uniqueIDs); % number of customers; does not assume IDs run from 1 to N

NumFolds = 5;
rng('default'); % for reproducibility
c = cvpartition(nIDs,'KFold',NumFolds);
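
The ID-level partitioning described above can be checked on a small toy panel. The following is a minimal sketch, not part of the original example: the customer IDs and the names toyID, toyUniqueIDs, and cToy are hypothetical, and it assumes cvpartition from Statistics and Machine Learning Toolbox is available. It partitions the unique IDs, maps each fold back to row indices with ismember, and asserts that no customer contributes rows to both sets.

```matlab
% Toy panel (hypothetical): 6 customers with non-contiguous IDs, 3 rows each
toyID = repelem([101; 205; 309; 412; 518; 633],3);
toyUniqueIDs = unique(toyID);

rng('default'); % for reproducibility
% Partition the customers (unique IDs), not the rows
cToy = cvpartition(numel(toyUniqueIDs),'KFold',3);

for k = 1:cToy.NumTestSets
   % Logical indices over the unique IDs, converted to row indices
   trainRows = ismember(toyID,toyUniqueIDs(training(cToy,k)));
   testRows = ismember(toyID,toyUniqueIDs(test(cToy,k)));

   % Every row is in exactly one of the two sets
   assert(all(trainRows ~= testRows))
   % No customer ID appears in both training and test rows
   assert(isempty(intersect(toyID(trainRows),toyID(testRows))))
end
```

If the partition were built over rows instead, rows from the same customer could land on both sides of a fold, and the test metrics would likely be optimistic.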

Compare logistic and probit lifetime PD models that use the same variables.

CVModels = ["logistic";"probit"];
NumModels = length(CVModels);

AUROC = zeros(NumFolds,NumModels);
RMSE = zeros(NumFolds,NumModels);

for ii=1:NumFolds
      
   fprintf('Fitting models, fold %d\n',ii);
      
   % Get indices for ID partition
   TrainIDInd = training(c,ii);
   TestIDInd = test(c,ii);
   % Convert to row indices
   TrainDataInd = ismember(data.ID,uniqueIDs(TrainIDInd));
   TestDataInd = ismember(data.ID,uniqueIDs(TestIDInd));
   
   % For each model, fit with training data, measure with test data
   for jj=1:NumModels
      % Fit model with training data
      pdModel = fitLifetimePDModel(data(TrainDataInd,:),CVModels(jj),...
             'IDVar','ID','AgeVar','YOB','LoanVars','ScoreGroup',...
             'MacroVars',{'GDP','Market'},'ResponseVar','Default');
      % Measure discrimination on test data
      DiscMeasure = modelDiscrimination(pdModel,data(TestDataInd,:));
      AUROC(ii,jj) = DiscMeasure.AUROC;
      
      % Measure accuracy on test data, grouping by YOB (age) and score group
      AccMeasure = modelAccuracy(pdModel,data(TestDataInd,:),["YOB" "ScoreGroup"]);
      RMSE(ii,jj) = AccMeasure.RMSE;
   end
end
Fitting models, fold 1
Fitting models, fold 2
Fitting models, fold 3
Fitting models, fold 4
Fitting models, fold 5

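Using the discrimination and accuracy measures for the different folds, you can compare the models. In this example, the metrics for each fold are displayed. You can also compare the mean AUROC or the mean RMSE across folds, or compare the proportion of folds in which one model is superior regarding discrimination or accuracy. The two models in this example are very comparable: the AUROC values differ by less than 0.001 in every fold, and neither model has a consistently lower RMSE (the logistic model has the lower RMSE in folds 1 to 3, the probit model in folds 4 and 5).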

AUROCTable = array2table(AUROC,"RowNames",strcat("Fold ",string(1:NumFolds)),"VariableNames",strcat("AUROC_",CVModels))
AUROCTable=5×2 table
              AUROC_logistic    AUROC_probit
              ______________    ____________

    Fold 1       0.69558           0.6957   
    Fold 2       0.70265          0.70335   
    Fold 3       0.69055          0.69037   
    Fold 4       0.70268          0.70232   
    Fold 5       0.68784          0.68781   

RMSETable = array2table(RMSE,"RowNames",strcat("Fold ",string(1:NumFolds)),"VariableNames",strcat("RMSE_",CVModels))
RMSETable=5×2 table
              RMSE_logistic    RMSE_probit
              _____________    ___________

    Fold 1      0.0019412       0.0020972 
    Fold 2      0.0011167       0.0011644 
    Fold 3      0.0011536       0.0011802 
    Fold 4      0.0010269      0.00097877 
    Fold 5      0.0015965        0.001485 
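
The summary comparisons mentioned above (mean metric per model, and the proportion of folds in which one model is better) can be sketched as follows. This is an illustrative sketch only: the matrices are copied from the rounded values displayed in the tables, and the names AUROCDisplayed, RMSEDisplayed, and summaryTable are hypothetical. In a live session you would use the AUROC and RMSE arrays computed in the loop instead.

```matlab
% Values copied from the displayed tables (rounded); columns are [logistic probit]
AUROCDisplayed = [0.69558 0.69570
                  0.70265 0.70335
                  0.69055 0.69037
                  0.70268 0.70232
                  0.68784 0.68781];
RMSEDisplayed = [0.0019412 0.0020972
                 0.0011167 0.0011644
                 0.0011536 0.0011802
                 0.0010269 0.00097877
                 0.0015965 0.0014850];

% Mean of each metric across folds, one value per model
meanAUROC = mean(AUROCDisplayed,1);
meanRMSE = mean(RMSEDisplayed,1);

% Proportion of folds in which the logistic model is better
% (higher AUROC is better, lower RMSE is better)
propLogisticHigherAUROC = mean(AUROCDisplayed(:,1) > AUROCDisplayed(:,2));
propLogisticLowerRMSE = mean(RMSEDisplayed(:,1) < RMSEDisplayed(:,2));

summaryTable = table(meanAUROC',meanRMSE', ...
    'RowNames',{'logistic','probit'}, ...
    'VariableNames',{'MeanAUROC','MeanRMSE'})
```

With these displayed values, the logistic model has the higher AUROC in three of the five folds and the lower RMSE in three of the five folds, while the mean AUROC values agree to four decimal places. With only five folds, differences this small are weak evidence for preferring either model.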
