Main Content

predictLifetime

Compute cumulative lifetime PD, marginal PD, and survival probability

Description

example

LifeTimePredictedPD = predictLifetime(pdModel,data) computes the cumulative lifetime probability of default (PD), marginal PD, and survival probability.

example

LifeTimePredictedPD = predictLifetime(___,Name,Value) specifies options using one or more name-value pair arguments in addition to the input arguments in the previous syntax.

Examples

collapse all

This example shows how to use fitLifetimePDModel to fit data with a Probit model and then predict the lifetime probability of default (PD).

Load Data

Load the credit portfolio data.

load RetailCreditPanelData.mat
disp(head(data))
    ID    ScoreGroup    YOB    Default    Year
    __    __________    ___    _______    ____

    1      Low Risk      1        0       1997
    1      Low Risk      2        0       1998
    1      Low Risk      3        0       1999
    1      Low Risk      4        0       2000
    1      Low Risk      5        0       2001
    1      Low Risk      6        0       2002
    1      Low Risk      7        0       2003
    1      Low Risk      8        0       2004
disp(head(dataMacro))
    Year     GDP     Market
    ____    _____    ______

    1997     2.72      7.61
    1998     3.57     26.24
    1999     2.86      18.1
    2000     2.43      3.19
    2001     1.26    -10.51
    2002    -0.59    -22.95
    2003     0.63      2.78
    2004     1.85      9.48

Join the two data components into a single data set.

data = join(data,dataMacro);
disp(head(data))
    ID    ScoreGroup    YOB    Default    Year     GDP     Market
    __    __________    ___    _______    ____    _____    ______

    1      Low Risk      1        0       1997     2.72      7.61
    1      Low Risk      2        0       1998     3.57     26.24
    1      Low Risk      3        0       1999     2.86      18.1
    1      Low Risk      4        0       2000     2.43      3.19
    1      Low Risk      5        0       2001     1.26    -10.51
    1      Low Risk      6        0       2002    -0.59    -22.95
    1      Low Risk      7        0       2003     0.63      2.78
    1      Low Risk      8        0       2004     1.85      9.48

Partition Data

Separate the data into training and test partitions.

nIDs = max(data.ID);
uniqueIDs = unique(data.ID);

rng('default'); % for reproducibility
c = cvpartition(nIDs,'HoldOut',0.4);

TrainIDInd = training(c);
TestIDInd = test(c);

TrainDataInd = ismember(data.ID,uniqueIDs(TrainIDInd));
TestDataInd = ismember(data.ID,uniqueIDs(TestIDInd));

Create a Probit Lifetime PD Model

Use fitLifetimePDModel to create a Probit model using the training data.

pdModel = fitLifetimePDModel(data(TrainDataInd,:),"Probit",...
    'AgeVar','YOB',...
    'IDVar','ID',...
    'LoanVars','ScoreGroup',...
    'MacroVars',{'GDP','Market'},...
    'ResponseVar','Default');
disp(pdModel)
  Probit with properties:

        ModelID: "Probit"
    Description: ""
          Model: [1x1 classreg.regr.CompactGeneralizedLinearModel]
          IDVar: "ID"
         AgeVar: "YOB"
       LoanVars: "ScoreGroup"
      MacroVars: ["GDP"    "Market"]
    ResponseVar: "Default"

Display the underlying model.

disp(pdModel.Model)
Compact generalized linear regression model:
    probit(Default) ~ 1 + ScoreGroup + YOB + GDP + Market
    Distribution = Binomial

Estimated Coefficients:
                               Estimate        SE         tStat       pValue   
                              __________    _________    _______    ___________

    (Intercept)                  -1.6267      0.03811    -42.685              0
    ScoreGroup_Medium Risk      -0.26542      0.01419    -18.704     4.5503e-78
    ScoreGroup_Low Risk         -0.46794     0.016364    -28.595     7.775e-180
    YOB                         -0.11421    0.0049724    -22.969    9.6208e-117
    GDP                        -0.041537     0.014807    -2.8052      0.0050291
    Market                    -0.0029609    0.0010618    -2.7885      0.0052954


388097 observations, 388091 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 1.85e+03, p-value = 0

Predict Lifetime PD on Training and Test Data

Use the predictLifetime function to get lifetime PDs on the training or the test data. To get conditional PDs, use the predict function. For model validation, use the modelDiscrimination and modelAccuracy functions on the training or test data.

DataSetChoice = "Testing";
if DataSetChoice=="Training"
    Ind = TrainDataInd;
else
    Ind = TestDataInd;
end

% Predict lifetime PD
PD = predictLifetime(pdModel,data(Ind,:));
head(data(Ind,:))
ans=8×7 table
    ID    ScoreGroup     YOB    Default    Year     GDP     Market
    __    ___________    ___    _______    ____    _____    ______

    2     Medium Risk     1        0       1997     2.72      7.61
    2     Medium Risk     2        0       1998     3.57     26.24
    2     Medium Risk     3        0       1999     2.86      18.1
    2     Medium Risk     4        0       2000     2.43      3.19
    2     Medium Risk     5        0       2001     1.26    -10.51
    2     Medium Risk     6        0       2002    -0.59    -22.95
    2     Medium Risk     7        0       2003     0.63      2.78
    2     Medium Risk     8        0       2004     1.85      9.48

Predict Lifetime PD on New Data

Lifetime PD models are used to make predictions on existing loans. The predictLifetime function requires projected values for both the loan and macro predictors for the remainder of the life of the loan.

The DataPredictLifetime.mat file contains projections for two loans and also for the macro variables. One loan is three years old at the end of 2019, with a lifetime of 10 years, and the other loan is six years old with a lifetime of 10 years. The ScoreGroup is constant and the age values are incremental. For the macro variables, the forecasts for the macro predictors must span the longest lifetime in the portfolio.

load DataPredictLifetime.mat

disp(LoanData)
     ID      ScoreGroup      YOB    Year
    ____    _____________    ___    ____

    1304    "Medium Risk"     4     2020
    1304    "Medium Risk"     5     2021
    1304    "Medium Risk"     6     2022
    1304    "Medium Risk"     7     2023
    1304    "Medium Risk"     8     2024
    1304    "Medium Risk"     9     2025
    1304    "Medium Risk"    10     2026
    2067    "Low Risk"        7     2020
    2067    "Low Risk"        8     2021
    2067    "Low Risk"        9     2022
    2067    "Low Risk"       10     2023
disp(MacroScenario)
    Year    GDP    Market
    ____    ___    ______

    2020    1.1     4.5  
    2021    0.9     1.5  
    2022    1.2       5  
    2023    1.4     5.5  
    2024    1.6       6  
    2025    1.8     6.5  
    2026    1.8     6.5  
    2027    1.8     6.5  
LifetimeData = join(LoanData,MacroScenario);
disp(LifetimeData)
     ID      ScoreGroup      YOB    Year    GDP    Market
    ____    _____________    ___    ____    ___    ______

    1304    "Medium Risk"     4     2020    1.1     4.5  
    1304    "Medium Risk"     5     2021    0.9     1.5  
    1304    "Medium Risk"     6     2022    1.2       5  
    1304    "Medium Risk"     7     2023    1.4     5.5  
    1304    "Medium Risk"     8     2024    1.6       6  
    1304    "Medium Risk"     9     2025    1.8     6.5  
    1304    "Medium Risk"    10     2026    1.8     6.5  
    2067    "Low Risk"        7     2020    1.1     4.5  
    2067    "Low Risk"        8     2021    0.9     1.5  
    2067    "Low Risk"        9     2022    1.2       5  
    2067    "Low Risk"       10     2023    1.4     5.5  

Predict lifetime PDs and store the output as a new table column for convenience.

LifetimeData.PredictedPD = predictLifetime(pdModel,LifetimeData);
disp(LifetimeData)
     ID      ScoreGroup      YOB    Year    GDP    Market    PredictedPD
    ____    _____________    ___    ____    ___    ______    ___________

    1304    "Medium Risk"     4     2020    1.1     4.5       0.0080202 
    1304    "Medium Risk"     5     2021    0.9     1.5        0.014093 
    1304    "Medium Risk"     6     2022    1.2       5        0.018156 
    1304    "Medium Risk"     7     2023    1.4     5.5        0.020941 
    1304    "Medium Risk"     8     2024    1.6       6        0.022827 
    1304    "Medium Risk"     9     2025    1.8     6.5        0.024086 
    1304    "Medium Risk"    10     2026    1.8     6.5        0.024945 
    2067    "Low Risk"        7     2020    1.1     4.5       0.0015728 
    2067    "Low Risk"        8     2021    0.9     1.5       0.0027146 
    2067    "Low Risk"        9     2022    1.2       5        0.003431 
    2067    "Low Risk"       10     2023    1.4     5.5       0.0038939 

Visualize the predicted lifetime PD for a company.

CompanyIDChoice = "1304";
CompanyID = str2double(CompanyIDChoice);
IndPlot = LifetimeData.ID==CompanyID;
plot(LifetimeData.YOB(IndPlot),LifetimeData.PredictedPD(IndPlot))
grid on
xlabel('YOB')
xticks(LifetimeData.YOB(IndPlot))
ylabel('Lifetime PD')
title(strcat("Company ",CompanyIDChoice))

Input Arguments

collapse all

Probability of default model, specified as a Logistic or Probit object previously created using fitLifetimePDModel.

Data Types: object

Lifetime data, specified as a NumRows-by-NumCols table with projected predictor values to make lifetime predictions. The predictor names and data types must be consistent with the underlying model. The IDVar property of the pdModel input is used to identify the column containing the ID values in the table, and the IDs are used to identify rows corresponding to the different IDs and to make lifetime predictions for each ID.

Data Types: table

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: LifetimeData = predictLifetime(pdModel,Data,'ProbabilityType','survival')

Probability type, specified as the comma-separated pair consisting of 'ProbabilityType' and a character vector or string.

Data Types: char | string

Output Arguments

collapse all

Predicted lifetime PD values, returned as a NumRows-by-1 numeric vector.

More About

collapse all

Lifetime PD

Lifetime PD is the probability of a default event over the lifetime of a financial asset.

Lifetime PD typically refers to the cumulative default probability, given by

PDcumulative(t)=P{Tt}

where T is the time to default.

For example, the predicted lifetime, cumulative PD for the second year is the probability that the borrower defaults any time between now and two years from now.

A closely related concept used for the computation of the lifetime Expected Credit Loss (ECL) is the marginal PD, given by

PDmarginal=PDcumulative(t)PDcumulative(t1)

A closely related probability is the survival probability, which is the complement of the cumulative probability and is reported as

S(t)=P{T>1}=1PDcumulative(t)

The following recursive formula shows the relationship between the conditional PDs and the survival probability:

S(0)=1S(1)=S(0)(1PDcond(1))...S(t)=S(t1)(1PDcond(t))

The predictLifetime function calls the predict function to get the conditional PD and then converts it to survival, marginal or lifetime cumulative PD using the previous formulas.

References

[1] Baesens, Bart, Daniel Roesch, and Harald Scheule. Credit Risk Analytics: Measurement Techniques, Applications, and Examples in SAS. Wiley, 2016.

[2] Bellini, Tiziano. IFRS 9 and CECL Credit Risk Modelling and Validation: A Practical Guide with Examples Worked in R and SAS. San Diego, CA: Elsevier, 2019.

[3] Breeden, Joseph. Living with CECL: The Modeling Dictionary. Santa Fe, NM: Prescient Models LLC, 2018.

Introduced in R2020b