Compute cumulative lifetime PD, marginal PD, and survival probability

## Syntax

``LifeTimePredictedPD = predictLifetime(pdModel,data)``
``LifeTimePredictedPD = predictLifetime(___,Name,Value)``

## Description

`LifeTimePredictedPD = predictLifetime(pdModel,data)` computes the cumulative lifetime probability of default (PD), marginal PD, and survival probability.

`LifeTimePredictedPD = predictLifetime(___,Name,Value)` specifies options using one or more name-value pair arguments in addition to the input arguments in the previous syntax.

## Examples

This example shows how to use `fitLifetimePDModel` to fit data with a `Probit` model and then predict the lifetime probability of default (PD).

```
load RetailCreditPanelData.mat
disp(head(data))
```
```
    ID    ScoreGroup    YOB    Default    Year
    __    __________    ___    _______    ____

    1      Low Risk      1        0       1997
    1      Low Risk      2        0       1998
    1      Low Risk      3        0       1999
    1      Low Risk      4        0       2000
    1      Low Risk      5        0       2001
    1      Low Risk      6        0       2002
    1      Low Risk      7        0       2003
    1      Low Risk      8        0       2004
```
```
disp(head(dataMacro))
```
```
    Year     GDP     Market
    ____    _____    ______

    1997     2.72      7.61
    1998     3.57     26.24
    1999     2.86      18.1
    2000     2.43      3.19
    2001     1.26    -10.51
    2002    -0.59    -22.95
    2003     0.63      2.78
    2004     1.85      9.48
```

Join the two data components into a single data set.

```
data = join(data,dataMacro);
disp(head(data))
```
```
    ID    ScoreGroup    YOB    Default    Year     GDP     Market
    __    __________    ___    _______    ____    _____    ______

    1      Low Risk      1        0       1997     2.72      7.61
    1      Low Risk      2        0       1998     3.57     26.24
    1      Low Risk      3        0       1999     2.86      18.1
    1      Low Risk      4        0       2000     2.43      3.19
    1      Low Risk      5        0       2001     1.26    -10.51
    1      Low Risk      6        0       2002    -0.59    -22.95
    1      Low Risk      7        0       2003     0.63      2.78
    1      Low Risk      8        0       2004     1.85      9.48
```

Partition Data

Separate the data into training and test partitions.

```
nIDs = max(data.ID);
uniqueIDs = unique(data.ID);

rng('default'); % for reproducibility
c = cvpartition(nIDs,'HoldOut',0.4);

TrainIDInd = training(c);
TestIDInd = test(c);

TrainDataInd = ismember(data.ID,uniqueIDs(TrainIDInd));
TestDataInd = ismember(data.ID,uniqueIDs(TestIDInd));
```

Create a `Probit` Lifetime PD Model

Use `fitLifetimePDModel` to create a `Probit` model using the training data.

```
pdModel = fitLifetimePDModel(data(TrainDataInd,:),"Probit",...
    'AgeVar','YOB',...
    'IDVar','ID',...
    'LoanVars','ScoreGroup',...
    'MacroVars',{'GDP','Market'},...
    'ResponseVar','Default');
disp(pdModel)
```
```
  Probit with properties:

        ModelID: "Probit"
    Description: ""
          Model: [1x1 classreg.regr.CompactGeneralizedLinearModel]
          IDVar: "ID"
         AgeVar: "YOB"
       LoanVars: "ScoreGroup"
      MacroVars: ["GDP"    "Market"]
    ResponseVar: "Default"
```

Display the underlying model.

```
disp(pdModel.Model)
```
```
Compact generalized linear regression model:
    probit(Default) ~ 1 + ScoreGroup + YOB + GDP + Market
    Distribution = Binomial

Estimated Coefficients:
                               Estimate         SE         tStat       pValue
                              __________    _________    _______    ___________

    (Intercept)                  -1.6267      0.03811    -42.685              0
    ScoreGroup_Medium Risk      -0.26542      0.01419    -18.704     4.5503e-78
    ScoreGroup_Low Risk         -0.46794     0.016364    -28.595     7.775e-180
    YOB                         -0.11421    0.0049724    -22.969    9.6208e-117
    GDP                        -0.041537     0.014807    -2.8052      0.0050291
    Market                    -0.0029609    0.0010618    -2.7885      0.0052954

388097 observations, 388091 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 1.85e+03, p-value = 0
```

Predict Lifetime PD on Training and Test Data

Use the `predictLifetime` function to get lifetime PDs on the training or the test data. To get conditional PDs, use the `predict` function. For model validation, use the `modelDiscrimination` and `modelAccuracy` functions on the training or test data.

```
DataSetChoice = "Testing";
if DataSetChoice=="Training"
    Ind = TrainDataInd;
else
    Ind = TestDataInd;
end

% Predict lifetime PD
PD = predictLifetime(pdModel,data(Ind,:));
head(data(Ind,:))
```
```
ans=8×7 table
    ID    ScoreGroup     YOB    Default    Year     GDP     Market
    __    ___________    ___    _______    ____    _____    ______

    2     Medium Risk     1        0       1997     2.72      7.61
    2     Medium Risk     2        0       1998     3.57     26.24
    2     Medium Risk     3        0       1999     2.86      18.1
    2     Medium Risk     4        0       2000     2.43      3.19
    2     Medium Risk     5        0       2001     1.26    -10.51
    2     Medium Risk     6        0       2002    -0.59    -22.95
    2     Medium Risk     7        0       2003     0.63      2.78
    2     Medium Risk     8        0       2004     1.85      9.48
```

Predict Lifetime PD on New Data

Lifetime PD models are used to make predictions on existing loans. The `predictLifetime` function requires projected values for both the loan and macro predictors for the remainder of the life of the loan.

The `DataPredictLifetime.mat` file contains projections for two loans and for the macro variables. At the end of 2019, one loan is three years old and the other is six years old; both have a lifetime of 10 years. The `ScoreGroup` is constant and the age values are incremental. The forecasts for the macro predictors must span the longest remaining lifetime in the portfolio.

```
load DataPredictLifetime.mat
disp(LoanData)
```
```
     ID       ScoreGroup      YOB    Year
    ____    _____________    ___    ____

    1304    "Medium Risk"      4     2020
    1304    "Medium Risk"      5     2021
    1304    "Medium Risk"      6     2022
    1304    "Medium Risk"      7     2023
    1304    "Medium Risk"      8     2024
    1304    "Medium Risk"      9     2025
    1304    "Medium Risk"     10     2026
    2067    "Low Risk"         7     2020
    2067    "Low Risk"         8     2021
    2067    "Low Risk"         9     2022
    2067    "Low Risk"        10     2023
```
```
disp(MacroScenario)
```
```
    Year    GDP    Market
    ____    ___    ______

    2020    1.1      4.5
    2021    0.9      1.5
    2022    1.2        5
    2023    1.4      5.5
    2024    1.6        6
    2025    1.8      6.5
    2026    1.8      6.5
    2027    1.8      6.5
```
```
LifetimeData = join(LoanData,MacroScenario);
disp(LifetimeData)
```
```
     ID       ScoreGroup      YOB    Year    GDP    Market
    ____    _____________    ___    ____    ___    ______

    1304    "Medium Risk"      4     2020    1.1      4.5
    1304    "Medium Risk"      5     2021    0.9      1.5
    1304    "Medium Risk"      6     2022    1.2        5
    1304    "Medium Risk"      7     2023    1.4      5.5
    1304    "Medium Risk"      8     2024    1.6        6
    1304    "Medium Risk"      9     2025    1.8      6.5
    1304    "Medium Risk"     10     2026    1.8      6.5
    2067    "Low Risk"         7     2020    1.1      4.5
    2067    "Low Risk"         8     2021    0.9      1.5
    2067    "Low Risk"         9     2022    1.2        5
    2067    "Low Risk"        10     2023    1.4      5.5
```
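Whether the macro forecasts span every projected loan year can be checked with a few lines of base MATLAB, ideally before the `join` call. The following is only a minimal sketch: the year vectors are copied from the `LoanData` and `MacroScenario` displays, and the names `loanYears`, `macroYears`, and `missingYears` are illustrative and not part of the toolbox.

```matlab
% Years needed by the loan projections (copied from the LoanData display).
loanYears = [2020:2026, 2020:2023]';
% Years covered by the macro scenario (copied from the MacroScenario display).
macroYears = (2020:2027)';

% A loan year that is absent from the scenario has no macro values to pair with.
missingYears = setdiff(loanYears, macroYears);
if isempty(missingYears)
    disp("Macro scenario covers every projected loan year.")
else
    error("Macro scenario is missing years: %s", mat2str(missingYears'));
end
```

With real data, the same check would use `LoanData.Year` and `MacroScenario.Year` in place of the hard-coded vectors.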

Predict lifetime PDs and store the output as a new table column for convenience.

```
LifetimeData.PredictedPD = predictLifetime(pdModel,LifetimeData);
disp(LifetimeData)
```
```
     ID       ScoreGroup      YOB    Year    GDP    Market    PredictedPD
    ____    _____________    ___    ____    ___    ______    ___________

    1304    "Medium Risk"      4     2020    1.1      4.5      0.0080202
    1304    "Medium Risk"      5     2021    0.9      1.5       0.014093
    1304    "Medium Risk"      6     2022    1.2        5       0.018156
    1304    "Medium Risk"      7     2023    1.4      5.5       0.020941
    1304    "Medium Risk"      8     2024    1.6        6       0.022827
    1304    "Medium Risk"      9     2025    1.8      6.5       0.024086
    1304    "Medium Risk"     10     2026    1.8      6.5       0.024945
    2067    "Low Risk"         7     2020    1.1      4.5      0.0015728
    2067    "Low Risk"         8     2021    0.9      1.5      0.0027146
    2067    "Low Risk"         9     2022    1.2        5       0.003431
    2067    "Low Risk"        10     2023    1.4      5.5      0.0038939
```
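The cumulative values in the `PredictedPD` column can be turned into marginal PDs and survival probabilities by hand, using the definitions given later on this page. The sketch below is illustrative only: it copies the displayed (rounded) values for loan 1304 into a vector and uses base MATLAB functions, so the results agree with the toolbox output only up to display precision. The variable names are assumptions for this example.

```matlab
% Cumulative lifetime PDs for loan 1304, copied from the PredictedPD column.
pdCum = [0.0080202; 0.014093; 0.018156; 0.020941; 0.022827; 0.024086; 0.024945];

% Marginal PD: period-over-period increase in cumulative PD, starting from 0.
pdMarginal = diff([0; pdCum]);

% Survival probability: complement of the cumulative PD.
survival = 1 - pdCum;

disp(table((4:10)', pdCum, pdMarginal, survival, ...
    'VariableNames', {'YOB','Cumulative','Marginal','Survival'}))
```

The marginal PDs sum to the last cumulative value, which is a quick sanity check on any lifetime PD curve.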

Visualize the predicted lifetime PD for a company.

```
CompanyIDChoice = "1304";
CompanyID = str2double(CompanyIDChoice);
IndPlot = LifetimeData.ID==CompanyID;
plot(LifetimeData.YOB(IndPlot),LifetimeData.PredictedPD(IndPlot))
grid on
xlabel('YOB')
xticks(LifetimeData.YOB(IndPlot))
ylabel('Lifetime PD')
title(strcat("Company ",CompanyIDChoice))
```

## Input Arguments

`pdModel`: Probability of default model, specified as a `Logistic` or `Probit` object previously created using `fitLifetimePDModel`.

Data Types: `object`

`data`: Lifetime data, specified as a `NumRows`-by-`NumCols` table with projected predictor values to make lifetime predictions. The predictor names and data types must be consistent with the underlying model. The `IDVar` property of the `pdModel` input identifies the column containing the ID values in the table. The function uses these IDs to group the rows that belong to each ID and makes lifetime predictions for each ID.

Data Types: `table`
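To illustrate what grouping rows by ID means for a lifetime prediction, the following base MATLAB sketch accumulates hypothetical conditional PDs separately for each ID. It is not the toolbox implementation; the PD values and variable names are made up for illustration, and the rows are assumed to be sorted by age within each ID.

```matlab
% Hypothetical conditional PDs for two loans stacked in one data set.
ID     = [1304; 1304; 1304; 2067; 2067];
pdCond = [0.008; 0.006; 0.004; 0.0016; 0.0011];

% Group rows by ID; the cumulative PD restarts at the first row of each ID.
[groupIdx, uniqueIDs] = findgroups(ID);
pdCum = zeros(size(pdCond));
for k = 1:numel(uniqueIDs)
    rows = (groupIdx == k);
    pdCum(rows) = 1 - cumprod(1 - pdCond(rows));
end

disp(table(ID, pdCond, pdCum))
```

In the first row of each ID, the cumulative PD equals the conditional PD, which shows that no probability carries over from one loan to the next.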

### Name-Value Pair Arguments

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

Example: ```LifeTimePredictedPD = predictLifetime(pdModel,data,'ProbabilityType','survival')```

`ProbabilityType`: Probability type, specified as the comma-separated pair consisting of `'ProbabilityType'` and a character vector or string with a value of `'cumulative'` (default), `'marginal'`, or `'survival'`.

Data Types: `char` | `string`
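The sketch below requests each probability type for the same projected data and compares the results. It assumes that Risk Management Toolbox and the data files used in the Examples section are available, fits the model on the full panel rather than a training split to keep the code short, and assumes that `'marginal'` is accepted as a value alongside the `'survival'` value shown in the example syntax. The output variable names are illustrative.

```matlab
% Requires Risk Management Toolbox and its shipped example data files.
load RetailCreditPanelData.mat
data = join(data,dataMacro);

% Fit on the full panel for brevity (an assumption made for this sketch).
pdModel = fitLifetimePDModel(data,"Probit", ...
    'AgeVar','YOB','IDVar','ID','LoanVars','ScoreGroup', ...
    'MacroVars',{'GDP','Market'},'ResponseVar','Default');

load DataPredictLifetime.mat
LifetimeData = join(LoanData,MacroScenario);

cumPD  = predictLifetime(pdModel,LifetimeData);   % no ProbabilityType given
survPD = predictLifetime(pdModel,LifetimeData,'ProbabilityType','survival');
margPD = predictLifetime(pdModel,LifetimeData,'ProbabilityType','marginal');

disp(table(LifetimeData.ID,LifetimeData.YOB,cumPD,margPD,survPD, ...
    'VariableNames',{'ID','YOB','Cumulative','Marginal','Survival'}))
```

If the definitions on this page hold, `survPD` should equal `1 - cumPD` row by row, and the first marginal value of each ID should equal its first cumulative value.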

## Output Arguments

`LifeTimePredictedPD`: Predicted lifetime PD values, returned as a `NumRows`-by-`1` numeric vector.

Lifetime PD is the probability of a default event over the lifetime of a financial asset.

Lifetime PD typically refers to the cumulative default probability, given by

`$P{D}_{cumulative}\left(t\right)=P\left\{T\le t\right\}$`

where T is the time to default.

For example, the predicted cumulative lifetime PD for the second year is the probability that the borrower defaults at any time between now and two years from now.

A closely related concept used for the computation of the lifetime Expected Credit Loss (ECL) is the marginal PD, given by

`$P{D}_{marginal}\left(t\right)=P{D}_{cumulative}\left(t\right)-P{D}_{cumulative}\left(t-1\right)$`

Another related quantity is the survival probability, which is the complement of the cumulative probability:

`$S\left(t\right)=P\left\{T>t\right\}=1-P{D}_{cumulative}\left(t\right)$`

The following recursive formula shows the relationship between the conditional PDs and the survival probability:

`$\begin{array}{l}S\left(0\right)=1\\ S\left(1\right)=S\left(0\right)\left(1-P{D}_{cond}\left(1\right)\right)\\ ...\\ S\left(t\right)=S\left(t-1\right)\left(1-P{D}_{cond}\left(t\right)\right)\end{array}$`

The `predictLifetime` function calls the `predict` function to get the conditional PD and then converts it to survival, marginal or lifetime cumulative PD using the previous formulas.
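The conversion described by these formulas can be sketched in a few lines of base MATLAB. This is an illustration under stated assumptions, not the toolbox source: the conditional PDs are hypothetical values for a single loan, and the variable names are chosen for this example.

```matlab
% Hypothetical conditional PDs for periods 1 to 4 of a single loan.
pdCond = [0.010; 0.008; 0.006; 0.005];

% S(t) = S(t-1)*(1 - PDcond(t)) with S(0) = 1 unrolls to a cumulative product.
survival = cumprod(1 - pdCond);

% Cumulative PD is the complement of survival.
pdCumulative = 1 - survival;

% Marginal PD is the increase in cumulative PD, with PDcumulative(0) = 0.
pdMarginal = diff([0; pdCumulative]);

disp(table(pdCond, survival, pdCumulative, pdMarginal))
```

It follows from the recursion that each marginal PD equals the previous survival probability times the current conditional PD, that is, `pdMarginal(t) = survival(t-1)*pdCond(t)`.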


Introduced in R2020b