Documentation

formatpoints

Format scorecard points and scaling

Syntax

sc = formatpoints(sc,Name,Value)

Description

sc = formatpoints(sc,Name,Value) modifies the scorecard points and scaling using optional name-value pair arguments. For example, use optional name-value pair arguments to change the scaling of the scores or the rounding of the points.

Examples

This example shows how to use formatpoints to scale by providing the Worst and Best score values. By using formatpoints to scale, you can put points and scores in a desired range that is more meaningful for practical purposes. Technically, this involves a linear transformation from the unscaled to the scaled points.

Create a creditscorecard object using the CreditCardData.mat file to load the data (using a dataset from Refaat 2011). Use the 'IDVar' argument in creditscorecard to indicate that 'CustID' contains ID information and should not be included as a predictor variable.

load CreditCardData 
sc = creditscorecard(data,'IDVar','CustID');

Perform automatic binning for all predictors.

sc = autobinning(sc);

Fit a logistic regression model using default parameters.

sc = fitmodel(sc);
1. Adding CustIncome, Deviance = 1490.8527, Chi2Stat = 32.588614, PValue = 1.1387992e-08
2. Adding TmWBank, Deviance = 1467.1415, Chi2Stat = 23.711203, PValue = 1.1192909e-06
3. Adding AMBalance, Deviance = 1455.5715, Chi2Stat = 11.569967, PValue = 0.00067025601
4. Adding EmpStatus, Deviance = 1447.3451, Chi2Stat = 8.2264038, PValue = 0.0041285257
5. Adding CustAge, Deviance = 1441.994, Chi2Stat = 5.3511754, PValue = 0.020708306
6. Adding ResStatus, Deviance = 1437.8756, Chi2Stat = 4.118404, PValue = 0.042419078
7. Adding OtherCC, Deviance = 1433.707, Chi2Stat = 4.1686018, PValue = 0.041179769

Generalized linear regression model:
    status ~ [Linear formula with 8 terms in 7 predictors]
    Distribution = Binomial

Estimated Coefficients:
                   Estimate       SE       tStat       pValue  
                   ________    ________    ______    __________

    (Intercept)    0.70239     0.064001    10.975    5.0538e-28
    CustAge        0.60833      0.24932      2.44      0.014687
    ResStatus        1.377      0.65272    2.1097      0.034888
    EmpStatus      0.88565        0.293    3.0227     0.0025055
    CustIncome     0.70164      0.21844    3.2121     0.0013179
    TmWBank         1.1074      0.23271    4.7589    1.9464e-06
    OtherCC         1.0883      0.52912    2.0569      0.039696
    AMBalance        1.045      0.32214    3.2439     0.0011792


1200 observations, 1192 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 89.7, p-value = 1.4e-16

Display unscaled points for predictors retained in the fitting model and display the minimum and maximum possible unscaled scores.

[PointsInfo,MinScore,MaxScore] = displaypoints(sc)
PointsInfo=30x3 table
     Predictors           Bin           Points  
    ____________    _______________    _________

    'CustAge'       '[-Inf,33)'         -0.15894
    'CustAge'       '[33,37)'           -0.14036
    'CustAge'       '[37,40)'          -0.060323
    'CustAge'       '[40,46)'           0.046408
    'CustAge'       '[46,48)'            0.21445
    'CustAge'       '[48,58)'            0.23039
    'CustAge'       '[58,Inf]'             0.479
    'ResStatus'     'Tenant'           -0.031252
    'ResStatus'     'Home Owner'         0.12696
    'ResStatus'     'Other'              0.37641
    'EmpStatus'     'Unknown'          -0.076317
    'EmpStatus'     'Employed'           0.31449
    'CustIncome'    '[-Inf,29000)'      -0.45716
    'CustIncome'    '[29000,33000)'     -0.10466
    'CustIncome'    '[33000,35000)'     0.052329
    'CustIncome'    '[35000,40000)'     0.081611

MinScore = -1.3100
MaxScore = 3.0726

Scale by providing the 'Worst' and 'Best' score values. The range provided below is a common score range. Display the points information again to verify that they are now scaled and also display the scaled minimum and maximum scores.

sc = formatpoints(sc,'WorstAndBestScores',[300 850]);
[PointsInfo,MinScore,MaxScore] = displaypoints(sc)
PointsInfo=30x3 table
     Predictors           Bin          Points
    ____________    _______________    ______

    'CustAge'       '[-Inf,33)'        46.396
    'CustAge'       '[33,37)'          48.727
    'CustAge'       '[37,40)'          58.772
    'CustAge'       '[40,46)'          72.167
    'CustAge'       '[46,48)'          93.256
    'CustAge'       '[48,58)'          95.256
    'CustAge'       '[58,Inf]'         126.46
    'ResStatus'     'Tenant'           62.421
    'ResStatus'     'Home Owner'       82.276
    'ResStatus'     'Other'            113.58
    'EmpStatus'     'Unknown'          56.765
    'EmpStatus'     'Employed'         105.81
    'CustIncome'    '[-Inf,29000)'     8.9706
    'CustIncome'    '[29000,33000)'    53.208
    'CustIncome'    '[33000,35000)'     72.91
    'CustIncome'    '[35000,40000)'    76.585

MinScore = 300.0000
MaxScore = 850.0000

As expected, the values of MinScore and MaxScore correspond to the desired worst and best scores.
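
The linear transformation behind this scaling can be checked by hand. The following is a minimal sketch that uses only the rounded values displayed above and plain arithmetic, not the toolbox itself. The variable names are illustrative, and the even split of the shift across the seven predictors is an assumption inferred from the displayed points, not a statement about the internal implementation.

```matlab
% Values copied from the displaypoints output above (rounded as displayed).
minUnscaled = -1.3100;   % unscaled MinScore
maxUnscaled = 3.0726;    % unscaled MaxScore
worst = 300;             % desired worst score
best  = 850;             % desired best score
numPredictors = 7;       % predictors retained by fitmodel

% Linear map for total scores: scaledScore = shift + slope*unscaledScore
slope = (best - worst) / (maxUnscaled - minUnscaled);
shift = worst - slope*minUnscaled;

% Assumption: at the point level, the shift is split evenly across predictors.
custAgeUnscaled = -0.15894;   % bin '[-Inf,33)'
custAgeScaled = shift/numPredictors + slope*custAgeUnscaled;

fprintf('slope = %.4f, shift = %.4f\n', slope, shift);
fprintf('CustAge [-Inf,33) scaled points = %.3f\n', custAgeScaled);
```

Under these assumptions, the computed points for the first 'CustAge' bin agree with the 46.396 shown in the scaled table, up to the rounding of the displayed inputs.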

This example shows how to use formatpoints to scale by providing the Shift and Slope values. By using formatpoints to scale, you can put points and scores in a desired range that is more meaningful for practical purposes. Technically, this involves a linear transformation from the unscaled to the scaled points by the formatpoints function.

Create a creditscorecard object using the CreditCardData.mat file to load the data (using a dataset from Refaat 2011). Use the 'IDVar' argument in creditscorecard to indicate that 'CustID' contains ID information and should not be included as a predictor variable.

load CreditCardData 
sc = creditscorecard(data,'IDVar','CustID');

Perform automatic binning for all predictors.

sc = autobinning(sc);

Fit a logistic regression model using default parameters.

sc = fitmodel(sc);
1. Adding CustIncome, Deviance = 1490.8527, Chi2Stat = 32.588614, PValue = 1.1387992e-08
2. Adding TmWBank, Deviance = 1467.1415, Chi2Stat = 23.711203, PValue = 1.1192909e-06
3. Adding AMBalance, Deviance = 1455.5715, Chi2Stat = 11.569967, PValue = 0.00067025601
4. Adding EmpStatus, Deviance = 1447.3451, Chi2Stat = 8.2264038, PValue = 0.0041285257
5. Adding CustAge, Deviance = 1441.994, Chi2Stat = 5.3511754, PValue = 0.020708306
6. Adding ResStatus, Deviance = 1437.8756, Chi2Stat = 4.118404, PValue = 0.042419078
7. Adding OtherCC, Deviance = 1433.707, Chi2Stat = 4.1686018, PValue = 0.041179769

Generalized linear regression model:
    status ~ [Linear formula with 8 terms in 7 predictors]
    Distribution = Binomial

Estimated Coefficients:
                   Estimate       SE       tStat       pValue  
                   ________    ________    ______    __________

    (Intercept)    0.70239     0.064001    10.975    5.0538e-28
    CustAge        0.60833      0.24932      2.44      0.014687
    ResStatus        1.377      0.65272    2.1097      0.034888
    EmpStatus      0.88565        0.293    3.0227     0.0025055
    CustIncome     0.70164      0.21844    3.2121     0.0013179
    TmWBank         1.1074      0.23271    4.7589    1.9464e-06
    OtherCC         1.0883      0.52912    2.0569      0.039696
    AMBalance        1.045      0.32214    3.2439     0.0011792


1200 observations, 1192 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 89.7, p-value = 1.4e-16

Display unscaled points for predictors retained in the fitting model and display the minimum and maximum possible unscaled scores.

[PointsInfo,MinScore,MaxScore] = displaypoints(sc)
PointsInfo=30x3 table
     Predictors           Bin           Points  
    ____________    _______________    _________

    'CustAge'       '[-Inf,33)'         -0.15894
    'CustAge'       '[33,37)'           -0.14036
    'CustAge'       '[37,40)'          -0.060323
    'CustAge'       '[40,46)'           0.046408
    'CustAge'       '[46,48)'            0.21445
    'CustAge'       '[48,58)'            0.23039
    'CustAge'       '[58,Inf]'             0.479
    'ResStatus'     'Tenant'           -0.031252
    'ResStatus'     'Home Owner'         0.12696
    'ResStatus'     'Other'              0.37641
    'EmpStatus'     'Unknown'          -0.076317
    'EmpStatus'     'Employed'           0.31449
    'CustIncome'    '[-Inf,29000)'      -0.45716
    'CustIncome'    '[29000,33000)'     -0.10466
    'CustIncome'    '[33000,35000)'     0.052329
    'CustIncome'    '[35000,40000)'     0.081611

MinScore = -1.3100
MaxScore = 3.0726

Scale by providing the 'Shift' and 'Slope' values. In this example, there is an arbitrary choice of shift and slope. Display the points information again to verify that they are now scaled and also display the scaled minimum and maximum scores.

sc = formatpoints(sc,'ShiftAndSlope',[300 6]);
[PointsInfo,MinScore,MaxScore] = displaypoints(sc)
PointsInfo=30x3 table
     Predictors           Bin          Points
    ____________    _______________    ______

    'CustAge'       '[-Inf,33)'        41.904
    'CustAge'       '[33,37)'          42.015
    'CustAge'       '[37,40)'          42.495
    'CustAge'       '[40,46)'          43.136
    'CustAge'       '[46,48)'          44.144
    'CustAge'       '[48,58)'          44.239
    'CustAge'       '[58,Inf]'         45.731
    'ResStatus'     'Tenant'            42.67
    'ResStatus'     'Home Owner'       43.619
    'ResStatus'     'Other'            45.116
    'EmpStatus'     'Unknown'          42.399
    'EmpStatus'     'Employed'         44.744
    'CustIncome'    '[-Inf,29000)'     40.114
    'CustIncome'    '[29000,33000)'    42.229
    'CustIncome'    '[33000,35000)'    43.171
    'CustIncome'    '[35000,40000)'    43.347

MinScore = 292.1401
MaxScore = 318.4355
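
The scaled minimum and maximum scores above follow directly from the shift and slope. The sketch below reproduces them with plain arithmetic from the rounded values displayed earlier. The variable names are illustrative, and the per-predictor split of the shift is an assumption inferred from the displayed points.

```matlab
% Shift and slope passed to formatpoints in the call above.
shift = 300;
slope = 6;

% Unscaled score range, copied from the earlier displaypoints output.
minUnscaled = -1.3100;
maxUnscaled = 3.0726;

% Total scores transform as: scaled = shift + slope*unscaled
minScaled = shift + slope*minUnscaled;
maxScaled = shift + slope*maxUnscaled;

% Assumption: each of the 7 predictors carries an equal share of the shift.
custAgeScaled = shift/7 + slope*(-0.15894);   % bin '[-Inf,33)'

fprintf('MinScore = %.4f, MaxScore = %.4f\n', minScaled, maxScaled);
fprintf('CustAge [-Inf,33) scaled points = %.3f\n', custAgeScaled);
```

Small differences in the last digit relative to the displayed output come from using the rounded unscaled values as inputs.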

This example shows how to use formatpoints to scale by providing the points, odds levels, and PDO (points to double the odds). By using formatpoints to scale, you can put points and scores in a desired range that is more meaningful for practical purposes. Technically, this involves a linear transformation from the unscaled to the scaled points by the formatpoints function.

Create a creditscorecard object using the CreditCardData.mat file to load the data (using a dataset from Refaat 2011). Use the 'IDVar' argument in creditscorecard to indicate that 'CustID' contains ID information and should not be included as a predictor variable.

load CreditCardData 
sc = creditscorecard(data,'IDVar','CustID');

Perform automatic binning for all predictors.

sc = autobinning(sc);

Fit a logistic regression model using default parameters.

sc = fitmodel(sc);
1. Adding CustIncome, Deviance = 1490.8527, Chi2Stat = 32.588614, PValue = 1.1387992e-08
2. Adding TmWBank, Deviance = 1467.1415, Chi2Stat = 23.711203, PValue = 1.1192909e-06
3. Adding AMBalance, Deviance = 1455.5715, Chi2Stat = 11.569967, PValue = 0.00067025601
4. Adding EmpStatus, Deviance = 1447.3451, Chi2Stat = 8.2264038, PValue = 0.0041285257
5. Adding CustAge, Deviance = 1441.994, Chi2Stat = 5.3511754, PValue = 0.020708306
6. Adding ResStatus, Deviance = 1437.8756, Chi2Stat = 4.118404, PValue = 0.042419078
7. Adding OtherCC, Deviance = 1433.707, Chi2Stat = 4.1686018, PValue = 0.041179769

Generalized linear regression model:
    status ~ [Linear formula with 8 terms in 7 predictors]
    Distribution = Binomial

Estimated Coefficients:
                   Estimate       SE       tStat       pValue  
                   ________    ________    ______    __________

    (Intercept)    0.70239     0.064001    10.975    5.0538e-28
    CustAge        0.60833      0.24932      2.44      0.014687
    ResStatus        1.377      0.65272    2.1097      0.034888
    EmpStatus      0.88565        0.293    3.0227     0.0025055
    CustIncome     0.70164      0.21844    3.2121     0.0013179
    TmWBank         1.1074      0.23271    4.7589    1.9464e-06
    OtherCC         1.0883      0.52912    2.0569      0.039696
    AMBalance        1.045      0.32214    3.2439     0.0011792


1200 observations, 1192 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 89.7, p-value = 1.4e-16

Display unscaled points for predictors retained in the fitting model and display the minimum and maximum possible unscaled scores.

[PointsInfo,MinScore,MaxScore] = displaypoints(sc)
PointsInfo=30x3 table
     Predictors           Bin           Points  
    ____________    _______________    _________

    'CustAge'       '[-Inf,33)'         -0.15894
    'CustAge'       '[33,37)'           -0.14036
    'CustAge'       '[37,40)'          -0.060323
    'CustAge'       '[40,46)'           0.046408
    'CustAge'       '[46,48)'            0.21445
    'CustAge'       '[48,58)'            0.23039
    'CustAge'       '[58,Inf]'             0.479
    'ResStatus'     'Tenant'           -0.031252
    'ResStatus'     'Home Owner'         0.12696
    'ResStatus'     'Other'              0.37641
    'EmpStatus'     'Unknown'          -0.076317
    'EmpStatus'     'Employed'           0.31449
    'CustIncome'    '[-Inf,29000)'      -0.45716
    'CustIncome'    '[29000,33000)'     -0.10466
    'CustIncome'    '[33000,35000)'     0.052329
    'CustIncome'    '[35000,40000)'     0.081611

MinScore = -1.3100
MaxScore = 3.0726

Scale by providing the points, odds levels, and PDO (points to double the odds). Suppose that you want a score of 500 points to have odds of 2 (twice as likely to be good than to be bad) and that the odds double every 50 points (so that 550 points would have odds of 4).

sc = formatpoints(sc,'PointsOddsAndPDO',[500 2 50]);
[PointsInfo,MinScore,MaxScore] = displaypoints(sc)
PointsInfo=30x3 table
     Predictors           Bin          Points
    ____________    _______________    ______

    'CustAge'       '[-Inf,33)'        52.821
    'CustAge'       '[33,37)'          54.161
    'CustAge'       '[37,40)'          59.934
    'CustAge'       '[40,46)'          67.633
    'CustAge'       '[46,48)'          79.755
    'CustAge'       '[48,58)'          80.905
    'CustAge'       '[58,Inf]'         98.838
    'ResStatus'     'Tenant'           62.031
    'ResStatus'     'Home Owner'       73.444
    'ResStatus'     'Other'            91.438
    'EmpStatus'     'Unknown'          58.781
    'EmpStatus'     'Employed'         86.971
    'CustIncome'    '[-Inf,29000)'     31.309
    'CustIncome'    '[29000,33000)'    56.736
    'CustIncome'    '[33000,35000)'     68.06
    'CustIncome'    '[35000,40000)'    70.173

MinScore = 355.5051
MaxScore = 671.6403
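
The points, odds, and PDO specification can be converted to an equivalent shift and slope. The sketch below assumes the usual relations Slope = PDO/log(2) and Shift = Points - Slope*log(Odds), and treats the unscaled score as the log of the odds. These relations are consistent with the numbers displayed above, but the variable names and the helper scoreAtOdds are illustrative only.

```matlab
% Target specification from the formatpoints call above.
targetPoints = 500;   % score that should correspond to targetOdds
targetOdds   = 2;     % odds of good versus bad at targetPoints
pdo          = 50;    % points to double the odds

% Assumed conversion to an equivalent shift and slope.
slope = pdo / log(2);
shift = targetPoints - slope*log(targetOdds);

% Illustrative helper: scaled score for a given odds level.
scoreAtOdds = @(odds) shift + slope*log(odds);

% Scaled score range from the unscaled range displayed earlier.
minScaled = shift + slope*(-1.3100);
maxScaled = shift + slope*3.0726;

fprintf('Score at odds 2: %.1f, at odds 4: %.1f\n', scoreAtOdds(2), scoreAtOdds(4));
fprintf('MinScore = %.2f, MaxScore = %.2f\n', minScaled, maxScaled);
```

Doubling the odds from 2 to 4 moves the score from 500 to 550, as the example requires, and the computed range agrees with the displayed MinScore and MaxScore up to input rounding.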

This example shows how to use formatpoints to separate the base points from the rest of the points assigned to each predictor variable. The formatpoints name-value pair argument 'BasePoints' serves this purpose.

Create a creditscorecard object using the CreditCardData.mat file to load the data (using a dataset from Refaat 2011). Use the 'IDVar' argument in creditscorecard to indicate that 'CustID' contains ID information and should not be included as a predictor variable.

load CreditCardData 
sc = creditscorecard(data,'IDVar','CustID');

Perform automatic binning for all predictors.

sc = autobinning(sc);

Fit a logistic regression model using default parameters.

sc = fitmodel(sc);
1. Adding CustIncome, Deviance = 1490.8527, Chi2Stat = 32.588614, PValue = 1.1387992e-08
2. Adding TmWBank, Deviance = 1467.1415, Chi2Stat = 23.711203, PValue = 1.1192909e-06
3. Adding AMBalance, Deviance = 1455.5715, Chi2Stat = 11.569967, PValue = 0.00067025601
4. Adding EmpStatus, Deviance = 1447.3451, Chi2Stat = 8.2264038, PValue = 0.0041285257
5. Adding CustAge, Deviance = 1441.994, Chi2Stat = 5.3511754, PValue = 0.020708306
6. Adding ResStatus, Deviance = 1437.8756, Chi2Stat = 4.118404, PValue = 0.042419078
7. Adding OtherCC, Deviance = 1433.707, Chi2Stat = 4.1686018, PValue = 0.041179769

Generalized linear regression model:
    status ~ [Linear formula with 8 terms in 7 predictors]
    Distribution = Binomial

Estimated Coefficients:
                   Estimate       SE       tStat       pValue  
                   ________    ________    ______    __________

    (Intercept)    0.70239     0.064001    10.975    5.0538e-28
    CustAge        0.60833      0.24932      2.44      0.014687
    ResStatus        1.377      0.65272    2.1097      0.034888
    EmpStatus      0.88565        0.293    3.0227     0.0025055
    CustIncome     0.70164      0.21844    3.2121     0.0013179
    TmWBank         1.1074      0.23271    4.7589    1.9464e-06
    OtherCC         1.0883      0.52912    2.0569      0.039696
    AMBalance        1.045      0.32214    3.2439     0.0011792


1200 observations, 1192 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 89.7, p-value = 1.4e-16

Display unscaled points for predictors retained in the fitting model and display the minimum and maximum possible unscaled scores.

[PointsInfo,MinScore,MaxScore] = displaypoints(sc)
PointsInfo=30x3 table
     Predictors           Bin           Points  
    ____________    _______________    _________

    'CustAge'       '[-Inf,33)'         -0.15894
    'CustAge'       '[33,37)'           -0.14036
    'CustAge'       '[37,40)'          -0.060323
    'CustAge'       '[40,46)'           0.046408
    'CustAge'       '[46,48)'            0.21445
    'CustAge'       '[48,58)'            0.23039
    'CustAge'       '[58,Inf]'             0.479
    'ResStatus'     'Tenant'           -0.031252
    'ResStatus'     'Home Owner'         0.12696
    'ResStatus'     'Other'              0.37641
    'EmpStatus'     'Unknown'          -0.076317
    'EmpStatus'     'Employed'           0.31449
    'CustIncome'    '[-Inf,29000)'      -0.45716
    'CustIncome'    '[29000,33000)'     -0.10466
    'CustIncome'    '[33000,35000)'     0.052329
    'CustIncome'    '[35000,40000)'     0.081611

MinScore = -1.3100
MaxScore = 3.0726

By setting the name-value pair argument BasePoints to true, the points information table reports the base points separately in the first row. The minimum and maximum possible scores are not affected by this option.

sc = formatpoints(sc,'BasePoints',true);
[PointsInfo,MinScore,MaxScore] = displaypoints(sc)
PointsInfo=31x3 table
     Predictors           Bin           Points  
    ____________    _______________    _________

    'BasePoints'    'BasePoints'         0.70239
    'CustAge'       '[-Inf,33)'         -0.25928
    'CustAge'       '[33,37)'           -0.24071
    'CustAge'       '[37,40)'           -0.16066
    'CustAge'       '[40,46)'          -0.053933
    'CustAge'       '[46,48)'            0.11411
    'CustAge'       '[48,58)'            0.13005
    'CustAge'       '[58,Inf]'           0.37866
    'ResStatus'     'Tenant'            -0.13159
    'ResStatus'     'Home Owner'        0.026616
    'ResStatus'     'Other'              0.27607
    'EmpStatus'     'Unknown'           -0.17666
    'EmpStatus'     'Employed'           0.21415
    'CustIncome'    '[-Inf,29000)'       -0.5575
    'CustIncome'    '[29000,33000)'       -0.205
    'CustIncome'    '[33000,35000)'    -0.048013

MinScore = -1.3100
MaxScore = 3.0726
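
The relationship between the two tables can be checked with plain arithmetic. This sketch assumes that, when base points are not separated, the intercept is spread evenly over the seven predictors. That assumption is inferred from the displayed numbers, and the variable names are illustrative.

```matlab
% Intercept reported by fitmodel; it appears as the 'BasePoints' row above.
intercept = 0.70239;
numPredictors = 7;

% Assumed share of the intercept carried by each predictor by default.
basePerPredictor = intercept / numPredictors;

% 'CustAge' points from the table displayed before setting 'BasePoints'.
custAgeWithBase = [-0.15894 -0.14036 -0.060323 0.046408 0.21445 0.23039 0.479];

% Removing the share should reproduce the 'CustAge' rows shown with 'BasePoints' true.
custAgeSeparated = custAgeWithBase - basePerPredictor;

disp(custAgeSeparated)
```

Because the base points are only moved to their own row, the total of base points plus predictor points is unchanged, which is why MinScore and MaxScore stay the same.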

This example shows how to use formatpoints to round points. Rounding is usually applied after scaling. Otherwise, if the points for a particular predictor all fall within a small range, rounding could make the rounded points for different bins identical. Also, rounding all the points may slightly change the minimum and maximum total points.

Create a creditscorecard object using the CreditCardData.mat file to load the data (using a dataset from Refaat 2011). Use the 'IDVar' argument in creditscorecard to indicate that 'CustID' contains ID information and should not be included as a predictor variable.

load CreditCardData 
sc = creditscorecard(data,'IDVar','CustID');

Perform automatic binning for all predictors.

sc = autobinning(sc);

Fit a logistic regression model using default parameters.

sc = fitmodel(sc);
1. Adding CustIncome, Deviance = 1490.8527, Chi2Stat = 32.588614, PValue = 1.1387992e-08
2. Adding TmWBank, Deviance = 1467.1415, Chi2Stat = 23.711203, PValue = 1.1192909e-06
3. Adding AMBalance, Deviance = 1455.5715, Chi2Stat = 11.569967, PValue = 0.00067025601
4. Adding EmpStatus, Deviance = 1447.3451, Chi2Stat = 8.2264038, PValue = 0.0041285257
5. Adding CustAge, Deviance = 1441.994, Chi2Stat = 5.3511754, PValue = 0.020708306
6. Adding ResStatus, Deviance = 1437.8756, Chi2Stat = 4.118404, PValue = 0.042419078
7. Adding OtherCC, Deviance = 1433.707, Chi2Stat = 4.1686018, PValue = 0.041179769

Generalized linear regression model:
    status ~ [Linear formula with 8 terms in 7 predictors]
    Distribution = Binomial

Estimated Coefficients:
                   Estimate       SE       tStat       pValue  
                   ________    ________    ______    __________

    (Intercept)    0.70239     0.064001    10.975    5.0538e-28
    CustAge        0.60833      0.24932      2.44      0.014687
    ResStatus        1.377      0.65272    2.1097      0.034888
    EmpStatus      0.88565        0.293    3.0227     0.0025055
    CustIncome     0.70164      0.21844    3.2121     0.0013179
    TmWBank         1.1074      0.23271    4.7589    1.9464e-06
    OtherCC         1.0883      0.52912    2.0569      0.039696
    AMBalance        1.045      0.32214    3.2439     0.0011792


1200 observations, 1192 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 89.7, p-value = 1.4e-16

Display unscaled points for predictors retained in the fitting model and display the minimum and maximum possible unscaled scores.

[PointsInfo,MinScore,MaxScore] = displaypoints(sc)
PointsInfo=30x3 table
     Predictors           Bin           Points  
    ____________    _______________    _________

    'CustAge'       '[-Inf,33)'         -0.15894
    'CustAge'       '[33,37)'           -0.14036
    'CustAge'       '[37,40)'          -0.060323
    'CustAge'       '[40,46)'           0.046408
    'CustAge'       '[46,48)'            0.21445
    'CustAge'       '[48,58)'            0.23039
    'CustAge'       '[58,Inf]'             0.479
    'ResStatus'     'Tenant'           -0.031252
    'ResStatus'     'Home Owner'         0.12696
    'ResStatus'     'Other'              0.37641
    'EmpStatus'     'Unknown'          -0.076317
    'EmpStatus'     'Employed'           0.31449
    'CustIncome'    '[-Inf,29000)'      -0.45716
    'CustIncome'    '[29000,33000)'     -0.10466
    'CustIncome'    '[33000,35000)'     0.052329
    'CustIncome'    '[35000,40000)'     0.081611

MinScore = -1.3100
MaxScore = 3.0726

Scale points, and display the points information. By default, no rounding is applied.

sc = formatpoints(sc,'WorstAndBestScores',[300 850]);
PointsInfo = displaypoints(sc)
PointsInfo=30x3 table
     Predictors           Bin          Points
    ____________    _______________    ______

    'CustAge'       '[-Inf,33)'        46.396
    'CustAge'       '[33,37)'          48.727
    'CustAge'       '[37,40)'          58.772
    'CustAge'       '[40,46)'          72.167
    'CustAge'       '[46,48)'          93.256
    'CustAge'       '[48,58)'          95.256
    'CustAge'       '[58,Inf]'         126.46
    'ResStatus'     'Tenant'           62.421
    'ResStatus'     'Home Owner'       82.276
    'ResStatus'     'Other'            113.58
    'EmpStatus'     'Unknown'          56.765
    'EmpStatus'     'Employed'         105.81
    'CustIncome'    '[-Inf,29000)'     8.9706
    'CustIncome'    '[29000,33000)'    53.208
    'CustIncome'    '[33000,35000)'     72.91
    'CustIncome'    '[35000,40000)'    76.585

Use the name-value pair argument Round to apply rounding for all points and then display the points information again.

sc = formatpoints(sc,'Round','AllPoints');
PointsInfo = displaypoints(sc)
PointsInfo=30x3 table
     Predictors           Bin          Points
    ____________    _______________    ______

    'CustAge'       '[-Inf,33)'         46   
    'CustAge'       '[33,37)'           49   
    'CustAge'       '[37,40)'           59   
    'CustAge'       '[40,46)'           72   
    'CustAge'       '[46,48)'           93   
    'CustAge'       '[48,58)'           95   
    'CustAge'       '[58,Inf]'         126   
    'ResStatus'     'Tenant'            62   
    'ResStatus'     'Home Owner'        82   
    'ResStatus'     'Other'            114   
    'EmpStatus'     'Unknown'           57   
    'EmpStatus'     'Employed'         106   
    'CustIncome'    '[-Inf,29000)'       9   
    'CustIncome'    '[29000,33000)'     53   
    'CustIncome'    '[33000,35000)'     73   
    'CustIncome'    '[35000,40000)'     77   
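
The reason for rounding after scaling can be illustrated with the 'CustAge' points from the tables above. This is a minimal sketch that uses the built-in round function on values copied from the displayed output. It does not call formatpoints, and the variable names are illustrative.

```matlab
% 'CustAge' points copied from the scaled and unscaled tables above.
scaled   = [46.396 48.727 58.772 72.167 93.256 95.256 126.46];
unscaled = [-0.15894 -0.14036 -0.060323 0.046408 0.21445 0.23039 0.479];

% Rounding after scaling keeps every bin distinguishable.
roundedScaled = round(scaled);
numDistinctScaled = numel(unique(roundedScaled));

% Rounding the unscaled points collapses every bin to zero.
roundedUnscaled = round(unscaled);
allCollapsed = all(roundedUnscaled == 0);

fprintf('Distinct rounded scaled values: %d of %d\n', numDistinctScaled, numel(scaled));
fprintf('All unscaled points round to zero: %d\n', allCollapsed);
```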

This example shows how to use formatpoints to score missing or out-of-range data. When data is scored, some observations can be either missing (NaN, or undefined) or out of range. You will need to decide whether or not points are assigned to these cases. Use the name-value pair argument Missing to do so.

Create a creditscorecard object using the CreditCardData.mat file to load the data (using a dataset from Refaat 2011). Use the 'IDVar' argument in creditscorecard to indicate that 'CustID' contains ID information and should not be included as a predictor variable.

load CreditCardData 
sc = creditscorecard(data,'IDVar','CustID');

Perform automatic binning for all predictors.

sc = autobinning(sc);

Fit a logistic regression model using default parameters.

sc = fitmodel(sc);
1. Adding CustIncome, Deviance = 1490.8527, Chi2Stat = 32.588614, PValue = 1.1387992e-08
2. Adding TmWBank, Deviance = 1467.1415, Chi2Stat = 23.711203, PValue = 1.1192909e-06
3. Adding AMBalance, Deviance = 1455.5715, Chi2Stat = 11.569967, PValue = 0.00067025601
4. Adding EmpStatus, Deviance = 1447.3451, Chi2Stat = 8.2264038, PValue = 0.0041285257
5. Adding CustAge, Deviance = 1441.994, Chi2Stat = 5.3511754, PValue = 0.020708306
6. Adding ResStatus, Deviance = 1437.8756, Chi2Stat = 4.118404, PValue = 0.042419078
7. Adding OtherCC, Deviance = 1433.707, Chi2Stat = 4.1686018, PValue = 0.041179769

Generalized linear regression model:
    status ~ [Linear formula with 8 terms in 7 predictors]
    Distribution = Binomial

Estimated Coefficients:
                   Estimate       SE       tStat       pValue  
                   ________    ________    ______    __________

    (Intercept)    0.70239     0.064001    10.975    5.0538e-28
    CustAge        0.60833      0.24932      2.44      0.014687
    ResStatus        1.377      0.65272    2.1097      0.034888
    EmpStatus      0.88565        0.293    3.0227     0.0025055
    CustIncome     0.70164      0.21844    3.2121     0.0013179
    TmWBank         1.1074      0.23271    4.7589    1.9464e-06
    OtherCC         1.0883      0.52912    2.0569      0.039696
    AMBalance        1.045      0.32214    3.2439     0.0011792


1200 observations, 1192 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 89.7, p-value = 1.4e-16

Suppose that the data you want to score contains missing observations. By default, the points and the score assigned to an observation with a missing value are NaN.

newdata = data(1:10,:);
newdata.CustAge(1) = NaN;
[Scores,Points] = score(sc,newdata)
Scores = 

       NaN
    1.4646
    0.7662
    1.5779
    1.4535
    1.8944
   -0.0872
    0.9207
    1.0399
    0.8252

Points=10x7 table
    CustAge     ResStatus    EmpStatus    CustIncome     TmWBank     OtherCC     AMBalance
    ________    _________    _________    __________    _________    ________    _________

         NaN    -0.031252    -0.076317     0.43693        0.39607     0.15842    -0.017472
       0.479      0.12696      0.31449     0.43693      -0.033752     0.15842    -0.017472
     0.21445    -0.031252      0.31449    0.081611        0.39607    -0.19168    -0.017472
     0.23039      0.12696      0.31449     0.43693      -0.044811     0.15842      0.35551
       0.479      0.12696      0.31449     0.43693      -0.044811     0.15842    -0.017472
       0.479      0.12696      0.31449     0.43693        0.39607     0.15842    -0.017472
    -0.14036      0.12696    -0.076317    -0.10466      -0.033752     0.15842    -0.017472
     0.23039      0.37641      0.31449     0.43693      -0.033752    -0.19168     -0.21206
     0.23039    -0.031252    -0.076317     0.43693      -0.033752     0.15842      0.35551
     0.23039      0.12696    -0.076317     0.43693      -0.033752     0.15842    -0.017472

Use the name-value pair argument Missing to replace NaN with the points that correspond to a zero Weight of Evidence (WOE).

sc = formatpoints(sc,'Missing','ZeroWOE');
[Scores,Points] = score(sc,newdata)
Scores = 

    0.9667
    1.4646
    0.7662
    1.5779
    1.4535
    1.8944
   -0.0872
    0.9207
    1.0399
    0.8252

Points=10x7 table
    CustAge     ResStatus    EmpStatus    CustIncome     TmWBank     OtherCC     AMBalance
    ________    _________    _________    __________    _________    ________    _________

     0.10034    -0.031252    -0.076317     0.43693        0.39607     0.15842    -0.017472
       0.479      0.12696      0.31449     0.43693      -0.033752     0.15842    -0.017472
     0.21445    -0.031252      0.31449    0.081611        0.39607    -0.19168    -0.017472
     0.23039      0.12696      0.31449     0.43693      -0.044811     0.15842      0.35551
       0.479      0.12696      0.31449     0.43693      -0.044811     0.15842    -0.017472
       0.479      0.12696      0.31449     0.43693        0.39607     0.15842    -0.017472
    -0.14036      0.12696    -0.076317    -0.10466      -0.033752     0.15842    -0.017472
     0.23039      0.37641      0.31449     0.43693      -0.033752    -0.19168     -0.21206
     0.23039    -0.031252    -0.076317     0.43693      -0.033752     0.15842      0.35551
     0.23039      0.12696    -0.076317     0.43693      -0.033752     0.15842    -0.017472
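
The first row of the output above can be checked by hand. In this sketch, the 0.10034 assigned to the missing 'CustAge' entry is read as the intercept divided by the number of predictors, which is the contribution that remains when the predictor's weight-of-evidence term is zero. This reading is an inference from the displayed numbers, and the variable names are illustrative.

```matlab
% Intercept from the fitmodel output and number of retained predictors.
intercept = 0.70239;
numPredictors = 7;

% Assumed points for a missing value with a zero weight-of-evidence term.
assignedPoints = intercept / numPredictors;

% Remaining points for the first observation, copied from the table above.
rowPoints = [assignedPoints -0.031252 -0.076317 0.43693 0.39607 0.15842 -0.017472];

% The score is the sum of the points across predictors.
rowScore = sum(rowPoints);

fprintf('Assigned CustAge points = %.5f\n', assignedPoints);
fprintf('Score for first observation = %.4f\n', rowScore);
```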

Use the name-value pair argument Missing to replace the missing value with the minimum points for the predictor that has the missing values, 'CustAge'.

sc = formatpoints(sc,'Missing','MinPoints');
[Scores,Points] = score(sc,newdata)
Scores = 

    0.7074
    1.4646
    0.7662
    1.5779
    1.4535
    1.8944
   -0.0872
    0.9207
    1.0399
    0.8252

Points=10x7 table
    CustAge     ResStatus    EmpStatus    CustIncome     TmWBank     OtherCC     AMBalance
    ________    _________    _________    __________    _________    ________    _________

    -0.15894    -0.031252    -0.076317     0.43693        0.39607     0.15842    -0.017472
       0.479      0.12696      0.31449     0.43693      -0.033752     0.15842    -0.017472
     0.21445    -0.031252      0.31449    0.081611        0.39607    -0.19168    -0.017472
     0.23039      0.12696      0.31449     0.43693      -0.044811     0.15842      0.35551
       0.479      0.12696      0.31449     0.43693      -0.044811     0.15842    -0.017472
       0.479      0.12696      0.31449     0.43693        0.39607     0.15842    -0.017472
    -0.14036      0.12696    -0.076317    -0.10466      -0.033752     0.15842    -0.017472
     0.23039      0.37641      0.31449     0.43693      -0.033752    -0.19168     -0.21206
     0.23039    -0.031252    -0.076317     0.43693      -0.033752     0.15842      0.35551
     0.23039      0.12696    -0.076317     0.43693      -0.033752     0.15842    -0.017472

Use the name-value pair argument Missing to replace the missing value with the maximum points for the predictor that has the missing values, 'CustAge'.

sc = formatpoints(sc,'Missing','MaxPoints');
[Scores,Points] = score(sc,newdata)
Scores = 

    1.3454
    1.4646
    0.7662
    1.5779
    1.4535
    1.8944
   -0.0872
    0.9207
    1.0399
    0.8252

Points=10x7 table
    CustAge     ResStatus    EmpStatus    CustIncome     TmWBank     OtherCC     AMBalance
    ________    _________    _________    __________    _________    ________    _________

       0.479    -0.031252    -0.076317     0.43693        0.39607     0.15842    -0.017472
       0.479      0.12696      0.31449     0.43693      -0.033752     0.15842    -0.017472
     0.21445    -0.031252      0.31449    0.081611        0.39607    -0.19168    -0.017472
     0.23039      0.12696      0.31449     0.43693      -0.044811     0.15842      0.35551
       0.479      0.12696      0.31449     0.43693      -0.044811     0.15842    -0.017472
       0.479      0.12696      0.31449     0.43693        0.39607     0.15842    -0.017472
    -0.14036      0.12696    -0.076317    -0.10466      -0.033752     0.15842    -0.017472
     0.23039      0.37641      0.31449     0.43693      -0.033752    -0.19168     -0.21206
     0.23039    -0.031252    -0.076317     0.43693      -0.033752     0.15842      0.35551
     0.23039      0.12696    -0.076317     0.43693      -0.033752     0.15842    -0.017472

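Verify that the minimum points assigned to the missing data correspond to the minimum points for 'CustAge'. The first five rows of the points information table list 'CustAge' bins. These five rows do not cover every 'CustAge' bin: the Points tables above also contain the 'CustAge' values 0.23039 and 0.479, so the maximum over rows 1:5 (0.2145) is lower than the 0.479 assigned by 'MaxPoints'.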

PointsInfo = displaypoints(sc);
PointsInfo(1:5,:)
ans=5x3 table
    Predictors        Bin         Points  
    __________    ___________    _________

    'CustAge'     '[-Inf,33)'     -0.15894
    'CustAge'     '[33,37)'       -0.14036
    'CustAge'     '[37,40)'      -0.060323
    'CustAge'     '[40,46)'       0.046408
    'CustAge'     '[46,48)'        0.21445

min(PointsInfo.Points(1:5))
ans = -0.1589
max(PointsInfo.Points(1:5))
ans = 0.2145

Input Arguments

Credit scorecard model (sc), specified as a creditscorecard object. Use creditscorecard to create a creditscorecard object.

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: sc = formatpoints(sc,'BasePoints',true,'Round','AllPoints','WorstAndBestScores',[100, 700])

Note

ShiftAndSlope, PointsOddsAndPDO, and WorstAndBestScores are scaling methods and you can use only one of these name-value pair arguments at one time. The other three name-value pair arguments (BasePoints, Missing, and Round) are not scaling methods and can be used together or with any one of the three scaling methods.

Indicator for separating base points ('BasePoints'), specified as a logical scalar. If true, the scorecard explicitly separates base points. If false, the base points are spread across all variables in the creditscorecard object.

Data Types: logical

Indicator for points assigned to missing or out-of-range information when scoring ('Missing'), specified as a character vector with a value of NoScore, ZeroWOE, MinPoints, or MaxPoints, where:

  • NoScore — Missing and out-of-range data do not get points assigned and points are set to NaN. Also, the total score is set to NaN.

  • ZeroWOE — Missing or out-of-range data get assigned a zero Weight-of-Evidence (WOE) value.

  • MinPoints — Missing or out-of-range data get the minimum possible points for that predictor. This penalizes the score if higher scores are better.

  • MaxPoints — Missing or out-of-range data get the maximum possible points for that predictor. This penalizes the score if lower scores are better.

Data Types: char

Indicator for whether to round points or scores ('Round'), specified as a character vector with a value of 'AllPoints', 'FinalScore', or 'None', where:

  • None — No rounding is applied.

  • AllPoints — Apply rounding to each predictor's points before adding up the total score.

  • FinalScore — Round the final score only (rounding is applied after all points are added up).

Data Types: char

Indicator for shift and slope scaling parameters for the credit scorecard ('ShiftAndSlope'), specified as a numeric array with two elements [Shift,Slope]. Slope cannot be zero. The ShiftAndSlope values are used to scale the scoring model.

Note

ShiftAndSlope, PointsOddsAndPDO, and WorstAndBestScores are scaling methods and you can use only one of these name-value pair arguments at one time. The other three name-value pair arguments (BasePoints, Missing, and Round) are not scaling methods and can be used together or with any one of the three scaling methods.

To remove a previous scaling and revert to unscaled scores, set ShiftAndSlope to [0,1].

Data Types: double

Indicator for target points (Points) for a given odds level (Odds) and the desired number of points to double the odds (PDO) ('PointsOddsAndPDO'), specified as a numeric array with three elements [Points,Odds,PDO]. Odds must be a positive number. The PointsOddsAndPDO values are used to find scaling parameters for the scoring model.

Note

The points to double the odds (PDO) may be positive or negative, depending on whether higher scores mean lower risk, or vice versa.

ShiftAndSlope, PointsOddsAndPDO, and WorstAndBestScores are scaling methods and you can use only one of these name-value pair arguments at one time. The other three name-value pair arguments (BasePoints, Missing, and Round) are not scaling methods and can be used together or with any one of the three scaling methods.

To remove a previous scaling and revert to unscaled scores, set ShiftAndSlope to [0,1].

Data Types: double

Indicator for worst (highest risk) and best (lowest risk) scores in the scorecard ('WorstAndBestScores'), specified as a numeric array with two elements [WorstScore,BestScore]. WorstScore and BestScore must be different values. The WorstAndBestScores values are used to find scaling parameters for the scoring model.

Note

WorstScore means the riskiest score, and its value could be lower or higher than the 'best' score. In other words, the 'minimum' score may be the 'worst' score or the 'best' score, depending on the desired scoring scale.

ShiftAndSlope, PointsOddsAndPDO, and WorstAndBestScores are scaling methods and you can use only one of these name-value pair arguments at one time. The other three name-value pair arguments (BasePoints, Missing, and Round) are not scaling methods and can be used together or with any one of the three scaling methods.

To remove a previous scaling and revert to unscaled scores, set ShiftAndSlope to [0,1].

Data Types: double

Output Arguments

Credit scorecard model (sc), returned as an updated creditscorecard object. For more information on using the creditscorecard object, see creditscorecard.

Algorithms

The score of an individual i is given by the formula

Score(i) = Shift + Slope*(b0 + b1*WOE1(i) + b2*WOE2(i)+ ... +bp*WOEp(i))

where bj is the coefficient of the j-th variable in the model, and WOEj(i) is the Weight of Evidence (WOE) value for the i-th individual corresponding to the j-th model variable. Shift and Slope are scaling constants, discussed further below. The scaling constants can be controlled with formatpoints.

If the data for individual i is in the i-th row of a given dataset, to compute a score, the data(i,j) is binned using existing binning maps, and converted into a corresponding Weight of Evidence value WOEj(i). Using the model coefficients, the unscaled score is computed as

 s = b0 + b1*WOE1(i) + ... +bp*WOEp(i).

For simplicity, assume in the description above that the j-th variable in the model is the j-th column in the data input, although, in general, the order of variables in a given dataset does not have to match the order of variables in the model, and the dataset could have additional variables that are not used in the model.
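As a minimal numeric sketch of the score formula above, the following base MATLAB code computes the unscaled and scaled score for one individual. The intercept, coefficients, WOE values, Shift, and Slope are all hypothetical values chosen for illustration; they do not come from the fitted model in the examples.

```matlab
% Hypothetical model: intercept b0 and coefficients for p = 3 predictors.
b0 = 0.7;
b  = [0.6 1.4 0.9];

% Hypothetical WOE values for one individual, obtained after binning.
woe = [-0.2 0.5 0.1];

% Unscaled score: s = b0 + b1*WOE1 + ... + bp*WOEp
s = b0 + sum(b .* woe);

% Scaled score with hypothetical scaling constants.
Shift = 500;
Slope = 20;
Score = Shift + Slope*s;

fprintf('Unscaled score: %.4f, scaled score: %.2f\n', s, Score);
```

With Shift = 0 and Slope = 1, the scaled score equals the unscaled score, which is why setting ShiftAndSlope to [0,1] removes a previous scaling.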

The formatting options can be controlled using formatpoints. When the base points are reported separately (see the formatpoints parameter BasePoints), the base points are given by

Base Points = Shift + Slope*b0,

and the points for the j-th predictor, i-th row are given by

Points_ji = Slope*(bj*WOEj(i)).

By default, the base points are not reported separately, in which case

Points_ji = (Shift + Slope*b0)/p + Slope*(bj*WOEj(i)),
where p is the number of predictors in the scorecard model.
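The two point layouts can be compared numerically. The following sketch, which uses hypothetical coefficients, WOE values, and scaling constants, computes the points both ways and checks that each layout adds up to the same total score.

```matlab
% Hypothetical model and scaling constants (illustration only).
b0 = 0.7;
b  = [0.6 1.4 0.9];
woe = [-0.2 0.5 0.1];      % hypothetical WOE values for one individual
Shift = 500;
Slope = 20;
p = numel(b);              % number of predictors

% Total score from the score formula.
Score = Shift + Slope*(b0 + sum(b .* woe));

% Base points reported separately (BasePoints is true).
basePoints     = Shift + Slope*b0;
pointsSeparate = Slope*(b .* woe);

% Base points spread across predictors (BasePoints is false, the default).
pointsSpread = (Shift + Slope*b0)/p + Slope*(b .* woe);

% Both layouts reproduce the same total score.
scoreSeparate = basePoints + sum(pointsSeparate);
scoreSpread   = sum(pointsSpread);
fprintf('Separate: %.2f, spread: %.2f, formula: %.2f\n', ...
    scoreSeparate, scoreSpread, Score);
```

The choice affects only how the total is split across rows of the points table, not the score itself.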

By default, no rounding is applied to the points by the score function (Round is None). If Round is set to AllPoints using formatpoints, then the points for individual i for variable j are given by

 points if rounding is 'AllPoints': round( Points_ji )
and, if base points are reported separately, they are also rounded. This yields integer-valued points per predictor, hence also integer-valued scores. If Round is set to FinalScore using formatpoints, then the points per predictor are not rounded, and only the final score is rounded:
 score if rounding is 'FinalScore': round(Score(i)).
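The two rounding modes can produce different scores for the same individual. The following sketch uses hypothetical per-predictor points to show the difference.

```matlab
% Hypothetical unrounded points for one individual, one entry per predictor.
points = [10.4 20.4 30.4];

% Round is 'AllPoints': round each predictor's points, then add them up.
scoreAllPoints = sum(round(points));

% Round is 'FinalScore': add up the unrounded points, then round the total.
scoreFinalScore = round(sum(points));

fprintf('AllPoints: %d, FinalScore: %d\n', scoreAllPoints, scoreFinalScore);
```

Here the per-predictor rounding errors accumulate under 'AllPoints', so the two totals differ by one point.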

Regarding the scaling parameters, the Shift and Slope parameters can be set directly with the ShiftAndSlope parameter of formatpoints. Alternatively, you can use the formatpoints parameter WorstAndBestScores. In this case, the parameters Shift and Slope are found internally by solving the system

Shift + Slope*smin = WorstScore,
Shift + Slope*smax = BestScore,
where WorstScore and BestScore are the first and second elements in the formatpoints parameter for WorstAndBestScores and smin and smax are the minimum and maximum possible unscaled scores:
smin = b0 + min(b1*WOE1) + ... +min(bp*WOEp),
smax = b0 + max(b1*WOE1) + ... +max(bp*WOEp).
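The system above is a two-by-two linear system in Shift and Slope. The following sketch solves it for hypothetical per-bin contributions bj*WOEj and hypothetical target scores. It illustrates the equations only and is not necessarily how formatpoints computes the values internally.

```matlab
% Hypothetical intercept and per-bin contributions bj*WOEj for p = 3 predictors.
b0 = 0.7;
contrib = {0.6*[-0.5 0.1 0.4], 1.4*[-0.3 0.2], 0.9*[-0.2 0 0.6]};

% Minimum and maximum possible unscaled scores.
smin = b0 + sum(cellfun(@min, contrib));
smax = b0 + sum(cellfun(@max, contrib));

% Hypothetical target scores [WorstScore, BestScore].
WorstScore = 100;
BestScore  = 700;

% Solve  Shift + Slope*smin = WorstScore,  Shift + Slope*smax = BestScore.
A   = [1 smin; 1 smax];
sol = A \ [WorstScore; BestScore];
Shift = sol(1);
Slope = sol(2);

fprintf('smin = %.2f, smax = %.2f, Shift = %.3f, Slope = %.3f\n', ...
    smin, smax, Shift, Slope);
```

The system has a unique solution only when smin and smax differ, and Slope is nonzero only when WorstScore and BestScore differ, which is consistent with the requirement that the two values be different.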

A third alternative to scale scores is the PointsOddsAndPDO parameter in formatpoints. In this case, assume that the unscaled score s gives the log-odds for a row, and the Shift and Slope parameters are found by solving the following system

Points = Shift + Slope*log(Odds)
Points + PDO = Shift + Slope*log(2*Odds)
where Points, Odds, and PDO ("points to double the odds") are the first, second, and third elements in the PointsOddsAndPDO parameter.
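Subtracting the first equation from the second gives PDO = Slope*log(2), so both parameters have closed forms. The following sketch evaluates them for a hypothetical PointsOddsAndPDO value and checks that doubling the odds adds PDO points.

```matlab
% Hypothetical PointsOddsAndPDO value: 500 points at odds of 2,
% with 50 points to double the odds.
Points = 500;
Odds   = 2;
PDO    = 50;

% Closed-form solution of the two-equation system.
Slope = PDO / log(2);
Shift = Points - Slope*log(Odds);

% Check: a row with odds 2*Odds should score Points + PDO.
scoreAtDoubleOdds = Shift + Slope*log(2*Odds);

fprintf('Shift = %.4f, Slope = %.4f, score at doubled odds = %.2f\n', ...
    Shift, Slope, scoreAtDoubleOdds);
```

A negative PDO yields a negative Slope, which corresponds to a scale where higher scores mean higher risk.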

Whenever a given dataset has a missing or out-of-range value data(i,j), the points for predictor j, for individual i, are set to NaN by default, which results in a missing score for that row (a NaN score). Using the Missing parameter for formatpoints, you can modify this behavior and set the corresponding Weight-of-Evidence (WOE) value to zero, or set the points to the minimum points or the maximum points for that predictor.
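The effect of each Missing option on a single predictor can be sketched with the default points formula (base points not reported separately). All numbers below are hypothetical, and the ZeroWOE line is simply the points formula evaluated at a WOE of zero.

```matlab
% Hypothetical model pieces for predictor j (illustration only).
b0 = 0.7;  bj = 0.6;  p = 3;
Shift = 0; Slope = 1;
woeBins = [-0.5 0.1 0.4];          % hypothetical WOE value of each bin

% Points of each bin when base points are spread across predictors.
binPoints = (Shift + Slope*b0)/p + Slope*bj*woeBins;

% Points assigned to a missing or out-of-range value under each option.
ptsNoScore   = NaN;                               % default
ptsZeroWOE   = (Shift + Slope*b0)/p + Slope*bj*0; % WOE set to zero
ptsMinPoints = min(binPoints);
ptsMaxPoints = max(binPoints);

% Hypothetical total points contributed by the other predictors.
otherPoints = 0.9;

% A NaN in any predictor's points makes the total score NaN.
scoreNoScore   = otherPoints + ptsNoScore;
scoreMinPoints = otherPoints + ptsMinPoints;
scoreMaxPoints = otherPoints + ptsMaxPoints;

fprintf('ZeroWOE: %.4f, MinPoints: %.4f, MaxPoints: %.4f\n', ...
    ptsZeroWOE, ptsMinPoints, ptsMaxPoints);
```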

References

[1] Anderson, R. The Credit Scoring Toolkit. Oxford University Press, 2007.

[2] Refaat, M. Credit Risk Scorecards: Development and Implementation Using SAS. lulu.com, 2011.

Introduced in R2014b
