
Optimize a Boosted Regression Ensemble

This example shows how to optimize hyperparameters of a boosted regression ensemble. The optimization minimizes the cross-validation loss of the model.

The problem is to model the efficiency in miles per gallon of an automobile, based on its acceleration, engine displacement, horsepower, and weight. Load the carsmall data, which contains these and other predictors.

load carsmall
X = [Acceleration Displacement Horsepower Weight];
Y = MPG;

Fit a regression ensemble to the data using the LSBoost algorithm with surrogate splits. Optimize the resulting model by varying the number of learning cycles, the maximum number of splits per tree, and the learning rate. Also allow the optimization to repartition the cross-validation at every iteration.

For reproducibility, set the random seed with rng default and use the 'expected-improvement-plus' acquisition function.

rng default
Mdl = fitrensemble(X,Y,...
    'Method','LSBoost',...
    'Learners',templateTree('Surrogate','on'),...
    'OptimizeHyperparameters',{'NumLearningCycles','MaxNumSplits','LearnRate'},...
    'HyperparameterOptimizationOptions',struct('Repartition',true,...
    'AcquisitionFunctionName','expected-improvement-plus'))

|====================================================================================================================|
| Iter | Eval   | Objective   | Objective   | BestSoFar   | BestSoFar   | NumLearningC-|    LearnRate | MaxNumSplits |
|      | result |             | runtime     | (observed)  | (estim.)    | ycles        |              |              |
|====================================================================================================================|
|    1 | Best   |      3.5411 |      9.1597 |      3.5411 |      3.5411 |          383 |      0.51519 |            4 |
|    2 | Best   |      3.4755 |     0.41959 |      3.4755 |       3.479 |           16 |      0.66503 |            7 |
|    3 | Best   |      3.1893 |       1.014 |      3.1893 |      3.1893 |           33 |       0.2556 |           92 |
|    4 | Accept |      6.3077 |     0.38674 |      3.1893 |      3.1898 |           13 |    0.0053227 |            5 |
|    5 | Accept |      3.4482 |      7.0547 |      3.1893 |      3.1897 |          302 |      0.50394 |           99 |
|    6 | Accept |      4.2638 |     0.24779 |      3.1893 |      3.1897 |           10 |      0.11317 |           93 |
|    7 | Accept |      3.2449 |     0.25446 |      3.1893 |      3.1898 |           10 |      0.34912 |           93 |
|    8 | Accept |      3.4495 |     0.47793 |      3.1893 |        3.19 |           14 |      0.99651 |           98 |
|    9 | Accept |      5.8544 |      6.8305 |      3.1893 |      3.1904 |          308 |    0.0010002 |            2 |
|   10 | Accept |      3.1985 |     0.30766 |      3.1893 |      3.1876 |           10 |      0.27825 |           96 |
|   11 | Accept |      3.3339 |      10.911 |      3.1893 |      3.1886 |          447 |      0.28212 |           97 |
|   12 | Best   |      2.9764 |      0.3165 |      2.9764 |      3.1412 |           11 |      0.26217 |           98 |
|   13 | Accept |      3.1958 |     0.34108 |      2.9764 |      3.1537 |           10 |      0.26754 |           12 |
|   14 | Accept |      3.2951 |      11.913 |      2.9764 |      3.1458 |          487 |     0.022491 |           57 |
|   15 | Accept |      5.8041 |     0.30311 |      2.9764 |      3.1653 |           10 |     0.032877 |           11 |
|   16 | Accept |      3.4128 |      12.332 |      2.9764 |      3.1677 |          500 |     0.065337 |           19 |
|   17 | Accept |      3.2357 |     0.60773 |      2.9764 |      3.1653 |           24 |      0.30654 |            7 |
|   18 | Accept |      3.2848 |      3.1287 |      2.9764 |      3.1605 |          129 |      0.22496 |           91 |
|   19 | Accept |      3.1073 |     0.46267 |      2.9764 |      3.1408 |           16 |      0.29063 |           97 |
|   20 | Accept |       6.422 |     0.26879 |      2.9764 |      3.1415 |           10 |    0.0010038 |           76 |
|====================================================================================================================|
| Iter | Eval   | Objective   | Objective   | BestSoFar   | BestSoFar   | NumLearningC-|    LearnRate | MaxNumSplits |
|      | result |             | runtime     | (observed)  | (estim.)    | ycles        |              |              |
|====================================================================================================================|
|   21 | Accept |      3.2146 |     0.57793 |      2.9764 |       3.157 |           18 |      0.27208 |           96 |
|   22 | Accept |      3.0515 |     0.32591 |      2.9764 |      3.1365 |           10 |      0.29884 |           66 |
|   23 | Accept |      3.3721 |      12.022 |      2.9764 |      3.1357 |          500 |    0.0042631 |           84 |
|   24 | Accept |      3.1053 |      12.322 |      2.9764 |       3.136 |          499 |    0.0093964 |           13 |
|   25 | Accept |      3.1303 |      12.109 |      2.9764 |      3.1357 |          499 |    0.0092601 |           73 |
|   26 | Accept |      3.1956 |      11.603 |      2.9764 |      3.1354 |          500 |    0.0074991 |            6 |
|   27 | Accept |      3.2926 |      11.796 |      2.9764 |      3.1366 |          500 |     0.011141 |           69 |
|   28 | Accept |      4.4567 |      1.3716 |      2.9764 |      3.1372 |           74 |     0.015189 |            1 |
|   29 | Accept |      3.4466 |      4.0423 |      2.9764 |      3.1383 |          186 |      0.99992 |            4 |
|   30 | Accept |      6.1348 |      1.6247 |      2.9764 |       3.137 |           68 |    0.0023006 |           12 |

__________________________________________________________
Optimization completed.
MaxObjectiveEvaluations of 30 reached.
Total function evaluations: 30
Total elapsed time: 176.43 seconds.
Total objective function evaluation time: 134.532

Best observed feasible point:
    NumLearningCycles    LearnRate    MaxNumSplits
    _________________    _________    ____________

           11             0.26217          98     

Observed objective function value = 2.9764
Estimated objective function value = 3.137
Function evaluation time = 0.3165

Best estimated feasible point (according to models):
    NumLearningCycles    LearnRate    MaxNumSplits
    _________________    _________    ____________

           10             0.29884          66     

Estimated objective function value = 3.137
Estimated function evaluation time = 0.29142
Mdl = 
  classreg.learning.regr.RegressionEnsemble
                         ResponseName: 'Y'
                CategoricalPredictors: []
                    ResponseTransform: 'none'
                      NumObservations: 94
    HyperparameterOptimizationResults: [1×1 BayesianOptimization]
                           NumTrained: 10
                               Method: 'LSBoost'
                         LearnerNames: {'Tree'}
                 ReasonForTermination: 'Terminated normally after completing the requested number of training cycles.'
                              FitInfo: [10×1 double]
                   FitInfoDescription: {2×1 cell}
                       Regularization: []


  Properties, Methods
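The HyperparameterOptimizationResults property is a BayesianOptimization object, which you can query directly. As a sketch, assuming the bestPoint method with its 'Criterion' name-value pair, you could retrieve the best estimated feasible point programmatically:

```matlab
% Query the stored Bayesian optimization results.
results = Mdl.HyperparameterOptimizationResults;

% 'min-visited-mean' selects the visited point with the lowest
% model-estimated objective (the "best estimated feasible point").
[bestParams,minEstLoss] = bestPoint(results,'Criterion','min-visited-mean')
```

bestParams is a table of hyperparameter values and minEstLoss is the corresponding estimated objective value.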

Compare the loss to that of a boosted, unoptimized model, and to that of the default ensemble.

loss = kfoldLoss(crossval(Mdl,'kfold',10))
loss = 23.3445
Mdl2 = fitrensemble(X,Y,...
    'Method','LSBoost',...
    'Learners',templateTree('Surrogate','on'));
loss2 = kfoldLoss(crossval(Mdl2,'kfold',10))
loss2 = 37.0534
Mdl3 = fitrensemble(X,Y);
loss3 = kfoldLoss(crossval(Mdl3,'kfold',10))
loss3 = 38.4890
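If you want a model trained at fixed hyperparameters rather than one carrying the optimization results, you can retrain at the best estimated feasible point reported above. A minimal sketch, hard-coding the values from this run (10 learning cycles, learning rate 0.29884, 66 maximum splits; your values will differ on another run):

```matlab
% Retrain at the best estimated feasible point from the optimization run.
MdlFinal = fitrensemble(X,Y, ...
    'Method','LSBoost', ...
    'NumLearningCycles',10, ...
    'LearnRate',0.29884, ...
    'Learners',templateTree('Surrogate','on','MaxNumSplits',66));
```

Because the cross-validation partition is random, the cross-validation loss of MdlFinal will vary slightly from the reported optimized loss.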

For a different way of optimizing this ensemble, see Find the Optimal Number of Splits and Trees for an Ensemble.