# regularize

Class: RegressionEnsemble

Find weights to minimize resubstitution error plus penalty term

## Syntax

ens1 = regularize(ens)
ens1 = regularize(ens,Name,Value)

## Description

ens1 = regularize(ens) finds optimal weights for learners in ens by lasso regularization. regularize returns a regression ensemble identical to ens, but with a populated Regularization property.

ens1 = regularize(ens,Name,Value) computes optimal weights with additional options specified by one or more Name,Value pair arguments. You can specify several name-value pair arguments in any order as Name1,Value1,…,NameN,ValueN.

## Input Arguments

ens: A regression ensemble, created by fitensemble.

### Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

• 'lambda': Vector of nonnegative regularization parameter values for lasso. For the default setting of lambda, regularize calculates the smallest value lambda_max for which all optimal weights for learners are 0. The default value of lambda is a vector including 0 and nine exponentially-spaced numbers from lambda_max/1000 to lambda_max. Default: [0 logspace(log10(lambda_max/1000),log10(lambda_max),9)]

• 'npass': Maximal number of passes for lasso optimization, a positive integer. Default: 10

• 'reltol': Relative tolerance on the regularized loss for lasso, a numeric positive scalar. Default: 1e-3

• 'verbose': Verbosity level, either 0 or 1. When set to 1, regularize displays more information as it runs. Default: 0
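The default grid described for 'lambda' can be reproduced by hand once lambda_max is known. The following is a minimal sketch only: regularize computes lambda_max internally from the ensemble, so the value 0.5 used here is a placeholder assumption, and the variable ratio is introduced purely for illustration.

```matlab
% Placeholder assumption: regularize derives lambda_max from the ensemble;
% 0.5 is used here only to show the shape of the default grid.
lambda_max = 0.5;

% Zero (no penalty) followed by nine exponentially spaced values
% from lambda_max/1000 up to lambda_max.
lambda = [0 logspace(log10(lambda_max/1000), log10(lambda_max), 9)];

% Consecutive nonzero values differ by the constant factor 1000^(1/8).
ratio = lambda(3:end) ./ lambda(2:end-1);

disp(lambda)
```

The resulting vector has ten entries. Passing a shorter vector of your own through 'lambda' restricts the fit to those values only.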

## Output Arguments

ens1: A regression ensemble. Usually you set ens1 to the same name as ens.

## Definitions

### Lasso

The lasso algorithm finds an optimal set of learner weights $\alpha_t$ that minimizes

$\sum _{n=1}^{N}{w}_{n}g\left(\left(\sum _{t=1}^{T}{\alpha }_{t}{h}_{t}\left({x}_{n}\right)\right),{y}_{n}\right)+\lambda \sum _{t=1}^{T}|{\alpha }_{t}|.$

Here

• λ ≥ 0 is a parameter you provide, called the lasso parameter.

• $h_t$ is a weak learner in the ensemble trained on N observations with predictors $x_n$, responses $y_n$, and weights $w_n$.

• $g(f,y) = (f - y)^2$ is the squared error.
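To make the objective concrete, the sketch below evaluates it for one candidate weight vector. It is an illustration under stated assumptions, not the optimizer that regularize uses: the matrix H of learner predictions, the responses y, the observation weights w, and the candidate weights alpha are all hypothetical toy values.

```matlab
% Hypothetical toy data: N = 4 observations, T = 3 weak learners.
% H(n,t) holds h_t(x_n), the prediction of learner t for observation n.
H = [1.0 0.8  1.2;
     0.0 0.2 -0.1;
     2.0 1.9  2.2;
     1.0 1.1  0.7];
y = [1; 0; 2; 1];        % responses y_n
w = ones(4,1) / 4;       % observation weights w_n
alpha = [0.5; 0.5; 0];   % candidate learner weights alpha_t
lambda = 0.1;            % lasso parameter

f = H * alpha;                        % weighted ensemble prediction per observation
fitTerm = sum(w .* (f - y).^2);       % weighted squared error
penalty = lambda * sum(abs(alpha));   % L1 penalty on the learner weights
objective = fitTerm + penalty;

fprintf('fit = %.5f, penalty = %.5f, objective = %.5f\n', ...
    fitTerm, penalty, objective);
```

Because the penalty grows with the absolute size of each weight, larger values of lambda favor solutions in which more entries of alpha are exactly zero, which is what allows the ensemble to be shrunk afterward.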

## Examples

Regularize an ensemble of bagged trees:

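```
% Simulate 2000 observations of 20 uniform predictors
X = rand(2000,20);
% Response is -1 unless the first five predictors sum to more than 2.5
Y = repmat(-1,2000,1);
Y(sum(X(:,1:5),2)>2.5) = 1;
% Bag 300 regression trees
bag = fitensemble(X,Y,'Bag',300,'Tree',...
    'type','regression');
% Compute lasso weights for two lambda values, with progress output
bag = regularize(bag,'lambda',[0.001 0.1],'verbose',1);
```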

regularize reports on its progress.

To see the resulting regularization structure:

```
bag.Regularization

ans =
    Method: 'Lasso'
    TrainedWeights: [300x2 double]
    Lambda: [1.0000e-003 0.1000]
    ResubstitutionMSE: [0.0616 0.0812]
    CombineWeights: @classreg.learning.combiner.WeightedSum
```

See how many learners in the regularized ensemble have positive weights (so would be included in a shrunken ensemble):

```
sum(bag.Regularization.TrainedWeights > 0)

ans =
   116    91
```

To shrink the ensemble using the weights from Lambda = 0.1:

```
cmp = shrink(bag,'weightcolumn',2)

cmp =

classreg.learning.regr.CompactRegressionEnsemble:
    PredictorNames: {1x20 cell}
    CategoricalPredictors: []
    ResponseName: 'Y'
    ResponseTransform: 'none'
    NumTrained: 91
```

The shrunken ensemble cmp has 91 members, fewer than one third of the original 300.