# trainbr

Bayesian regularization backpropagation

## Description

`net.trainFcn = 'trainbr'` sets the network `trainFcn` property.

`[trainedNet,tr] = train(net,...)` trains the network with `trainbr`.

`trainbr` is a network training function that updates the weight and bias values according to Levenberg-Marquardt optimization. It minimizes a combination of squared errors and weights, and then determines the correct combination so as to produce a network that generalizes well. The process is called Bayesian regularization.

Training occurs according to `trainbr` training parameters, shown here with their default values:

- `net.trainParam.epochs`: Maximum number of epochs to train. The default value is `1000`.
- `net.trainParam.goal`: Performance goal. The default value is `0`.
- `net.trainParam.mu`: Marquardt adjustment parameter. The default value is `0.005`.
- `net.trainParam.mu_dec`: Decrease factor for `mu`. The default value is `0.1`.
- `net.trainParam.mu_inc`: Increase factor for `mu`. The default value is `10`.
- `net.trainParam.mu_max`: Maximum value for `mu`. The default value is `1e10`.
- `net.trainParam.max_fail`: Maximum validation failures. The default value is `0`.
- `net.trainParam.min_grad`: Minimum performance gradient. The default value is `1e-7`.
- `net.trainParam.show`: Epochs between displays (`NaN` for no displays). The default value is `25`.
- `net.trainParam.showCommandLine`: Generate command-line output. The default value is `false`.
- `net.trainParam.showWindow`: Show training GUI. The default value is `true`.
- `net.trainParam.time`: Maximum time to train in seconds. The default value is `inf`.

Validation stops are disabled by default (`max_fail = 0`) so that training can continue until an optimal combination of errors and weights is found. However, some weight/bias minimization can still be achieved with shorter training times if validation is enabled by setting `max_fail` to 6 or some other strictly positive value.
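As a minimal sketch of adjusting these parameters, the snippet below overrides a few defaults and enables validation stops. It assumes Deep Learning Toolbox is installed; the hidden layer size and the specific values are arbitrary illustrations, not recommendations. Whether validation data is actually used also depends on the network's data division settings, which this sketch leaves at their defaults.

```matlab
% Create a network that uses trainbr (10 hidden neurons is an arbitrary choice).
net = feedforwardnet(10, 'trainbr');

% Override selected training parameters (illustrative values).
net.trainParam.epochs     = 300;    % cap training at 300 epochs
net.trainParam.show       = 50;     % report progress every 50 epochs
net.trainParam.showWindow = false;  % no training GUI

% Enable validation stops, which trainbr disables by default (max_fail = 0).
net.trainParam.max_fail = 6;
```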

## Examples
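
The following is a minimal usage sketch rather than an official example. It assumes Deep Learning Toolbox is installed and uses its bundled `simplefit_dataset` sample data; the hidden layer size of 10 is an arbitrary illustrative choice.

```matlab
% Load a small curve-fitting data set shipped with the toolbox (assumed available).
[x, t] = simplefit_dataset;

% Create a feedforward network and select trainbr as its training function.
net = feedforwardnet(10, 'trainbr');
net.trainParam.showWindow = false;   % suppress the training GUI for scripted runs

% Train the network, then evaluate it on the training inputs.
[trainedNet, tr] = train(net, x, t);
y = trainedNet(x);
perf = perform(trainedNet, t, y);    % performance under the network's performFcn
```

The training record `tr` holds per-epoch information about the run, and `perf` is the performance of the trained network on the same data it was trained on.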

## Limitations

This function uses the Jacobian for calculations, which assumes that performance is a mean or sum of squared errors. Therefore networks trained with this function must use either the `mse` or `sse` performance function.
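
A short sketch of honoring this restriction, assuming Deep Learning Toolbox is installed: set the performance function explicitly and check it before training. The check itself is an illustrative guard, not something the toolbox requires you to write.

```matlab
net = feedforwardnet(10, 'trainbr');   % 10 hidden neurons is an arbitrary choice

% Choose a performance function compatible with Jacobian-based training.
net.performFcn = 'sse';

% Illustrative guard: confirm the setting before calling train.
isCompatible = ismember(net.performFcn, {'mse', 'sse'});
if ~isCompatible
    error('trainbr requires the mse or sse performance function.');
end
```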

## Algorithms

`trainbr` can train any network as long as its weight, net input, and transfer functions have derivative functions.

Bayesian regularization minimizes a linear combination of squared errors and weights. It
also modifies the linear combination so that at the end of training the resulting network has
good generalization qualities. See MacKay (*Neural Computation*, Vol. 4, No.
3, 1992, pp. 415 to 447) and Foresee and Hagan (*Proceedings of the International Joint
Conference on Neural Networks*, June, 1997) for more detailed discussions of Bayesian
regularization.
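
In the notation of the cited papers (the symbols below are theirs, not identifiers from this page), the linear combination being minimized can be written as:

$$
F = \beta E_D + \alpha E_W, \qquad E_D = \sum_{i} e_i^{2}, \qquad E_W = \sum_{j} w_j^{2}
$$

Here $E_D$ is the sum of squared errors, $E_W$ is the sum of squared weights, and $\alpha$ and $\beta$ are the coefficients of the combination that are re-estimated during training. A larger ratio $\alpha/\beta$ favors smaller weights and therefore a smoother network response.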

This Bayesian regularization takes place within the Levenberg-Marquardt algorithm. Backpropagation is used to calculate the Jacobian `jX` of performance `perf` with respect to the weight and bias variables `X`. Each variable is adjusted according to Levenberg-Marquardt,

`jj = jX * jX`

`je = jX * E`

`dX = -(jj+I*mu) \ je`

where `E` is all errors and `I` is the identity matrix.
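
To make the update concrete, here is a sketch of one Levenberg-Marquardt step on a hypothetical toy problem (fitting a line), written in plain MATLAB without the toolbox. It rests on two assumptions that the formulas above leave implicit: `jX` holds one row per error, so the transposes appear explicitly, and errors are defined as output minus target. The Bayesian regularization terms are left out.

```matlab
% Toy problem (hypothetical): fit y = a*x + b, with variables X = [a; b].
x = [0; 1; 2; 3];
t = [1; 3; 5; 7];            % targets generated by a = 2, b = 1
X = [0; 0];                  % initial weight and bias
mu = 0.005;                  % default value of net.trainParam.mu

E  = (X(1)*x + X(2)) - t;    % errors: output minus target (assumed convention)
jX = [x, ones(size(x))];     % Jacobian of E with respect to X, one row per error

jj = jX' * jX;               % approximate Hessian
je = jX' * E;                % gradient term
dX = -(jj + eye(numel(X))*mu) \ je;   % Levenberg-Marquardt step

Xnew   = X + dX;
sseOld = sum(E.^2);
sseNew = sum(((Xnew(1)*x + Xnew(2)) - t).^2);
```

Because this toy model is linear and `mu` is small, a single step lands close to the least-squares solution; for a real network the Jacobian changes with `X` and many steps are needed.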

The adaptive value `mu` is increased by `mu_inc` until the change shown above results in a reduced performance value. The change is then made to the network, and `mu` is decreased by `mu_dec`.

Training stops when any of these conditions occurs:

- The maximum number of `epochs` (repetitions) is reached.
- The maximum amount of `time` is exceeded.
- Performance is minimized to the `goal`.
- The performance gradient falls below `min_grad`.
- `mu` exceeds `mu_max`.
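
The adaptation of `mu` and the stopping conditions above can be sketched as a plain MATLAB loop on a hypothetical line-fitting problem. This is an illustration of the control flow only, not the toolbox implementation: it omits the Bayesian regularization terms, assumes `jX` has one row per error, and uses the variable `stopReason` purely for illustration. Parameter names mirror `net.trainParam`, with the documented default values.

```matlab
% Toy problem (hypothetical): fit y = a*x + b, with variables X = [a; b].
x = [0; 1; 2; 3];
t = [1; 3; 5; 7];
X = [0; 0];
jX = [x, ones(size(x))];     % constant Jacobian for this linear toy model

epochs = 1000; goal = 0; min_grad = 1e-7; time = inf;
mu = 0.005; mu_dec = 0.1; mu_inc = 10; mu_max = 1e10;

perfFcn = @(X) sum(((X(1)*x + X(2)) - t).^2);   % sse performance
perf = perfFcn(X);
stopReason = 'epochs';       % used if the loop runs to completion
tStart = tic;

for epoch = 1:epochs
    E  = (X(1)*x + X(2)) - t;
    jj = jX' * jX;
    je = jX' * E;

    % Stopping conditions checked at the start of each epoch.
    if toc(tStart) > time,  stopReason = 'time';     break; end
    if perf <= goal,        stopReason = 'goal';     break; end
    if norm(je) < min_grad, stopReason = 'min_grad'; break; end

    % Increase mu until the step reduces performance or mu exceeds mu_max.
    while mu <= mu_max
        dX = -(jj + eye(numel(X))*mu) \ je;
        newPerf = perfFcn(X + dX);
        if newPerf < perf
            X = X + dX;          % accept the step
            perf = newPerf;
            mu = mu * mu_dec;    % then decrease mu
            break;
        end
        mu = mu * mu_inc;        % step rejected: increase mu and retry
    end
    if mu > mu_max, stopReason = 'mu_max'; break; end
end
```

A small `mu` makes the step close to a Gauss-Newton step, while a large `mu` shrinks it toward a short gradient-descent step, which is why rejected steps raise `mu` and accepted steps lower it.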

## References

[1] MacKay, David J. C. "Bayesian interpolation." *Neural Computation*. Vol. 4, No. 3, 1992, pp. 415–447.

[2] Foresee, F. Dan, and Martin T. Hagan. "Gauss-Newton approximation to Bayesian learning." *Proceedings of the International Joint Conference on Neural Networks*, June, 1997.

## Version History

**Introduced before R2006a**