Resilient backpropagation

`net.trainFcn = 'trainrp'`

`[net,tr] = train(net,...)`

`trainrp` is a network training function that updates weight and bias values according to the resilient backpropagation algorithm (Rprop).

`net.trainFcn = 'trainrp'` sets the network `trainFcn` property.

`[net,tr] = train(net,...)` trains the network with `trainrp`.

Training occurs according to `trainrp` training parameters, shown here with their default values:

| Parameter | Default | Description |
| --- | --- | --- |
| `net.trainParam.epochs` | `1000` | Maximum number of epochs to train |
| `net.trainParam.show` | `25` | Epochs between displays |
| `net.trainParam.showCommandLine` | `false` | Generate command-line output |
| `net.trainParam.showWindow` | `true` | Show training GUI |
| `net.trainParam.goal` | `0` | Performance goal |
| `net.trainParam.time` | `inf` | Maximum time to train in seconds |
| `net.trainParam.min_grad` | `1e-5` | Minimum performance gradient |
| `net.trainParam.max_fail` | `6` | Maximum validation failures |
| `net.trainParam.lr` | `0.01` | Learning rate |
| `net.trainParam.delt_inc` | `1.2` | Increment to weight change |
| `net.trainParam.delt_dec` | `0.5` | Decrement to weight change |
| `net.trainParam.delta0` | `0.07` | Initial weight change |
| `net.trainParam.deltamax` | `50.0` | Maximum weight change |

You can create a standard network that uses `trainrp` with `feedforwardnet` or `cascadeforwardnet`.

To prepare a custom network to be trained with `trainrp`:

1. Set `net.trainFcn` to `'trainrp'`. This sets `net.trainParam` to `trainrp`'s default parameters.
2. Set `net.trainParam` properties to desired values.

In either case, calling `train` with the resulting network trains the network with `trainrp`.
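The two steps for preparing a custom network can be sketched as follows. This is a minimal example that assumes the toolbox providing `feedforwardnet` is installed; the hidden layer size of 5 and the overridden parameter values are arbitrary choices for illustration, not recommendations.

```matlab
% Start from any network object; a feed-forward network with an
% arbitrary hidden layer size of 5 is used here for illustration.
net = feedforwardnet(5);

% Step 1: select trainrp. This resets net.trainParam to trainrp's defaults.
net.trainFcn = 'trainrp';

% Step 2: override only the parameters you want to change
% (illustrative values).
net.trainParam.epochs = 200;
net.trainParam.delta0 = 0.05;

% Inspect the resulting parameter set.
disp(net.trainParam)
```

Setting `trainFcn` first matters: assigning it afterwards would replace any customized `trainParam` values with the defaults.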

Here is a problem consisting of inputs `p` and targets `t` to be solved with a network.

    p = [0 1 2 3 4 5];   % inputs
    t = [0 0 0 1 1 1];   % targets

Create a two-layer feed-forward network with two hidden neurons that uses this training function.

    net = feedforwardnet(2,'trainrp');

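Train the network, then evaluate it on the training inputs.

    net.trainParam.epochs = 50;    % stop after at most 50 epochs
    net.trainParam.show = 10;      % display progress every 10 epochs
    net.trainParam.goal = 0.1;     % stop once performance reaches 0.1
    net = train(net,p,t);
    a = net(p)                     % network outputs for the inputs p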

See `help feedforwardnet` and `help cascadeforwardnet` for other examples.

Multilayer networks typically use sigmoid transfer functions in the hidden layers. These functions are often called "squashing" functions, because they compress an infinite input range into a finite output range. Sigmoid functions are characterized by the fact that their slopes must approach zero as the input gets large. This causes a problem when you use steepest descent to train a multilayer network with sigmoid functions, because the gradient can have a very small magnitude and, therefore, cause small changes in the weights and biases, even though the weights and biases are far from their optimal values.
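To see the saturation effect numerically, the sketch below evaluates the slope of the hyperbolic tangent, used here as a representative sigmoid, at increasingly large inputs. It uses only base MATLAB functions, and the sample points are arbitrary.

```matlab
% Slope of tanh(n) is 1 - tanh(n)^2, which shrinks rapidly as |n| grows.
n = [0 2 5 10];             % sample inputs (arbitrary)
slope = 1 - tanh(n).^2;     % derivative of tanh at each input

% Print input/slope pairs; the slope is 1 at n = 0 and near zero at n = 10.
fprintf('n = %4.1f   slope = %.3g\n', [n; slope]);
```

A gradient-proportional update scales with these slopes, so a saturated neuron barely moves its weights no matter how far they are from a good solution.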

The purpose of the resilient backpropagation (Rprop) training algorithm is to eliminate these harmful effects of the magnitudes of the partial derivatives. Only the sign of the derivative is used to determine the direction of the weight update; the magnitude of the derivative has no effect on the weight update. The size of the weight change is determined by a separate update value. The update value for each weight and bias is increased by a factor `delt_inc` whenever the derivative of the performance function with respect to that weight has the same sign for two successive iterations. The update value is decreased by a factor `delt_dec` whenever the derivative with respect to that weight changes sign from the previous iteration. If the derivative is zero, the update value remains the same. Whenever the weights are oscillating, the weight change is reduced. If the weight continues to change in the same direction for several iterations, the magnitude of the weight change increases. A complete description of the Rprop algorithm is given in [RiBr93].
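The sign-based rule described above can be sketched in a few lines of base MATLAB. This is an illustration on a toy quadratic, not the source of `trainrp`: the objective, the minimizer `wOpt`, and the iteration count are hypothetical, while `delt_inc`, `delt_dec`, `delta0`, and `deltamax` take the default values listed in the parameter table.

```matlab
% Minimal Rprop-style sketch on a toy quadratic, f(w) = sum((w - wOpt).^2).
% Illustrates the sign-based rule only; it is not trainrp's implementation.
wOpt     = [3; -2];                 % hypothetical minimizer of the toy problem
w        = [0; 0];                  % initial weights
delt_inc = 1.2;                     % growth factor for the update value
delt_dec = 0.5;                     % shrink factor for the update value
delta    = 0.07 * ones(size(w));    % per-weight update values, start at delta0
deltamax = 50.0;                    % cap on the update value
gOld     = zeros(size(w));          % gradient from the previous iteration

for k = 1:100
    g = 2 * (w - wOpt);             % gradient of f with respect to each weight
    s = g .* gOld;                  % >0 same sign, <0 sign change, 0 otherwise
    % Grow, shrink, or keep each update value, then cap it at deltamax.
    delta = delta .* (delt_inc*(s > 0) + delt_dec*(s < 0) + (s == 0));
    delta = min(delta, deltamax);
    % Step against the gradient using only its sign, never its magnitude.
    w = w - delta .* sign(g);
    gOld = g;
end

disp(w)   % expected to end up close to wOpt
```

Each weight keeps its own update value, so a weight that oscillates around its optimum shrinks its step while another weight can keep accelerating.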

The following code creates a feed-forward network with three hidden neurons and trains it using the Rprop algorithm. The training parameters for `trainrp` are `epochs`, `show`, `goal`, `time`, `min_grad`, `max_fail`, `delt_inc`, `delt_dec`, `delta0`, and `deltamax`. The first eight parameters have been previously discussed. The last two are the initial step size and the maximum step size, respectively. The performance of Rprop is not very sensitive to the settings of the training parameters. For the example below, the training parameters are left at the default values:

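    p = [-1 -1 2 2; 0 5 0 5];            % two-element inputs, one column per sample
    t = [-1 -1 1 1];                     % targets
    net = feedforwardnet(3,'trainrp');   % three hidden neurons, trained with Rprop
    net = train(net,p,t);
    y = net(p)                           % network outputs for the training inputs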

Rprop is generally much faster than the standard steepest descent algorithm. It also requires only a modest increase in memory: you need to store the update values for each weight and bias, which is equivalent to storage of the gradient.

[RiBr93] Riedmiller, M., and H. Braun, "A direct adaptive method for faster backpropagation learning: The RPROP algorithm," *Proceedings of the IEEE International Conference on Neural Networks*, 1993, pp. 586–591.
