
Gradient descent with adaptive learning rate backpropagation

`net.trainFcn = 'traingda'`

`[net,tr] = train(net,...)`

`traingda` is a network training function that updates weight and bias values according to gradient descent with adaptive learning rate.

`net.trainFcn = 'traingda'` sets the network `trainFcn` property.

`[net,tr] = train(net,...)` trains the network with `traingda`.

Training occurs according to `traingda` training parameters, shown here with their default values:

| Parameter | Default | Description |
| --- | --- | --- |
| `net.trainParam.epochs` | `1000` | Maximum number of epochs to train |
| `net.trainParam.goal` | `0` | Performance goal |
| `net.trainParam.lr` | `0.01` | Learning rate |
| `net.trainParam.lr_inc` | `1.05` | Ratio to increase learning rate |
| `net.trainParam.lr_dec` | `0.7` | Ratio to decrease learning rate |
| `net.trainParam.max_fail` | `6` | Maximum validation failures |
| `net.trainParam.max_perf_inc` | `1.04` | Maximum performance increase |
| `net.trainParam.min_grad` | `1e-5` | Minimum performance gradient |
| `net.trainParam.show` | `25` | Epochs between displays (`NaN` for no displays) |
| `net.trainParam.showCommandLine` | `false` | Generate command-line output |
| `net.trainParam.showWindow` | `true` | Show training GUI |
| `net.trainParam.time` | `inf` | Maximum time to train in seconds |

You can create a standard network that uses `traingda` with `feedforwardnet` or `cascadeforwardnet`. To prepare a custom network to be trained with `traingda`:

1. Set `net.trainFcn` to `'traingda'`. This sets `net.trainParam` to `traingda`’s default parameters.
2. Set `net.trainParam` properties to desired values.

In either case, calling `train` with the resulting network trains the network with `traingda`.

See `help feedforwardnet` and `help cascadeforwardnet` for examples.
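The two preparation steps can be sketched as follows. This is a minimal illustration, not an official example: it assumes Deep Learning Toolbox is installed, uses the `simplefit_dataset` sample data that ships with the toolbox, and picks the hidden layer size and the overridden parameter values arbitrarily.

```matlab
% Minimal sketch; requires Deep Learning Toolbox.
% The layer size (10) and the overridden values are arbitrary choices.
[x,t] = simplefit_dataset;          % sample inputs and targets from the toolbox

net = feedforwardnet(10);           % feedforwardnet defaults to 'trainlm'
net.trainFcn = 'traingda';          % step 1: also resets net.trainParam to traingda defaults
net.trainParam.lr = 0.05;           % step 2: override selected defaults
net.trainParam.epochs = 300;
net.trainParam.showWindow = false;  % suppress the training GUI

[net,tr] = train(net,x,t);          % train with traingda
y = net(x);                         % network outputs for the training inputs
```

Setting `net.trainFcn` before touching `net.trainParam` matters here, because assigning the training function replaces the parameter set with that function's defaults.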

`traingda` can train any network as long as its weight, net input, and transfer functions have derivative functions.

Backpropagation is used to calculate derivatives of performance `dperf` with respect to the weight and bias variables `X`. Each variable is adjusted according to gradient descent:

`dX = lr*dperf/dX`

At each epoch, if performance decreases toward the goal, then the learning rate is increased by the factor `lr_inc`. If performance increases by more than the factor `max_perf_inc`, the learning rate is multiplied by the factor `lr_dec` and the change that increased the performance is not made. Performance here is an error measure to be minimized, so an increase means the network fits worse.
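The adaptive rule above can be sketched in plain MATLAB without the toolbox. This is an illustration under stated assumptions, not the toolbox source: the quadratic performance function, the matrix `A`, the starting point, and the helper names `perfFcn` and `gradFcn` are all hypothetical, and the step is written with an explicit minus sign on the gradient. The parameter values mirror the defaults in the table.

```matlab
% Toy problem (hypothetical): minimize perf(X) = 0.5*X'*A*X.
A = diag([1 10]);
perfFcn = @(X) 0.5 * (X' * A * X);   % stand-in for network performance
gradFcn = @(X) A * X;                % its gradient with respect to X

% Values mirroring the traingda defaults.
lr = 0.01;  lr_inc = 1.05;  lr_dec = 0.7;  max_perf_inc = 1.04;
epochs = 200;

X = [1; 1];
perf0 = perfFcn(X);                  % initial performance, kept for comparison
perf = perf0;
lrHistory = zeros(1, epochs);

for epoch = 1:epochs
    Xtrial = X - lr * gradFcn(X);    % tentative gradient descent step
    perfTrial = perfFcn(Xtrial);

    if perfTrial / perf > max_perf_inc
        % Performance grew too much: shrink lr and discard the step.
        lr = lr * lr_dec;
    else
        % Step is accepted; grow lr only if performance improved.
        if perfTrial < perf
            lr = lr * lr_inc;
        end
        X = Xtrial;
        perf = perfTrial;
    end
    lrHistory(epoch) = lr;
end

fprintf('Final lr = %.4g, final perf = %.3g\n', lr, perf);
```

Plotting `lrHistory` shows how the rate climbs while steps keep helping and drops back when a step overshoots.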

Training stops when any of these conditions occurs:

- The maximum number of `epochs` (repetitions) is reached.
- The maximum amount of `time` is exceeded.
- Performance is minimized to the `goal`.
- The performance gradient falls below `min_grad`.
- Validation performance has increased more than `max_fail` times since the last time it decreased (when using validation).
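
To see which of these conditions ended a particular run, you can inspect the training record `tr` returned by `train`. The sketch below assumes Deep Learning Toolbox and its `simplefit_dataset` sample data. The field names `tr.stop`, `tr.num_epochs`, and `tr.perf` reflect my understanding of the training record and should be checked against your release. The limits chosen are arbitrary.

```matlab
% Minimal sketch; requires Deep Learning Toolbox.
[x,t] = simplefit_dataset;

net = feedforwardnet(10,'traingda');
net.trainParam.epochs = 200;        % arbitrary epoch limit for a short run
net.trainParam.time = 30;           % arbitrary time limit in seconds
net.trainParam.showWindow = false;

[net,tr] = train(net,x,t);

% tr.stop is assumed to hold a text description of the stopping reason.
disp(tr.stop)
fprintf('Epochs run: %d, final training performance: %g\n', ...
    tr.num_epochs, tr.perf(end));
```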