learngdm

Gradient descent with momentum weight and bias learning function

Syntax

[dW,LS] = learngdm(W,P,Z,N,A,T,E,gW,gA,D,LP,LS)
info = learngdm('code')

Description

learngdm is the gradient descent with momentum weight and bias learning function.

[dW,LS] = learngdm(W,P,Z,N,A,T,E,gW,gA,D,LP,LS) takes several inputs,

`W`	`S`-by-`R` weight matrix (or `S`-by-`1` bias vector)
`P`	`R`-by-`Q` input vectors (or `ones(1,Q)`)
`Z`	`S`-by-`Q` weighted input vectors
`N`	`S`-by-`Q` net input vectors
`A`	`S`-by-`Q` output vectors
`T`	`S`-by-`Q` layer target vectors
`E`	`S`-by-`Q` layer error vectors
`gW`	`S`-by-`R` weight gradient with respect to performance
`gA`	`S`-by-`Q` output gradient with respect to performance
`D`	`S`-by-`S` neuron distances
`LP`	Learning parameters, a structure with fields `lr` and `mc`
`LS`	Learning state; initially should be `[]`

and returns

`dW`	`S`-by-`R` weight (or bias) change matrix
`LS`	New learning state

Learning occurs according to learngdm’s learning parameters, shown here with their default values.

`LP.lr` (default `0.01`)	Learning rate
`LP.mc` (default `0.9`)	Momentum constant

info = learngdm('code') returns useful information for each supported `'code'` character vector:

`'pnames'`	Names of learning parameters
`'pdefaults'`	Default learning parameters
`'needg'`	Returns 1 if this function uses `gW` or `gA`

Examples

Define a random gradient gW for the weights going to a layer with three neurons from an input with two elements. Also define a learning rate of 0.5 and a momentum constant of 0.8:

gW = rand(3,2);
lp.lr = 0.5;
lp.mc = 0.8;

Because learngdm needs only these values to calculate a weight change (see “Algorithms” below), pass empty matrices for the other arguments. Use the default initial learning state, `[]`.

ls = [];
[dW,ls] = learngdm([],[],[],[],[],[],[],gW,[],[],lp,ls)

learngdm returns the weight change and a new learning state.
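
To carry momentum from one update to the next, pass the returned learning state back in on the following call. The sketch below assumes Deep Learning Toolbox is installed; the second gradient `gW2` and the `exist` guard are illustrative additions, not part of the original example.

```matlab
% Two gradients for the same 3-by-2 weight matrix (values are illustrative).
gW1 = rand(3,2);
gW2 = rand(3,2);
lp.lr = 0.5;
lp.mc = 0.8;
ls = [];   % empty learning state for the first call

% Guard (an assumption for portability): skip the calls when learngdm
% is not on the path.
if exist('learngdm','file') ~= 0
    % First call: starts from the empty learning state.
    [dW1,ls] = learngdm([],[],[],[],[],[],[],gW1,[],[],lp,ls);
    % Second call: reuses the state returned by the first call.
    [dW2,ls] = learngdm([],[],[],[],[],[],[],gW2,[],[],lp,ls);
end
```

Both weight changes have the same size as the gradients, and the state returned by the last call is what a third call would receive.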

Algorithms

learngdm calculates the weight change dW for a given neuron from the gradient gW of performance with respect to the neuron’s weight (or bias), the learning rate lr, the momentum constant mc, and the previous weight change dWprev, according to gradient descent with momentum:

dW = mc*dWprev + (1-mc)*lr*gW

The previous weight change dWprev is stored in and read from the learning state LS.
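
The update rule can be reproduced without the toolbox. The following is a minimal sketch, not the toolbox source: the anonymous function `step`, the fixed gradient values, and the choice to start `dWprev` at zero are all assumptions made for illustration (this page does not say how learngdm initializes an empty learning state).

```matlab
% Minimal sketch of the momentum update rule; not the toolbox implementation.
lr = 0.5;                  % learning rate (plays the role of LP.lr)
mc = 0.8;                  % momentum constant (plays the role of LP.mc)
gW = [1 -2; 0.5 0; -1 3];  % fixed 3-by-2 gradient, values chosen for illustration

% One update step, written as dW = mc*dWprev + (1-mc)*lr*gW.
step = @(dWprev, g) mc*dWprev + (1-mc)*lr*g;

% ASSUMPTION: the previous weight change starts at zero.
dWprev = zeros(size(gW));

numSteps = 5;
for k = 1:numSteps
    dW = step(dWprev, gW);  % weight change for this step
    dWprev = dW;            % kept for the next step (the role of LS)
end

% With a constant gradient and a zero start, the recursion has the closed
% form dW_k = (1 - mc^k)*lr*gW, which approaches lr*gW as k grows.
dWclosed = (1 - mc^numSteps)*lr*gW;
maxDiff = max(abs(dW(:) - dWclosed(:)));
disp(maxDiff);
```

Under these assumptions, a larger mc makes the weight change respond more slowly to the current gradient, because each step keeps a fraction mc of the previous change.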

Version History

Introduced before R2006a