Architecture

This section presents the architecture of the network that is most commonly used with the backpropagation algorithm--the multilayer feedforward network.

Neuron Model (logsig, tansig, purelin)

An elementary neuron with R inputs is shown below. Each input is weighted with an appropriate w. The sum of the weighted inputs and the bias forms the input to the transfer function f. Neurons can use any differentiable transfer function f to generate their output.
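The neuron computation described above can be sketched in a few lines of plain MATLAB. The variable names p, w, b, n, and a and all numeric values are illustrative assumptions, and the log-sigmoid is written out by hand rather than taken from the toolbox.

```matlab
% Single neuron with R = 3 inputs (all values are made-up examples).
p = [2; -1; 0.5];               % input vector, R-by-1
w = [0.4 0.3 -0.2];             % one weight per input, 1-by-R
b = 0.1;                        % bias

f = @(n) 1 ./ (1 + exp(-n));    % hand-written log-sigmoid transfer function

n = w*p + b;                    % weighted sum of inputs plus bias (net input)
a = f(n);                       % neuron output
```

Swapping f for another differentiable function changes the output range without changing how the net input n is formed.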

Multilayer networks often use the log-sigmoid transfer function logsig.

The function logsig generates outputs between 0 and 1 as the neuron's net input goes from negative to positive infinity.

Alternatively, multilayer networks can use the tan-sigmoid transfer function tansig, which generates outputs between -1 and +1.

Occasionally, the linear transfer function purelin is used in backpropagation networks.

If the last layer of a multilayer network has sigmoid neurons, then the outputs of the network are limited to a small range. If linear output neurons are used, the network outputs can take on any value.
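To make the output ranges concrete, the sketch below evaluates hand-written versions of the three transfer functions on a few net inputs. The names logsig_fn, tansig_fn, and purelin_fn are hypothetical stand-ins chosen so that they do not shadow the toolbox functions; the formulas are the standard definitions.

```matlab
% Hand-written stand-ins for the three transfer functions.
logsig_fn  = @(n) 1 ./ (1 + exp(-n));         % outputs in (0, 1)
tansig_fn  = @(n) 2 ./ (1 + exp(-2*n)) - 1;   % outputs in (-1, 1), same as tanh(n)
purelin_fn = @(n) n;                          % outputs unbounded

n = [-10 -1 0 1 10];      % sample net inputs

a_log = logsig_fn(n);
a_tan = tansig_fn(n);
a_lin = purelin_fn(n);

% One row per quantity: net input, logsig, tansig, purelin
disp([n; a_log; a_tan; a_lin]);
```

The sigmoid rows saturate toward their limits as the net input grows in magnitude, while the linear row simply follows the net input.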

In backpropagation it is important to be able to calculate the derivatives of any transfer functions used. Each of the transfer functions above, logsig, tansig, and purelin, can be called to calculate its own derivative. To calculate a transfer function's derivative, call the transfer function with the string 'dn'.
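As a rough illustration of what those derivatives are, the sketch below computes them analytically from each function's output and checks the result against a central finite difference. It deliberately avoids the toolbox 'dn' call, whose full argument list is not given here; all names are assumptions made for the example.

```matlab
% Analytic derivatives written in terms of the output a = f(n).
n = linspace(-3, 3, 7);

a_log = 1 ./ (1 + exp(-n));
d_log = a_log .* (1 - a_log);     % derivative of the log-sigmoid

a_tan = tanh(n);
d_tan = 1 - a_tan.^2;             % derivative of the tan-sigmoid

d_lin = ones(size(n));            % derivative of the linear function

% Central finite-difference check of the two sigmoid derivatives.
h = 1e-6;
fd_log = (1 ./ (1 + exp(-(n + h))) - 1 ./ (1 + exp(-(n - h)))) / (2*h);
fd_tan = (tanh(n + h) - tanh(n - h)) / (2*h);

max_err = max(abs([d_log - fd_log, d_tan - fd_tan]));
```

Because each derivative depends only on the layer output, it can be evaluated during backpropagation without recomputing the exponentials.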

The three transfer functions described here are the most commonly used transfer functions for backpropagation, but other differentiable transfer functions can be created and used with backpropagation if desired. See Advanced Topics.

Feedforward Network

A single-layer network of S logsig neurons having R inputs is shown below in full detail on the left and with a layer diagram on the right.

Feedforward networks often have one or more hidden layers of sigmoid neurons followed by an output layer of linear neurons. Multiple layers of neurons with nonlinear transfer functions allow the network to learn nonlinear and linear relationships between input and output vectors. The linear output layer lets the network produce values outside the range -1 to +1.

On the other hand, if you want to constrain the outputs of a network (such as between 0 and 1), then the output layer should use a sigmoid transfer function (such as logsig).

As noted in Neuron Model and Network Architectures, for multiple-layer networks the layer number determines the superscript on the weight matrices. The appropriate notation is used in the two-layer tansig/purelin network shown next.

This network can be used as a general function approximator. It can approximate any function with a finite number of discontinuities arbitrarily well, given sufficient neurons in the hidden layer.
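The forward pass of such a two-layer tansig/purelin network can be sketched in plain MATLAB as follows. The weight and bias values are hand-picked placeholders rather than trained values, and the names IW, LW, b1, and b2 are assumptions that loosely follow the toolbox notation.

```matlab
% Two-layer network: tan-sigmoid hidden layer, linear output layer.
% Sizes: R = 2 inputs, S1 = 3 hidden neurons, S2 = 1 output neuron.
IW = [0.5 -0.3; 0.1 0.8; -0.7 0.2];   % input weights, S1-by-R
b1 = [0.1; -0.2; 0.05];               % hidden-layer biases, S1-by-1
LW = [1.0 -2.0 0.5];                  % layer weights, S2-by-S1
b2 = 0.3;                             % output-layer bias

P = [-1 2 1; 0 5 1];                  % three input vectors, one per column
Q = size(P, 2);                       % number of input vectors

a1 = tanh(IW*P + repmat(b1, 1, Q));   % hidden-layer outputs, each in (-1, 1)
a2 = LW*a1 + repmat(b2, 1, Q);        % network outputs, unbounded
```

Adding hidden neurons only adds rows to IW and b1 and columns to LW; the structure of the computation stays the same.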

Creating a Network (newff)

The first step in training a feedforward network is to create the network object. The function newff creates a feedforward network. It requires three arguments and returns the network object. The first argument is a matrix of sample R-element input vectors. The second argument is a matrix of sample S-element target vectors. The sample inputs and outputs are used to set up network input and output dimensions and parameters. The third argument is an array containing the sizes of each hidden layer. (The output layer size is determined from the targets.)

More optional arguments can be provided. For instance, the fourth argument is a cell array containing the names of the transfer functions to be used in each layer. The fifth argument contains the name of the training function to be used. If only three arguments are supplied, the default transfer function for hidden layers is tansig and the default for the output layer is purelin. The default training function is trainlm.

For example, consider creating a two-layer network with newff. To create a network, you provide typical input and output values that initialize weight and bias values and determine the size of the output layer. Assume three two-element input vectors with values [-1;0], [2;5], and [1;1]. Assume the target for each input vector has a single element, so that the targets form a row vector t. (These are arbitrary numbers. For a real problem, use real values.)

There are three neurons in one hidden layer. The default transfer function for hidden layers is tan-sigmoid, and the default for the output layer is linear.
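A minimal sketch of the calls described above is shown here. It requires the toolbox function newff; the input matrix follows the three vectors given in the text, while the target values in t are arbitrary placeholders chosen for this example.

```matlab
% Requires the Neural Network Toolbox (newff).
p = [-1 2 1; 0 5 1];   % three two-element input vectors, one per column
t = [-1 1 0];          % one target per input vector (arbitrary placeholder values)

% Three required arguments: inputs, targets, hidden-layer sizes.
net = newff(p, t, 3);  % one hidden layer with three neurons, defaults elsewhere

% Same architecture with the optional fourth and fifth arguments spelled out.
net2 = newff(p, t, 3, {'tansig', 'purelin'}, 'trainlm');
```

The size of the output layer is not passed explicitly; it follows from the number of rows in t.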

The newff command creates the network object and also initializes the weights and biases of the network; therefore the network is ready for training. There are times when you might want to reinitialize the weights, or to perform a custom initialization. The next section explains the details of the initialization process.

Other Architectures for Backpropagation

While two-layer feed-forward networks can potentially learn virtually any input-output relationship, feed-forward networks with more layers might learn complex relationships more quickly.

The function newcf creates cascade-forward networks. These are similar to feed-forward networks, but include a weight connection from the input to each layer, and from each layer to the successive layers. For example, a three-layer network has connections from layer 1 to layer 2, layer 2 to layer 3, and layer 1 to layer 3. The three-layer network also has connections from the input to all three layers. The additional connections might improve the speed at which the network learns the desired relationship.
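The connection pattern of a three-layer cascade-forward network can be sketched as a hand-written forward pass. This does not call newcf; the layer sizes, weight values, and variable names are all hypothetical, and tan-sigmoid hidden layers with a linear output layer are assumed.

```matlab
% Cascade-forward pass for a three-layer network (illustrative values only).
R = 2; S1 = 3; S2 = 2; S3 = 1;
p = [1; 2];                               % one input vector, R-by-1

% Input-to-layer weights: the input feeds all three layers.
IW1 = 0.1*ones(S1, R);  b1 = zeros(S1, 1);
IW2 = 0.2*ones(S2, R);  b2 = zeros(S2, 1);
IW3 = 0.3*ones(S3, R);  b3 = 0.5;

% Layer-to-layer weights: 1->2, 2->3, and the extra connection 1->3.
LW21 = 0.5*ones(S2, S1);
LW32 = 0.4*ones(S3, S2);
LW31 = 0.6*ones(S3, S1);

a1 = tanh(IW1*p + b1);                    % layer 1 sees the input
a2 = tanh(IW2*p + LW21*a1 + b2);          % layer 2 sees the input and layer 1
a3 = IW3*p + LW32*a2 + LW31*a1 + b3;      % layer 3 sees the input and layers 1 and 2
```

Setting IW2, IW3, and LW31 to zero recovers an ordinary three-layer feed-forward pass.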

Other networks can learn dynamic or time-series relationships. They are introduced in Dynamic Networks.

Initializing Weights (init)

Before training a feedforward network, you must initialize the weights and biases. The newff command automatically initializes the weights, but you might want to reinitialize them. You do this with the init command. This function takes a network object as input and returns a network object with all weights and biases initialized. A network is initialized (or reinitialized) with the call net = init(net).
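As a hedged sketch of reinitialization in context, the lines below create a small network, reinitialize it, and keep copies of the first-layer input weights before and after. The toolbox is required; the target values are arbitrary placeholders, and inspecting the weights through net.IW{1,1} is an illustrative choice.

```matlab
% Requires the Neural Network Toolbox (newff, init).
p = [-1 2 1; 0 5 1];       % three two-element input vectors
t = [-1 1 0];              % arbitrary placeholder targets

net = newff(p, t, 3);      % weights and biases are initialized here
w_before = net.IW{1,1};    % input weights chosen at creation

net = init(net);           % reinitialize all weights and biases
w_after = net.IW{1,1};     % input weights after reinitialization

disp(size(w_after));       % same dimensions as before; values may differ
```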

For specifics on how the weights are initialized, see Advanced Topics.



© 1984-2009 The MathWorks, Inc.