Neural Network Toolbox Documentation
Solving a Problem
This section demonstrates the common steps of solving a problem with backpropagation.
The first step is to define your problem. For supervised networks, such as feed-forward networks trained with backpropagation, this means a set of input vectors and a set of associated desired output vectors called target vectors.
The file housing.mat contains a predefined set of input and target vectors. The input vectors define data regarding real-estate properties and the target values define relative values of the properties. Load the data using the following command:
Loading this file creates two variables. The input matrix houseInputs consists of 506 column vectors, one per property, each holding 13 real-estate variables. The target matrix houseTargets consists of the corresponding 506 relative valuations.
The next step is to create a network and train it until it has learned the relationship between the example inputs and targets.
The most common network used with backpropagation is the two-layer feed-forward network. The following call to newff creates a two-layer network with 20 neurons in the hidden layer. (The number of neurons in the output layer is automatically set to one, the number of elements in each target vector.)
Two-layer feed-forward networks can potentially represent any input-output relationship with a finite number of discontinuities (which is a typical relationship you might want to model), assuming that there are enough neurons in the hidden layer.
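To make the two-layer structure concrete, the following Python sketch computes the output of such a network for one input vector. It is an illustration only, not toolbox code: the names init_two_layer and forward are hypothetical, and the tanh hidden layer, linear output layer, and uniform random initialization are assumptions chosen for the sketch.

```python
import math
import random

def init_two_layer(n_inputs, n_hidden, n_outputs, seed=0):
    """Create small random weights (assumed initialization, not the toolbox's).

    Each row holds one neuron's input weights followed by its bias.
    """
    rng = random.Random(seed)

    def layer(n_out, n_in):
        return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_out)]

    return {"hidden": layer(n_hidden, n_inputs),
            "output": layer(n_outputs, n_hidden)}

def forward(net, x):
    """Map one input vector to one output vector."""
    # Hidden layer: weighted sum plus bias, squashed by tanh (assumed).
    h = [math.tanh(sum(w * v for w, v in zip(row[:-1], x)) + row[-1])
         for row in net["hidden"]]
    # Output layer: weighted sum plus bias, left linear (assumed).
    return [sum(w * v for w, v in zip(row[:-1], h)) + row[-1]
            for row in net["output"]]

# 13 inputs and 1 output match the housing data; 20 hidden neurons as above.
net = init_two_layer(n_inputs=13, n_hidden=20, n_outputs=1)
y = forward(net, [0.0] * 13)
print(len(y))  # one output value per input vector
```

Before training, the output is meaningless. Training adjusts the weights and biases so that the outputs approach the targets.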
Your next step is to train the network using the data.
Click the Performance plot button in the training window to see a plot that resembles the following figure.
The plot shows the mean squared error of the network starting at a large value and decreasing to a smaller value. In other words, it shows that the network is learning.
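As a reminder of what the plotted quantity measures, here is a minimal Python sketch of mean squared error for a single-output network. The function name and the toy numbers are hypothetical and are not taken from the housing data.

```python
def mean_squared_error(outputs, targets):
    """Average of the squared differences between outputs and targets."""
    if not outputs or len(outputs) != len(targets):
        raise ValueError("outputs and targets must be non-empty and equal in length")
    return sum((y - t) ** 2 for y, t in zip(outputs, targets)) / len(outputs)

# Toy values: two exact predictions and one that misses by 2.
print(mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

A falling curve means that, on average, the network outputs are moving closer to the targets.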
The plot has three lines because the 506 input and target vectors are randomly divided into three sets. 60% of the vectors are used to train the network. 20% of the vectors are used to validate how well the network generalizes. Training on the training vectors continues as long as it reduces the network's error on the validation vectors. Once the network starts to memorize the training set (at the expense of generalizing more poorly), the validation error stops falling and training stops. This technique automatically avoids the problem of overfitting, which plagues many optimization and learning algorithms.
Finally, the last 20% of the vectors provide an independent test of network generalization to data that the network has never seen.
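The stopping rule described above can be sketched in Python as follows. This is a simplified illustration, not the toolbox's implementation: the function name early_stop and the patience parameter (how many consecutive non-improving epochs to tolerate) are assumptions, and the validation errors are made-up numbers.

```python
def early_stop(val_errors, patience=1):
    """Return (best_epoch, stop_epoch) for per-epoch validation errors.

    Training stops once the validation error has failed to improve on its
    best value for `patience` consecutive epochs.
    """
    if not val_errors:
        raise ValueError("need at least one epoch")
    best_epoch, best_error, fails = 0, float("inf"), 0
    for epoch, err in enumerate(val_errors):
        if err < best_error:
            best_epoch, best_error, fails = epoch, err, 0
        else:
            fails += 1
            if fails >= patience:
                return best_epoch, epoch
    # Validation error never stalled for long enough; ran out of epochs.
    return best_epoch, len(val_errors) - 1

# Made-up validation errors: improvement through epoch 2, then a rise.
best, stopped = early_stop([0.9, 0.5, 0.3, 0.35, 0.4], patience=1)
print(best, stopped)
```

In this sketch, the weights you would keep are those from best_epoch, because later epochs fit the training set more closely but validate worse.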
After training the network, you can use it. Use sim to apply the network to the original vectors:
You can now apply the network to similar data. Assuming that the data used to train the network represents the general problem of pricing properties, you can expect the accuracy of the network for new data to be similar to the accuracy for test data during training.
Improving Results
The house_dataset example demonstrated some simple commands you can use to solve many types of problems. However, if your first attempt does not meet your needs or expectations, this section describes some ways to improve network accuracy.
If the network is not sufficiently accurate, you can try initializing the network and training it again. Each time you initialize a feed-forward network, the network parameters are different and might produce different solutions.
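One way to organize repeated attempts is to train several times from different random starting points and keep the run with the lowest validation error. The Python sketch below shows the idea. The helper names are hypothetical, and toy_train_once is a deliberately tiny stand-in (one weight, a few gradient steps) rather than real network training.

```python
import random

def best_of_restarts(train_once, seeds):
    """Train once per seed; return (model, validation_error, seed) of the best run."""
    best = None
    for seed in seeds:
        model, val_error = train_once(seed)
        if best is None or val_error < best[1]:
            best = (model, val_error, seed)
    return best

def toy_train_once(seed):
    """Hypothetical stand-in for 'initialize, then train': fit y = w * x."""
    rng = random.Random(seed)
    w = rng.uniform(-1.0, 1.0)           # random initial weight
    train = [(1.0, 2.0), (2.0, 4.0)]     # toy (input, target) pairs
    val = [(3.0, 6.0)]                   # toy validation pair
    for _ in range(5):                   # few steps, so the start point matters
        grad = sum(2 * (w * x - t) * x for x, t in train) / len(train)
        w -= 0.05 * grad
    val_error = sum((w * x - t) ** 2 for x, t in val) / len(val)
    return w, val_error

model, val_error, seed = best_of_restarts(toy_train_once, range(5))
print(seed, val_error)
```

Selecting on validation error rather than training error keeps the comparison consistent with how training itself decides when to stop.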
As a second approach, you can increase the number of hidden neurons above 20. A larger hidden layer gives the network more flexibility because the network has more parameters it can optimize. (Increase the layer size gradually. If you make the hidden layer too large, the problem can become under-characterized: the network must optimize more parameters than there are data vectors to constrain them.)
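To see how quickly the parameter count grows relative to the 506 available vectors, you can count the weights and biases directly. The Python sketch below assumes a fully connected two-layer network with one bias per neuron. The function name is hypothetical.

```python
def count_parameters(n_inputs, n_hidden, n_outputs):
    """Weights plus biases of a fully connected two-layer network."""
    hidden = (n_inputs + 1) * n_hidden    # input weights + one bias per hidden neuron
    output = (n_hidden + 1) * n_outputs   # hidden weights + one bias per output neuron
    return hidden + output

# 13 inputs and 1 output, as in the housing example.
for n_hidden in (20, 30, 40):
    print(n_hidden, count_parameters(13, n_hidden, 1))
# 20 -> 301, 30 -> 451, 40 -> 601
```

Under these assumptions the count is 15 × (hidden neurons) + 1, so it exceeds 506 once the hidden layer reaches 34 neurons. Because only part of the data is used for training, the practical limit is lower still.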
Finally, try using additional training data. Providing additional data for the network is more likely to produce a network that generalizes well to new data.
Under the Hood
The remainder of this chapter describes additional techniques for improving network performance. These advanced techniques require you to know how neural networks transform inputs into outputs and how backpropagation training works.
First, the feed-forward network architecture is introduced.
Networks that you train using backpropagation can have more than one hidden layer, which can make learning complex relationships easier for the network. Other architectures add more connections, which might help networks learn.
The default backpropagation training algorithm is Levenberg-Marquardt (trainlm). This is the fastest method in the toolbox, but it can use large amounts of memory. There are ways to reduce this memory requirement, and there are other algorithms that require less memory and might improve generalization.
Another way you might improve generalization is by modifying the default method that divides the data into training, validation, and test sets. You can replace the default function dividerand with another division function, or you can change the relative percentages of vectors that dividerand assigns to each set to values other than 60%, 20%, and 20%.
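The effect of the ratios is easy to see in a small sketch. The Python function below randomly divides vector indices into three sets according to the given ratios. It illustrates the idea only: the function name, the seed argument, and the rounding rule are assumptions and might differ from what dividerand does.

```python
import random

def divide_random(n_vectors, train_ratio=0.6, val_ratio=0.2, test_ratio=0.2, seed=None):
    """Shuffle indices 0..n_vectors-1 and split them by the given ratios."""
    total = train_ratio + val_ratio + test_ratio
    indices = list(range(n_vectors))
    random.Random(seed).shuffle(indices)
    # Assumed rounding rule: round train and validation sizes; test gets the rest.
    n_train = round(n_vectors * train_ratio / total)
    n_val = round(n_vectors * val_ratio / total)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test

train, val, test = divide_random(506, seed=0)
print(len(train), len(val), len(test))  # 304 101 101 with this rounding rule
```

Every vector lands in exactly one set, so the test vectors never influence training or the decision to stop.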
By default, networks trained with backpropagation have three input processing functions, which sim and train apply automatically to network input data. The first function, fixunknowns, re-encodes unknown input values (represented by NaN) into numerical values so that the network can operate on them directly. The second function, removeconstantrows, removes inputs whose values are the same for all input vectors used to create the network. Such inputs contain no information and can cause numerical problems during network initialization or training. The third function, mapminmax, normalizes the input values by mapping their range to [-1 1]. Training is often faster when values are normalized.
removeconstantrows and mapminmax are also the default output processing functions.
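The two steps that serve as defaults for both inputs and outputs can be sketched in a few lines of Python. The functions below illustrate the ideas of removing constant rows and min-max scaling. They are not the toolbox implementations, their names and signatures are hypothetical, and they assume the data contains no NaN values. As in the toolbox data layout, each row is one variable and each column is one vector.

```python
def remove_constant_rows(rows):
    """Drop rows whose values are all equal; also return the kept row indices."""
    kept = [i for i, row in enumerate(rows) if max(row) != min(row)]
    return [rows[i] for i in kept], kept

def map_min_max(row, lo=-1.0, hi=1.0):
    """Linearly map a row so its minimum becomes lo and its maximum becomes hi."""
    x_min, x_max = min(row), max(row)
    if x_max == x_min:
        raise ValueError("constant row cannot be scaled; remove it first")
    scale = (hi - lo) / (x_max - x_min)
    return [lo + (x - x_min) * scale for x in row]

# Toy data: 3 variables (rows) by 3 vectors (columns); the middle row is constant.
data = [[0.0, 5.0, 10.0],
        [4.0, 4.0, 4.0],
        [2.0, 1.0, 3.0]]
cleaned, kept = remove_constant_rows(data)
scaled = [map_min_max(row) for row in cleaned]
print(kept)    # indices of the rows that survived
print(scaled)  # each surviving row now spans -1 to 1
```

The scaling settings found on the training data must be reused when the network is applied to new data. In this sketch you would store x_min and x_max per row for that purpose.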
There are other input/output processing functions that can be used in addition to or instead of the default functions for particular kinds of problems.
© 1984-2009 The MathWorks, Inc.