Recognizing Patterns
Neural networks are good at recognizing patterns as well as fitting functions.
For example, suppose you want to classify a tumor as benign or malignant, based on uniformity of cell size, clump thickness, mitosis, etc. [MuAh94]. You have 699 example cases, each with 9 items of data and the correct classification as benign or malignant.
As with function fitting, there are three ways to solve this problem. The sections that follow cover two of them: command-line functions and the Neural Network Pattern Recognition Tool GUI (nprtool).
Defining a Problem
To define a pattern recognition problem, arrange a set of Q input vectors as columns in a matrix. Then arrange another set of Q target vectors so that they indicate the classes to which the input vectors are assigned. There are two approaches to creating the target vectors.
One approach can be used when there are only two classes; you set each scalar target value to either 1 or 0, indicating which class the corresponding input belongs to. For instance, you can define the exclusive-or classification problem as follows:
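A minimal sketch of such a definition is given here. The variable names inputs and targets are chosen for illustration. Each column of inputs is one input vector, and the element of targets in the same column is its class.

```matlab
% Exclusive-or problem: Q = 4 two-element input vectors, one per column.
inputs  = [0 1 0 1;
           0 0 1 1];

% Scalar targets: 1 when exactly one input element is 1, otherwise 0.
targets = [0 1 1 0];
```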
Alternatively, target vectors can have N elements, where for each target vector, one element is 1 and the others are 0. This defines a problem where inputs are to be classified into N different classes. For example, the following lines show how to define a classification problem that divides the corners of a 5-by-5-by-5 cube into three classes:
- The origin (the first input vector) in one class
- The corner farthest from the origin (the last input vector) in a second class
- All other points in a third class
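The three-class cube problem described above can be sketched as follows. The variable names and the order of the target rows (row 1 for the origin, row 2 for the farthest corner, row 3 for everything else) are assumptions made for this illustration; any consistent row order would work.

```matlab
% The eight corners of the cube, one corner per column (x; y; z).
% The first column is the origin and the last is the farthest corner.
inputs  = [0 0 0 0 5 5 5 5;
           0 0 5 5 0 0 5 5;
           0 5 0 5 0 5 0 5];

% Three-element targets: exactly one element is 1 in every column.
targets = [1 0 0 0 0 0 0 0;    % class 1: the origin
           0 0 0 0 0 0 0 1;    % class 2: the corner farthest from the origin
           0 1 1 1 1 1 1 0];   % class 3: all other corners
```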
Classification problems involving only two classes can be represented using either format. The targets can consist of either scalar 1/0 elements or two-element vectors, with one element being 1 and the other element being 0.
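As a small illustration of the two equivalent formats, the sketch below converts scalar 1/0 targets into two-element target vectors and back. The variable names are hypothetical, and the choice of which row marks which class is an assumption.

```matlab
% Scalar 1/0 targets for a two-class problem (the exclusive-or targets).
scalarTargets = [0 1 1 0];

% Equivalent two-element targets: row 1 marks the class coded as 1,
% row 2 marks the class coded as 0.
vectorTargets = [scalarTargets; 1 - scalarTargets];

% The scalar form can be recovered from the first row.
recovered = vectorTargets(1, :);
```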
The next section demonstrates how to train a network from the command line, after you have defined the problem.
Using Command-Line Functions
- Use the cancer data set as an example. This data set consists of 699 nine-element input vectors and two-element target vectors.
- Load the tumor classification data as follows:
load cancer_dataset
- Create a network. For this example, you use a pattern recognition network, which is a feed-forward network with tan-sigmoid transfer functions in both the hidden layer and the output layer. As in the function-fitting example, use 20 neurons in one hidden layer:
- The network has two output neurons, because there are two categories associated with each input vector.
- Each output neuron represents a category.
- When an input vector of the appropriate category is applied to the network, the corresponding neuron should produce a 1, and the other neurons should output a 0.
- To create a network, enter this command:
net = newpr(cancerInputs,cancerTargets,20);
- Train the network. The pattern recognition network uses the default Scaled Conjugate Gradient algorithm for training. The application randomly divides the input vectors and target vectors into three sets:
- 60% are used for training.
- 20% are used to validate that the network is generalizing and to stop training before overfitting.
- The last 20% are used as a completely independent test of network generalization.
- To train the network, enter this command:
net = train(net,cancerInputs,cancerTargets);
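To make the 60/20/20 division concrete, the following sketch splits 699 case indices at random using only base MATLAB. It illustrates the idea and is not the toolbox's own division code; the variable names and the rounding of the subset sizes are assumptions.

```matlab
Q = 699;                      % number of cases in the cancer data set
order = randperm(Q);          % random ordering of the case indices

nTrain = round(0.60 * Q);     % 60% of the cases for training
nVal   = round(0.20 * Q);     % 20% of the cases for validation

trainInd = order(1:nTrain);
valInd   = order(nTrain+1 : nTrain+nVal);
testInd  = order(nTrain+nVal+1 : end);    % remaining cases, about 20%
```

Every case index lands in exactly one of the three subsets, so the test cases play no part in training or in the decision to stop.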
During training, as in function fitting, the training window opens. This window displays training progress. To interrupt training at any point, click Stop Training.

This example uses the train function. It presents all the input vectors to the network at once in a batch. Alternatively, you can present the input vectors one at a time using the adapt function. Training Styles describes the two training approaches.
In this run, training stopped at iteration 15, after the validation error had increased for six iterations in a row.
- To find the validation error, click Performance in the training window. A plot of the training errors, validation errors, and test errors appears, as shown in the following figure. The best validation performance occurred at iteration 9, and the network at this iteration is returned.
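The stopping rule can be sketched in a few lines of base MATLAB. The validation errors below are made-up values, chosen only so that the best error falls at iteration 9 and the stop at iteration 15; they are not results from the cancer data set. The sketch also assumes that a failure means any iteration that does not improve on the best validation error so far.

```matlab
% Made-up validation errors, one per iteration, for illustration only.
valErr = [0.30 0.22 0.17 0.13 0.10 0.08 0.07 0.065 0.06 ...
          0.061 0.063 0.064 0.066 0.07 0.075 0.08];
maxFail = 6;                  % consecutive failures allowed before stopping

bestErr  = Inf;               % best validation error seen so far
bestIter = 0;                 % iteration at which it occurred
fails    = 0;                 % current run of iterations without improvement
stopIter = numel(valErr);     % used if the rule never triggers

for iter = 1:numel(valErr)
    if valErr(iter) < bestErr
        bestErr  = valErr(iter);
        bestIter = iter;
        fails    = 0;
    else
        fails = fails + 1;
    end
    if fails >= maxFail
        stopIter = iter;      % stop here and keep the network from bestIter
        break
    end
end
```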

- To analyze the network response, click Confusion in the training window. A display of the confusion matrix appears that shows various types of errors that occurred for the final trained network.
- The next figure shows the results.

The diagonal cells in each table show the number of cases that were correctly classified, and the off-diagonal cells show the misclassified cases. The blue cell in the bottom right shows the overall percentage of correctly classified cases (in green) and the overall percentage of misclassified cases (in red). The results for all three data sets (training, validation, and testing) show very good recognition. If you needed even more accurate results, you could try any of the following approaches:
- Reset the initial network weights and biases to new values with init and train again.
- Increase the number of hidden neurons.
- Increase the number of training vectors.
- Increase the number of input values, if more relevant information is available.
- Try a different training algorithm (see Speed and Memory Comparison).
In this case, the network response is satisfactory, and you can now use sim to apply the network to new inputs.
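A minimal sketch of applying the trained network with sim follows. Because no new inputs are defined at this point, it reuses cancerInputs; genuinely new data would be a matrix with nine rows and one column per case. Taking the output neuron with the largest response as the predicted class is an assumption of this sketch, and the variable names other than net, cancerInputs, and cancerTargets are hypothetical.

```matlab
load cancer_dataset                               % cancerInputs, cancerTargets
net = newpr(cancerInputs, cancerTargets, 20);
net = train(net, cancerInputs, cancerTargets);

% Apply the trained network; each column of outputs holds two responses.
outputs = sim(net, cancerInputs);

% Treat the output neuron with the largest response as the predicted class.
[maxOut, predicted] = max(outputs, [], 1);
[maxTgt, actual]    = max(cancerTargets, [], 1);

% Fraction of cases whose predicted class matches the target class.
fractionCorrect = mean(predicted == actual);
```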
To get more experience in command-line operations, here are some tasks you can try:
- During training, open a plot window (such as the confusion plot), and watch it animate.
- Plot from the command line with functions such as plotconfusion, plotroc, plottrainstate, and plotperform. (For more information on using these functions, see their reference pages.)
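As a starting point for the second task, the sketch below calls the four plotting functions from the command line. It assumes the training record is captured as a second output of train (named tr here) and that the outputs are computed over the whole data set; check the reference pages for the argument forms supported by your release.

```matlab
load cancer_dataset                               % cancerInputs, cancerTargets
net = newpr(cancerInputs, cancerTargets, 20);

% Capture the training record as well as the trained network.
[net, tr] = train(net, cancerInputs, cancerTargets);
outputs = sim(net, cancerInputs);

plotperform(tr)                        % training, validation, and test errors
plottrainstate(tr)                     % training state over the iterations
plotconfusion(cancerTargets, outputs)  % confusion matrix
plotroc(cancerTargets, outputs)        % receiver operating characteristic
```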
Using the Neural Network Pattern Recognition Tool GUI
- Open the Neural Network Pattern Recognition Tool window with this command:
nprtool
- Click Next to proceed. The Select Data window opens.
- Click Load Example Data Set. The Pattern Recognition Data Set Chooser window opens.

- In this window, select Simple Classes, and click Import. You return to the Select Data window.
- Click Next to continue to the Validate and Test Data window, shown in the following figure.
- Validation and test data sets are each set to 15% of the original data.
- Click Next.
- The number of hidden neurons is set to 20. You can change this number in another run if the network does not perform as well as you expect.
- Click Next.
- Click Train.
- The training continues for 55 iterations.
- Under the Plots pane, click Confusion in the Neural Network Pattern Recognition Tool.
- The next figure shows the confusion matrices for training, testing, and validation, and the three kinds of data combined. The network's outputs are almost perfect, as you can see by the high numbers of correct responses in the green squares and the low numbers of incorrect responses in the red squares. The lower right blue squares illustrate the overall accuracies.
- Plot the Receiver Operating Characteristic (ROC) curve. Under the Plots pane, click Receiver Operating Characteristic in the Neural Network Pattern Recognition Tool.

- The colored lines in each axis represent the ROC curves for each of the four categories of this simple test problem. The ROC curve is a plot of the true positive rate (sensitivity) versus the false positive rate (1 - specificity) as the threshold is varied. A perfect test would show points in the upper-left corner, with 100% sensitivity and 100% specificity. For this simple problem, the network performs almost perfectly.
- In the Neural Network Pattern Recognition Tool, click Next to evaluate the network.

- At this point, you can test the network against new data.
If you are dissatisfied with the network's performance on the original or new data, you can train it again, increase the number of neurons, or perhaps get a larger training data set.
- When you are satisfied with the network performance, click Next.
- Use the buttons on this screen to save your results.
- You now have the network saved as net1 in the workspace. You can perform additional tests on it or put it to work on new inputs using the sim function.
- If you click Generate M-File, the tool creates an M-file containing commands that reproduce, from the command line, the steps you have just performed. Generating an M-file is a good way to learn how to use the command-line operations of the Neural Network Toolbox™ software.
- When you have saved your results, click Finish.
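The ROC curve plotted during the steps above shows the true positive rate against the false positive rate as the decision threshold is varied. The following base MATLAB sketch computes those two rates for one class. The scores and labels are made-up values for illustration, not outputs of the trained network, and the rule that a score at or above the threshold counts as positive is an assumption.

```matlab
% Made-up network responses for one class and the true labels (1 = positive).
scores = [0.95 0.85 0.70 0.60 0.40 0.30 0.20 0.10];
labels = [1    1    0    1    1    0    0    0];

% Sweep the threshold from the highest score down to the lowest.
thresholds = sort(unique(scores), 'descend');
tpr = zeros(size(thresholds));     % true positive rate (sensitivity)
fpr = zeros(size(thresholds));     % false positive rate (1 - specificity)

for k = 1:numel(thresholds)
    predicted = scores >= thresholds(k);
    tpr(k) = sum(predicted & (labels == 1)) / sum(labels == 1);
    fpr(k) = sum(predicted & (labels == 0)) / sum(labels == 0);
end
```

Plotting tpr against fpr, for example with plot(fpr, tpr), traces the curve. The closer the points lie to the upper-left corner, the better the classifier separates the class from the rest.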