Preprocessing and Postprocessing
Neural network training can be made more efficient if you perform certain preprocessing steps on the network inputs and targets. This section describes several preprocessing routines that you can use. The most common of these are provided automatically when you create a network.
Network-input processing functions transform inputs into a form better suited to the network. Processing functions associated with a network output transform targets into a form better suited to network training, and reverse-transform network outputs back to the characteristics of the original target data.
Most of the network creation functions in the toolbox, including the backpropagation network creation functions such as newff, automatically assign processing functions to your network inputs and outputs. These functions transform the input and target values you provide into values that are better suited for the network.
You can override the default input and output processing functions when you call a network creation function, or by adjusting network properties after you create the network.
To see a cell array list of processing functions assigned to the input of a network, access this property:
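A minimal sketch of that access, assuming a two-layer network created with newff on made-up data (the matrices p and t and the hidden layer size of 5 are illustrative choices only):

```matlab
% Hypothetical training data: 2 input rows, 1 target row, 21 samples.
x = linspace(0,1,21);
p = [10*x; 5 - 3*x.^2];
t = 2*p(1,:) + p(2,:);

net = newff(p,t,5);                   % 5 hidden neurons (arbitrary choice)
inFcns = net.inputs{1}.processFcns    % cell array of processing function names
```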
To view the processing functions returned by the output of a two-layer network, access this network property:
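As a sketch, again assuming a two-layer network built with newff on illustrative data, the output of the second layer is indexed with 2:

```matlab
% Hypothetical training data: 2 input rows, 1 target row, 21 samples.
x = linspace(0,1,21);
p = [10*x; 5 - 3*x.^2];
t = 2*p(1,:) + p(2,:);

net = newff(p,t,5);                    % layer 2 is the output layer
outFcns = net.outputs{2}.processFcns   % cell array of processing function names
```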
You can use these properties to change the processing functions you want your network to apply to the inputs and outputs. However, we recommend that you use the defaults.
Several processing functions have parameters that customize their operation. You can access or change the parameters of the ith input processing function for the network input as follows:
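A hedged sketch of reading and changing input processing parameters. It assumes mapminmax is among the input processing functions and looks up its index i rather than hard-coding it; the data and the new ymin value are illustrative:

```matlab
% Hypothetical training data: 2 input rows, 1 target row, 21 samples.
x = linspace(0,1,21);
p = [10*x; 5 - 3*x.^2];
t = 2*p(1,:) + p(2,:);
net = newff(p,t,5);

% Locate mapminmax in the list of input processing functions.
i = find(strcmp(net.inputs{1}.processFcns,'mapminmax'));

params = net.inputs{1}.processParams{i}     % read the current parameters
net.inputs{1}.processParams{i}.ymin = 0;    % change the lower bound of the range
```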
You can access or change the parameters of the ith output processing function for the network output associated with the second layer, as follows:
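The same pattern, sketched for the output of the second layer. The data and the new ymax value are illustrative assumptions:

```matlab
% Hypothetical training data: 2 input rows, 1 target row, 21 samples.
x = linspace(0,1,21);
p = [10*x; 5 - 3*x.^2];
t = 2*p(1,:) + p(2,:);
net = newff(p,t,5);

% Locate mapminmax in the list of output processing functions.
i = find(strcmp(net.outputs{2}.processFcns,'mapminmax'));

params = net.outputs{2}.processParams{i}      % read the current parameters
net.outputs{2}.processParams{i}.ymax = 0.9;   % change the upper bound of the range
```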
For backpropagation network creation functions, such as newff, the default input processing functions are fixunknowns, removeconstantrows, and mapminmax. For outputs, the default processing functions are removeconstantrows and mapminmax.
Min and Max (mapminmax)
Before training, it is often useful to scale the inputs and targets so that they always fall within a specified range. The function mapminmax scales inputs and targets so that they fall in the range [-1,1]. The following code illustrates how to use this function.
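A minimal sketch of this step. The data in p and t, the hidden layer size, and the showWindow setting are assumptions made so the example is self-contained:

```matlab
% Hypothetical training data: 2 input rows, 1 target row, 21 samples.
x = linspace(0,1,21);
p = [10*x; 5 - 3*x.^2];
t = 2*p(1,:) + p(2,:);

[pn,ps] = mapminmax(p);               % scale each input row to [-1,1]
[tn,ts] = mapminmax(t);               % scale each target row to [-1,1]

net = newff(pn,tn,5);                 % 5 hidden neurons (arbitrary choice)
net.trainParam.showWindow = false;    % suppress the training window
net = train(net,pn,tn);               % train on the normalized data
```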
The original network inputs and targets are given in the matrices p and t. The normalized inputs and targets pn and tn that are returned will all fall in the interval [-1,1]. The structures ps and ts contain the settings, in this case the minimum and maximum values of the original inputs and targets. After the network has been trained, the ps settings should be used to transform any future inputs that are applied to the network. They effectively become a part of the network, just like the network weights and biases.
If mapminmax is used to scale the targets, then the output of the network will be trained to produce outputs in the range [-1,1]. To convert these outputs back into the same units that were used for the original targets, use the settings ts. The following code simulates the network that was trained in the previous code, and then converts the network output back into the original units.
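A sketch of simulation followed by the reverse transformation. The setup lines repeat illustrative data and training so the example runs on its own:

```matlab
% Hypothetical training data and training, as assumed for this sketch.
x = linspace(0,1,21);
p = [10*x; 5 - 3*x.^2];
t = 2*p(1,:) + p(2,:);
[pn,ps] = mapminmax(p);
[tn,ts] = mapminmax(t);
net = newff(pn,tn,5);
net.trainParam.showWindow = false;
net = train(net,pn,tn);

an = sim(net,pn);                     % network output in normalized units
a = mapminmax('reverse',an,ts);       % back to the units of the targets t
```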
The network output an corresponds to the normalized targets tn. The unnormalized network output a is in the same units as the original targets t.
If mapminmax is used to preprocess the training set data, then whenever the trained network is used with new inputs, they should be preprocessed with the minimum and maximum values that were computed for the training set and stored in the settings ps. The following code applies a new set of inputs to the network already trained.
Mean and Standard Deviation (mapstd)
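A sketch of applying the stored settings to new data. The matrix pnew is made up for illustration, and the setup repeats the assumed training data:

```matlab
% Hypothetical training data and training, as assumed for this sketch.
x = linspace(0,1,21);
p = [10*x; 5 - 3*x.^2];
t = 2*p(1,:) + p(2,:);
[pn,ps] = mapminmax(p);
[tn,ts] = mapminmax(t);
net = newff(pn,tn,5);
net.trainParam.showWindow = false;
net = train(net,pn,tn);

pnew = [2.5 7; 4.8 3.5];                 % two new input vectors (illustrative)
pnewn = mapminmax('apply',pnew,ps);      % reuse the training-set min and max
anewn = sim(net,pnewn);                  % normalized network output
anew = mapminmax('reverse',anewn,ts);    % output in the original target units
```

Note that 'apply' does not recompute the minimum and maximum from pnew, so new values outside the training range map outside [-1,1].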
Another approach for scaling network inputs and targets is to normalize the mean and standard deviation of the training set. The function mapstd normalizes the inputs and targets so that they will have zero mean and unity standard deviation. The following code illustrates the use of mapstd.
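A minimal sketch of normalization with mapstd. The data, hidden layer size, and showWindow setting are assumptions made so the example is self-contained:

```matlab
% Hypothetical training data: 2 input rows, 1 target row, 21 samples.
x = linspace(0,1,21);
p = [10*x; 5 - 3*x.^2];
t = 2*p(1,:) + p(2,:);

[pn,ps] = mapstd(p);                  % each input row: zero mean, unit std
[tn,ts] = mapstd(t);                  % each target row: zero mean, unit std

net = newff(pn,tn,5);                 % 5 hidden neurons (arbitrary choice)
net.trainParam.showWindow = false;    % suppress the training window
net = train(net,pn,tn);               % train on the normalized data
```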
The original network inputs and targets are given in the matrices p and t. The normalized inputs and targets pn and tn that are returned will have zero means and unity standard deviation. The settings structures ps and ts contain the means and standard deviations of the original inputs and original targets. After the network has been trained, you should use these settings to transform any future inputs that are applied to the network. They effectively become a part of the network, just like the network weights and biases.
If mapstd is used to scale the targets, then the output of the network is trained to produce outputs with zero mean and unity standard deviation. To convert these outputs back into the same units that were used for the original targets, use ts. The following code simulates the network that was trained in the previous code, and then converts the network output back into the original units.
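A sketch of simulation followed by the reverse transformation with mapstd, using the same illustrative data and training as assumed elsewhere in this section:

```matlab
% Hypothetical training data and training, as assumed for this sketch.
x = linspace(0,1,21);
p = [10*x; 5 - 3*x.^2];
t = 2*p(1,:) + p(2,:);
[pn,ps] = mapstd(p);
[tn,ts] = mapstd(t);
net = newff(pn,tn,5);
net.trainParam.showWindow = false;
net = train(net,pn,tn);

an = sim(net,pn);                  % network output in normalized units
a = mapstd('reverse',an,ts);       % back to the units of the targets t
```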
The network output an corresponds to the normalized targets tn. The unnormalized network output a is in the same units as the original targets t.
If mapstd is used to preprocess the training set data, then whenever the trained network is used with new inputs, you should preprocess them with the means and standard deviations that were computed for the training set using ps. The following commands apply a new set of inputs to the network already trained:
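A sketch of those commands, with a made-up pnew and the same assumed training setup:

```matlab
% Hypothetical training data and training, as assumed for this sketch.
x = linspace(0,1,21);
p = [10*x; 5 - 3*x.^2];
t = 2*p(1,:) + p(2,:);
[pn,ps] = mapstd(p);
[tn,ts] = mapstd(t);
net = newff(pn,tn,5);
net.trainParam.showWindow = false;
net = train(net,pn,tn);

pnew = [2.5 7; 4.8 3.5];              % two new input vectors (illustrative)
pnewn = mapstd('apply',pnew,ps);      % reuse the training-set mean and std
anewn = sim(net,pnewn);               % normalized network output
anew = mapstd('reverse',anewn,ts);    % output in the original target units
```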
Principal Component Analysis (processpca)
In some situations, the dimension of the input vector is large, but the components of the vectors are highly correlated (redundant). It is useful in this situation to reduce the dimension of the input vectors. An effective procedure for performing this operation is principal component analysis. This technique has three effects: it orthogonalizes the components of the input vectors (so that they are uncorrelated with each other), it orders the resulting orthogonal components (principal components) so that those with the largest variation come first, and it eliminates those components that contribute the least to the variation in the data set. The following code illustrates the use of processpca, which performs a principal-component analysis using the processing setting maxfrac of 0.02.
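A minimal sketch of this two-step preprocessing. The input matrix is made up so that its first two rows are nearly redundant, which means at least one principal component falls below the 2% threshold:

```matlab
% Hypothetical inputs: 3 rows, 21 samples; rows 1 and 2 are almost collinear.
x = linspace(0,1,21);
p = [x; 2*x + 0.001*sin(40*x); cos(3*x)];

[pn,ps1] = mapstd(p);                    % normalize first: zero mean, unit std
[ptrans,ps2] = processpca(pn,0.02);      % drop components contributing < 2%

size(ptrans)                             % fewer rows than p, same number of columns
```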
The input vectors are first normalized, using mapstd, so that they have zero mean and unity variance. This is a standard procedure when using principal components. In this example, the second argument passed to processpca is 0.02. This means that processpca eliminates those principal components that contribute less than 2% to the total variation in the data set. The matrix ptrans contains the transformed input vectors. The settings structure ps2 contains the principal component transformation matrix. After the network has been trained, these settings should be used to transform any future inputs that are applied to the network. It effectively becomes a part of the network, just like the network weights and biases. If you multiply the normalized input vectors pn by the transformation matrix transMat, you obtain the transformed input vectors ptrans.
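The multiplication mentioned above can be checked with a short sketch. It assumes the settings structure returned by processpca exposes the transformation matrix in a field named transform; treat that field name as an assumption and inspect ps2 in your release if it differs. The data is illustrative:

```matlab
% Hypothetical inputs: 3 rows, 21 samples; rows 1 and 2 are almost collinear.
x = linspace(0,1,21);
p = [x; 2*x + 0.001*sin(40*x); cos(3*x)];
[pn,ps1] = mapstd(p);
[ptrans,ps2] = processpca(pn,0.02);

transMat = ps2.transform;          % assumed field holding the transformation matrix
ptrans2 = transMat * pn;           % should reproduce ptrans
maxDiff = max(abs(ptrans2(:) - ptrans(:)))
```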
If processpca is used to preprocess the training set data, then whenever the trained network is used with new inputs, you should preprocess them with the transformation matrix that was computed for the training set, using ps2. The following code applies a new set of inputs to a network already trained.
    pnewn = mapstd('apply',pnew,ps1);
    pnewtrans = processpca('apply',pnewn,ps2);
    a = sim(net,pnewtrans);
Principal component analysis is not reliably reversible. Therefore, it is recommended only for input processing. Outputs require reversible processing functions.
Processing Unknown Inputs (fixunknowns)
If you have input data with unknown values, you can represent them with NaN values. For example, here are five 2-element vectors with unknown values in the first element of two of the vectors:
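A sketch of such a matrix, with values chosen to be consistent with the description that follows (the mean of the known values in the first row is 2):

```matlab
% Five 2-element input vectors, one per column.
% The first element is unknown (NaN) in the second and fifth vectors.
p1 = [1 NaN  3 2 NaN;
      3   1 -1 2   4]
```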
The network will not be able to process the NaN values properly. Use the function fixunknowns to transform each row with NaN values (in this case only the first row) into two rows that encode that same information numerically.
Here is how the first row of values was recoded as two rows.
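A sketch of the call and its result, using illustrative values in which the known entries of the first row average to 2:

```matlab
p1 = [1 NaN  3 2 NaN;
      3   1 -1 2   4];

[p2,ps] = fixunknowns(p1)   % ps stores the settings for reuse on new data

% p2 =
%      1     2     3     2     2    <- first row, NaN replaced by the row mean
%      1     0     1     1     0    <- 1 = known, 0 = unknown
%      3     1    -1     2     4    <- original second row, unchanged
```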
The first new row is the original first row, but with the mean value for that row (in this case 2) replacing all NaN values. The elements of the second new row are now either 1, indicating the original element was a known value, or 0 indicating that it was unknown. The original second row is now the new third row. In this way both known and unknown values are encoded numerically in a way that lets the network be trained and simulated.
Whenever supplying new data to the network, you should transform the inputs in the same way, using the settings ps returned by fixunknowns when it was used to transform the training input data.
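A sketch of reusing the settings on new data. The matrix pnew is made up; note that the unknown value is filled with the mean stored from the training data, not a mean computed from pnew:

```matlab
p1 = [1 NaN  3 2 NaN;
      3   1 -1 2   4];
[p2,ps] = fixunknowns(p1);            % settings computed on the training inputs

pnew = [NaN 4;
          2 0];                       % new inputs, one unknown value (illustrative)
pnew2 = fixunknowns('apply',pnew,ps)

% pnew2 =
%      2     4
%      0     1
%      2     0
```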
The function fixunknowns is only recommended for input processing. Unknown targets represented by NaN values can be handled directly by the toolbox learning algorithms. For instance, performance functions used by backpropagation algorithms recognize NaN values as unknown or unimportant values.
Representing Unknown or Don't Care Targets
Unknown or "don't care" targets can also be represented with NaN values. We do not want unknown target values to have an impact on training, but if a network has several outputs, some elements of any target vector may be known while others are unknown. One solution would be to remove the partially unknown target vector and its associated input vector from the training set, but that involves the loss of the good target values. A better solution is to represent those unknown targets with NaN values. All the performance functions of the toolbox will ignore those targets for purposes of calculating performance and derivatives of performance.
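The idea can be illustrated with a few lines of base MATLAB. This is only a sketch of the principle (a mean squared error computed over known targets), not the toolbox's own implementation; the target and output values are made up:

```matlab
t = [1 NaN 3;
     4   5 NaN];            % targets, NaN marks unknown or "don't care" values
a = [1.5 9 2.5;
     4   4 9];              % hypothetical network outputs

e = t - a;                  % errors; NaN wherever the target is unknown
known = ~isnan(t);          % mask of targets that should count
perf = mean(e(known).^2)    % mean squared error over known targets: 0.375
```

The outputs 9 and 9 paired with NaN targets do not affect the result, while the known targets in the same vectors still contribute.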
Posttraining Analysis (postreg)
The performance of a trained network can be measured to some extent by the errors on the training, validation, and test sets, but it is often useful to investigate the network response in more detail. One option is to perform a regression analysis between the network response and the corresponding targets. The routine postreg is designed to perform this analysis.
The following commands illustrate how to perform a regression analysis on the network trained in Summary and Discussion of Early Stopping and Regularization.
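A hedged sketch of the call. Instead of the network from the referenced section, it uses synthetic outputs a that are an exact linear function of the targets t, so the expected slope, intercept, and R-value are known in advance. In practice a would be the unnormalized network output. It assumes postreg is available in your toolbox release:

```matlab
t = linspace(-1,1,21);      % hypothetical targets
a = 0.98*t + 0.01;          % stand-in for unnormalized network outputs

[m,b,r] = postreg(a,t)      % slope, y-intercept, correlation; also plots the fit
```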
The network output and the corresponding targets are passed to postreg. It returns three parameters. The first two, m and b, correspond to the slope and the y-intercept of the best linear regression relating targets to network outputs. If there were a perfect fit (outputs exactly equal to targets), the slope would be 1, and the y-intercept would be 0. In this example, you can see that the numbers are very close. The third variable returned by postreg is the correlation coefficient (R-value) between the outputs and targets. It is a measure of how well the variation in the output is explained by the targets. If this number is equal to 1, then there is perfect correlation between targets and outputs. In the example, the number is very close to 1, which indicates a good fit.
The following figure illustrates the graphical output provided by postreg. The network outputs are plotted versus the targets as open circles. The best linear fit is indicated by a dashed line. The perfect fit (output equal to targets) is indicated by the solid line. In this example, it is difficult to distinguish the best linear fit line from the perfect fit line because the fit is so good.
© 1984-2009 The MathWorks, Inc.