ALWAYS use TANSIG for hidden layers and normalize inputs so that the means are approximately zero; let the bias b1 set the levels. I favor zero-mean/unit-variance inputs via mapstd or zscore BEFORE calling CONFIGURE or TRAIN. This not only prevents the learning from being dominated by low-significance inputs with large magnitudes (possibly via sigmoid saturation), it also makes it easy to recognize outliers, which may require additional preprocessing (e.g., truncation or deletion).
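A minimal sketch of the standardize-then-check idea, assuming made-up data (3 input variables in rows, 200 samples in columns), one planted outlier, and an arbitrary threshold of 4 standard deviations. The variable names and the threshold are illustrative only. If you prefer zscore, note that it works down columns by default, so zscore(x,0,2) is needed when variables are rows.

```matlab
% Hypothetical data: 3 input variables (rows) by 200 samples (columns).
rng(0);                               % fixed seed, only for repeatability
x = [ 5 + 2*randn(1,200); 1000*rand(1,200); 0.01*randn(1,200) ];
x(1,17) = 60;                         % plant one artificial outlier

% mapstd standardizes each ROW to zero mean and unit variance.
[xn, xsettings] = mapstd(x);

% Row-wise extremes, now in units of standard deviations.
extremes = minmax(xn)                 % one [min max] pair per input variable

% Flag samples beyond the assumed threshold.
threshold = 4;
isOutlier = any(abs(xn) > threshold, 1);
outlierColumns = find(isOutlier)

% One option is truncation; deleting the flagged columns is the other.
xclipped = min(max(xn, -threshold), threshold);
```

Whether to truncate or delete depends on the problem. If samples are deleted, remember to delete the corresponding target columns as well.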
For pattern recognition the target matrix should contain unit vector columns, each with a single 1 and zeros elsewhere. VEC2IND and IND2VEC allow easy transformation to and from the integer class indices. The corresponding output transfer function should be PURELIN, LOGSIG or SOFTMAX. This allows the outputs to be interpreted as consistent estimates of posterior probabilities, conditional on the input, even though only SOFTMAX constrains the outputs to sum to 1.
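For example, a round trip between integer class indices and unit-column targets might look like this. The class labels are made up for illustration.

```matlab
classIndex = [1 3 2 3 1];             % hypothetical integer class labels
target = full(ind2vec(classIndex));   % ind2vec returns sparse; full gives a 3-by-5 matrix of 0/1 columns
recovered = vec2ind(target);          % back to integer class indices
roundTripOK = isequal(recovered, classIndex)
```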
However, MATLAB defaults are different.
1. There is no checking for outliers.
2. Using the default command
net = patternnet
will list all of the net default properties.
3. In particular
inputprocessfunctions = net.inputs{1}.processFcns
outputprocessfunctions = net.outputs{2}.processFcns
remove constant variables and map inputs and outputs to the closed range [-1 1]
also
trainfunction = net.trainFcn
(which I know works well with the unscaled integer targets of zeros and ones) probably works well with the scaled integer targets of minus ones and plus ones.
in addition
layer1transferfunction = net.layers{1}.transferFcn
layer2transferfunction = net.layers{2}.transferFcn
which is fine for the [-1 1] default scaled output.
Your code does not allow for these defaults.
There are several ways to go. You could override all of the MATLAB defaults that are not compatible with the recommendations in my first two paragraphs. However, the best thing to do is to accommodate the MATLAB defaults as much as possible.
My suggestion:
1. If there is a possibility of outliers
a. Use ZSCORE to standardize inputs and check for outliers using MINMAX. Truncate or delete outliers depending on your particular problem. Constant rows with zero variance will be converted to rows of zeros.
b. Once the outlier question is resolved you can
i. Either keep the standardized variables
ii. Or transform back to the original variables.
2. Convert the targets to the unit column format (help ind2vec)
3. Initialize the RNG in case you want to duplicate the following runs
4. Use PATTERNNET with defaults.
5. If you are going to design multiple nets in a loop over random initial weights and/or include an outer loop to search multiple candidates for the best choice for number of hidden nodes,
a. Save the initial state of the RNG before each design.
b. Initialize the weights using CONFIGURE before using TRAIN
6. If you are performing classification or pattern recognition, rank the nets by classification error rate, NOT by the mse performance function:
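RNGstate(i,j) = rng;                          % save the RNG state so design (i,j) can be duplicated
net = configure(net,input,target);            % initialize the weights
[net, tr, output] = train(net,input,target);
trueclass = vec2ind(target);                  % true class indices
N = length(trueclass);
assignedclass = vec2ind(output);              % class indices assigned by the net
Nerr = sum(assignedclass ~= trueclass);
PctErr(i,j) = 100*Nerr/N;                     % percent classification error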
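A sketch of how those lines might sit inside the double loop of steps 3 to 6. The dataset (iris_dataset, shipped with the toolbox), the candidate hidden node counts, and the number of trials are assumptions made for illustration. The RNG states are stored in a cell array here, which is just one convenient choice.

```matlab
[x, t] = iris_dataset;                % example data shipped with the toolbox
trueclass = vec2ind(t);
N = length(trueclass);

Hvalues = [2 4 8];                    % assumed candidate numbers of hidden nodes
Ntrials = 5;                          % assumed number of random initializations
PctErr = zeros(Ntrials, numel(Hvalues));
RNGstate = cell(Ntrials, numel(Hvalues));

rng(0);                               % initialize the RNG once
for j = 1:numel(Hvalues)              % outer loop over hidden node candidates
    for i = 1:Ntrials                 % inner loop over random initial weights
        RNGstate{i,j} = rng;          % save the state before each design
        net = patternnet(Hvalues(j));
        net.trainParam.showWindow = false;    % suppress the training window
        net = configure(net, x, t);   % initialize the weights
        [net, tr, output] = train(net, x, t);
        assignedclass = vec2ind(output);
        PctErr(i,j) = 100*sum(assignedclass ~= trueclass)/N;
    end
end

% Rank by error rate and locate the best design.
[minErr, idx] = min(PctErr(:));
[ibest, jbest] = ind2sub(size(PctErr), idx);
```

To duplicate the best design later, restore the saved state with rng(RNGstate{ibest,jbest}) and repeat the patternnet, configure, and train calls with Hvalues(jbest).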
8. If you need a breakdown of train, val, and test errors for each class, use the training record tr. See the properties of tr via the command
9. Once a design is chosen, rerun and save
a. The input and output settings from using mapminmax on the training data
b. The weights
10. To use the weights with analytic formulas instead of the net and/or sim function:
a. Normalize inputs and outputs using the mapminmax settings of the training data.
b. Apply the default tansig transfer function in both layers to get normalized outputs from normalized inputs
c. Use the output settings to get the unnormalized output
d. Compare with the original obtained from the net.
11. First try to understand by using the default value for number of hidden nodes and only one choice of initial weights. Regardless of whether it is a good design or not, compare the two methods. Once they match, you can search for the best design using a double loop.
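Steps 9 to 11 can be sketched as follows for a single default net, assuming the iris_dataset example data. Rather than hard-coding mapminmax and tansig, this sketch reads the processing functions and transfer functions from the net itself, because the defaults (especially for the output layer) can differ between toolbox releases. The property names used are the ones I expect in recent releases; check them against your own version.

```matlab
[x, t] = iris_dataset;                % example data shipped with the toolbox
N = size(x, 2);

rng(0);                               % fixed seed so the run can be repeated
net = patternnet;                     % default number of hidden nodes
net.trainParam.showWindow = false;    % suppress the training window
net = configure(net, x, t);
net = train(net, x, t);
ynet = net(x);                        % reference output from the net itself

% a. Normalize the inputs with the settings stored in the net.
xn = x;
infcn = net.inputs{1}.processFcns;
inset = net.inputs{1}.processSettings;
for k = 1:numel(infcn)
    xn = feval(infcn{k}, 'apply', xn, inset{k});
end

% b. Propagate through both layers using the stored weights and biases.
IW = net.IW{1,1};  b1 = net.b{1};
LW = net.LW{2,1};  b2 = net.b{2};
a1 = feval(net.layers{1}.transferFcn, IW*xn + repmat(b1, 1, N));
yn = feval(net.layers{2}.transferFcn, LW*a1 + repmat(b2, 1, N));

% c. Undo the output processing, last function first.
y = yn;
outfcn = net.outputs{2}.processFcns;
outset = net.outputs{2}.processSettings;
for k = numel(outfcn):-1:1
    y = feval(outfcn{k}, 'reverse', y, outset{k});
end

% d. Compare with the output obtained from the net.
maxAbsDiff = max(abs(y(:) - ynet(:)))
```

If maxAbsDiff is not at roundoff level, check which processing functions and transfer functions your release actually uses before moving on to the double loop.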
Hope this helps.
Thank you for formally accepting my answer
Greg