Logsig activation function in irradiance post-processing
Hello,
I have irradiance and temperature forecasts, and I'm trying to improve the irradiance forecasts using a neural network for my Master's thesis. For that, I'm using the fitnet function (is this an MLP?). I'm currently testing networks with one and two hidden layers of different sizes.
My question is mainly about the activation functions in the hidden layers and in the output layer. I have normalized the irradiance (both the forecasts and the targets) and the temperature, so they range from 0 to 1 (for irradiance, at least, [0, 1] is the natural normalization range - it doesn't make sense to use [-1, 1]). As such, I have removed mapminmax from the preprocessing:
net.input.processFcns = {'removeconstantrows'};
net.output.processFcns = {'removeconstantrows'};
It makes sense, right?
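One way to make dropping mapminmax safe is to assert up front that the data really is already in [0, 1] before training; a minimal sketch (the variable names X for inputs and T for targets are hypothetical, not from the attached code):

```matlab
% Sanity check before dropping mapminmax: data must already be in [0,1].
% X (inputs: normalized irradiance and temperature) and T (targets) are
% hypothetical variable names for illustration.
assert(all(X(:) >= 0 & X(:) <= 1), 'Inputs not in [0,1] - keep mapminmax.');
assert(all(T(:) >= 0 & T(:) <= 1), 'Targets not in [0,1] - keep mapminmax.');
```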
Additionally, having read the Neural Network Toolbox User's Guide, I saw that the default transfer function is tansig, which outputs in the range [-1, 1], and changed it to logsig, which outputs in [0, 1]. I read in the guide that "...if you want to constrain the outputs of a network (such as between 0 and 1), then the output layer should use a sigmoid transfer function (such as logsig)." My problem is that I don't see different results when using tansig and logsig. I actually think (this has not been thoroughly tested yet) that logsig in the output layer gives slightly worse results. Do these results make sense? And does it make sense to use logsig?
net.layers{1}.transferFcn = 'logsig';
net.layers{2}.transferFcn = 'logsig';
net.layers{3}.transferFcn = 'logsig'; %If using 2 hidden layers
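For reference, here is a minimal end-to-end sketch of the one-hidden-layer setup described above, on synthetic data (the real inputs and targets come from the attached files; this assumes the Neural Network Toolbox):

```matlab
% Minimal sketch (synthetic data) of a one-hidden-layer fitnet with logsig
% in the hidden and output layers. Requires the Neural Network Toolbox.
X = rand(2, 500);   % inputs: stand-ins for normalized irradiance and temperature
T = rand(1, 500);   % target: stand-in for normalized irradiance, in [0,1]

net = fitnet(10);                        % one hidden layer, 10 neurons
net.input.processFcns  = {'removeconstantrows'};
net.output.processFcns = {'removeconstantrows'};
net.layers{1}.transferFcn = 'logsig';    % hidden layer
net.layers{2}.transferFcn = 'logsig';    % output layer: bounds outputs to (0,1)
net.divideFcn = 'divideint';             % deterministic interleaved split

[net, tr] = train(net, X, T);
Y = net(X);                              % predictions, in (0,1) because of logsig
```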
Also, is it important (and even possible) to "tell" the network that input 1 is irradiance, the same quantity as the only output? (I mean, should the network know that I have G and T as inputs and G as the output, or is that completely irrelevant, i.e. it just treats them as X and Y in, Z out?)
One last question: I have used
net.divideFcn = 'divideint'; %Interleaved division
Does this guarantee that in all trainings the test set is composed of the same elements (for example, that entries 7, 14 and 21 are always used for testing)?
I'm sorry for the long post, I really hope someone can enlighten me! If it matters, I'm attaching my data and code.
Thank you,
Bernardo Fonseca
1 Comment
Greg Heath
on 2 May 2016
fitnet is an MLP
1 hidden layer is sufficient for a universal approximator
Scaled inputs should be relatively symmetric about 0, and hidden node transfer functions should be tansig (tanh)
Outliers should be removed or modified. It is easier when inputs and targets are first standardized to zero-mean & unit-variance
Output transfer functions are usually linear unless there are mathematical or physical reasons why the outputs should be bounded. Then logsig or tansig may be appropriate.
You have the choice of removing the default net normalizations or leaving them in and using the appropriate normalizations before and/or after calling the net.
For a classifier, outputs should be estimates of the output probabilities conditional on the input. Softmax is appropriate for exclusive classes with {0,1} unit vector targets. Logsig is appropriate for nonexclusive classes with nonnegative unit sum targets.
divideint yields 1,4,7,... for training data, 2,5,8,... for validation data and 3,6,9,... for test data.
Hope this helps.
Greg
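Greg's interleaved pattern can be checked directly by calling the divideint function on its own (it is the same function net.divideFcn names), which also answers the original question: the split depends only on the sample count and the ratios, so the test set is the same on every run. A minimal sketch with equal thirds:

```matlab
% divideint is deterministic: with a fixed sample count and fixed ratios
% it always returns the same interleaved index sets.
[trainInd, valInd, testInd] = divideint(9, 1/3, 1/3, 1/3);
% Expected, per the interleaving described above:
% trainInd = [1 4 7], valInd = [2 5 8], testInd = [3 6 9]
```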