Normalizing input data & dividing data for training, validation, and test
Could you help me, please? I have two questions about neural networks for solar irradiance forecasting. I used an MLP model (fitting) with one hidden layer, 7 inputs, and 1 output (solar irradiation). My questions are the following:
- Is it necessary to use the following commands to normalize my input data? (I use a sigmoid activation function in the hidden layer and a linear function in the output layer.)
net.inputs{1}.processFcns = {'removeconstantrows','mapminmax'};
net.outputs{2}.processFcns = {'removeconstantrows','mapminmax'};
Or can I just use the simple mathematical formula: In = (Inn - Imin)/(Imax - Imin)
while In: normalized input; Inn: non-normalized input?
- The second question is about dividing the data for training. This is my code for the division:
inputs = A'; % used for training
targets = B'; % used for training
inputsTesting=C'; % used for test unseen by neural network
targetsTesting=D'; %used for test unseen by neural network
% Setup Division of Data for Training, Validation, Testing
net.divideFcn = 'dividerand'; % Divide data randomly
net.divideMode = 'sample'; % Divide up every sample
net.divideParam.trainRatio = 75/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 10/100;% this is my problem !!!!!
% Create a Fitting Network
net = fitnet(hiddenLayerSize); % hiddenLayerSize: number of nodes in the hidden layer
% Training
net.trainFcn = 'trainlm'; % Levenberg-Marquardt
[net,tr] = train(net,inputs,targets);
outputs = net(inputsTesting); % inputs Testing :unseen by neural network
perf = mse(net,targetsTesting,outputs); % targets Testing: unseen by network
My question is: what does the command below mean? I think this command is unnecessary because I already test with data unseen by the network. So what can I do about this mistake?
net.divideParam.testRatio = 10/100;
Does the neural network use 10% of the data it has already seen for testing?
Please help.
Best regards
Greg Heath on 15 Feb 2015 (edited 15 Feb 2015)
% REPLY 15FEB2015
% Normalization inputs data & dividing data for training - validation- test
% Asked by omar belhaj about 21 hours ago
%
% Could you help me please I have two questions about neural networks
% for solar irradiance forecasting. I used MLP model (Fitting) with one
% hidden layer, 7 inputs and 1 output (solar irradiation).My questions
% are the following : - It's necessary to use these following commands
% to normalize my inputs data ?? (I use a sigmoid function as activation
% function in hidden layer, and linear function in the ouput layer)
% net.inputs{1}.processFcns = {'removeconstantrows','mapminmax'};
% net.outputs{2}.processFcns = {'removeconstantrows','mapminmax'};
%
% Or I can just use the simple mathematical formula : In=(Inn-Imin)/(Imax-Imin)
%
% while In: normalized input ; Inn: No normalized input ???
Replace "while" with "where"
The current NN creation functions automatically use MAPMINMAX. So, as in my example below, you do not have to scale your data. On the other hand, you can use the above commands to either
a. Use MAPSTD (zero-mean/unit-variance) for standardization instead of MAPMINMAX
b. Remove scaling
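Options (a) and (b) could be set up roughly as follows. This is only a sketch: it assumes a two-layer FITNET, so that the output settings live in net.outputs{2} as in the question, and the hidden layer size of 10 is an arbitrary example value, not taken from the thread.

```matlab
% Requires the Neural Network (Deep Learning) Toolbox.
net = fitnet(10);   % 10 hidden nodes: arbitrary example value

% (a) Standardize to zero mean / unit variance instead of the default MAPMINMAX.
net.inputs{1}.processFcns  = {'removeconstantrows','mapstd'};
net.outputs{2}.processFcns = {'removeconstantrows','mapstd'};

% (b) Remove the scaling step but keep the constant-row removal.
%     Uncomment these two lines instead of the two above.
% net.inputs{1}.processFcns  = {'removeconstantrows'};
% net.outputs{2}.processFcns = {'removeconstantrows'};
```

Either way the settings have to be applied after the network is created and before TRAIN is called.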
Although I prefer standardization before training, in order to
a. deal better with data errors and outliers,
b. estimate significant delays for time-series design using correlation functions (help nncorr),
I do not change the defaults: I just apply ZSCORE first AND THEN let the program use the default MAPMINMAX.
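To make the difference between the scalings concrete, here is a small sketch in base MATLAB (no toolbox calls; it relies on implicit expansion, so R2016b or later). The data matrix is made up for illustration. Note that the formula from the question maps each input to [0, 1], whereas the default range of MAPMINMAX is [-1, 1]; the two differ only by a linear rescaling.

```matlab
% Toy data: 2 input variables (rows) x 5 samples (columns). Values are made up.
X = [1 2 3 4 5; 10 20 40 80 160];

xmin = min(X, [], 2);   % per-row minimum
xmax = max(X, [], 2);   % per-row maximum

% Formula from the question: each row mapped to [0, 1].
X01 = (X - xmin) ./ (xmax - xmin);

% Same linear map rescaled to [-1, 1], the default output range of MAPMINMAX.
X11 = 2 * X01 - 1;

% Zero-mean / unit-variance standardization of each row
% (the kind of scaling that MAPSTD and ZSCORE perform).
mu    = mean(X, 2);
sigma = std(X, 0, 2);
Xz    = (X - mu) ./ sigma;

disp([min(X01, [], 2), max(X01, [], 2)]);   % range of each row after [0, 1] scaling
disp([min(X11, [], 2), max(X11, [], 2)]);   % range of each row after [-1, 1] scaling
disp([mean(Xz, 2), std(Xz, 0, 2)]);         % mean and std of each standardized row
```

The second row contains one large value (160); comparing X11 and Xz for that row shows how a single extreme sample compresses the min-max result.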
% - Second question is about dividing data for training, this is my code about dividing :
% inputs = A'; % used for training
% targets = B'; % used for training
% inputsTesting=C'; % used for test unseen by neural network
% targetsTesting=D'; %used for test unseen by neural network
%
% % Setup Division of Data for Training, Validation, Testing
%
% net.divideFcn = 'dividerand'; % Divide data randomly
% net.divideMode = 'sample'; % Divide up every sample
% net.divideParam.trainRatio = 75/100;
% net.divideParam.valRatio = 15/100;
% net.divideParam.testRatio = 10/100;% this is my problem !!!!!
The current NN creation functions automatically use the default DIVIDERAND with the fractional breakdown of 70/15/15 for trn/val/tst.
Only the training data is used to change weights. The validation data is only used to prevent bad performance on nontraining data. The net does not, in any way, use the test data for design. Therefore, there is no reason to use an extra "unseen" data set.
So, as in my example below, you do not have to explicitly divide your data.
However, I prefer to use DIVIDEBLOCK for timeseries to preserve the constant time delay correlations deduced from correlation functions.
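The difference between a random and a block division can be illustrated with plain index vectors. This is a base-MATLAB sketch of the idea only, not the toolbox implementation: the sample count is made up, and the rounding used here (remainder goes to the training set) is an assumption that may not match what DIVIDERAND or DIVIDEBLOCK do internally.

```matlab
Q      = 20;                  % number of samples (made-up value)
ratios = [0.70 0.15 0.15];    % train / validation / test fractions
assert(abs(sum(ratios) - 1) < 1e-12, 'Ratios must add up to 1.');

% Subset sizes; any rounding remainder is given to the training set (assumption).
nVal   = round(ratios(2) * Q);
nTest  = round(ratios(3) * Q);
nTrain = Q - nVal - nTest;

% Block division: three contiguous blocks, so the time order is preserved.
blockTrain = 1:nTrain;
blockVal   = nTrain + (1:nVal);
blockTest  = nTrain + nVal + (1:nTest);

% Random division: the same subset sizes, taken from a random permutation.
p         = randperm(Q);
randTrain = p(blockTrain);
randVal   = p(blockVal);
randTest  = p(blockTest);

disp(blockTest);   % the last nTest sample indices
disp(randTest);    % nTest indices scattered over 1..Q
```

In both cases every sample lands in exactly one subset; only the block version keeps neighboring samples together.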
% % Create a Fitting Network
%
% net=fitnet(Nubmer of nodes in haidden layer);
%
% % tarining
% net.trainFcn = 'trainlm'; % Levenberg-Marquardt
% [net,tr] = train(net,inputs,targets);
% outputs = net(inputsTesting); % inputs Testing :unseen by neural network
% perf = mse(net,targetsTesting,outputs); % targets Testing: unseen by network
%
% My question is what does mean this command below ???
% I think this command is unnecessary because i used data testing
% unseen by network?? !!! So what i can do about this
% mistake ?? !!!!
% net.divideParam.testRatio = 10/100;
% Neural network use 10% of data already seen for testing ??
The test data IN NO WAY influences the design! That is why it is called TEST data!!
Therefore there is no reason to explicitly hold out data for testing.
However, you can change the ratios and the type of division if you wish; just make sure the ratios add up to 1.
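For instance, the performance on the internally held-out test subset can be recovered from the training record returned by TRAIN. The sketch below is illustrative only: it uses the simplefit_dataset example data that ships with the toolbox rather than the solar irradiance data, and the hidden layer size of 10 is an arbitrary choice.

```matlab
% Requires the Neural Network (Deep Learning) Toolbox.
[x, t] = simplefit_dataset;          % small example dataset shipped with the toolbox
net = fitnet(10);                    % 10 hidden nodes: arbitrary example value

net.divideFcn = 'dividerand';        % random division
net.divideParam.trainRatio = 0.70;
net.divideParam.valRatio   = 0.15;
net.divideParam.testRatio  = 0.15;   % the three ratios add up to 1
net.trainParam.showWindow  = false;  % suppress the training GUI

[net, tr] = train(net, x, t);

% tr.testInd holds the indices of the samples set aside for testing.
% They are not used to update weights or to stop training.
xTest   = x(:, tr.testInd);
tTest   = t(:, tr.testInd);
yTest   = net(xTest);
testMSE = mean((tTest(:) - yTest(:)).^2);

fprintf('Test samples: %d, test MSE: %g\n', numel(tr.testInd), testMSE);
```

The same record also contains tr.trainInd and tr.valInd, so the three subsets can be inspected separately after training.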