Problems with weights in neural network training

Hi all!
I am using neural network models in MATLAB, and I am now facing a problem with the weights in NN training.
Basically, I have a multiple-input multiple-output recurrent neural network, generated as
net = narxnet(inputDelays,feedbackDelays,hiddenLayerSize),
and the corresponding mathematical model is
Y(k+1) = A*Y(k) + f(U),
where U is the input of the model and Y is the output (which is also fed back to the input), A is an unknown constant matrix, and f() is an unknown nonlinear function of U.
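To make the setup concrete, here is a minimal sketch of generating data from this kind of model (the sizes, the matrix A, and the nonlinearity f are made-up assumptions, not the actual system):
A = [0.5 0.1; 0 0.8];              % assumed stable state matrix (hypothetical)
f = @(u) tanh([1 -1 2; 0 1 1]*u);  % assumed nonlinearity of U (hypothetical)
N = 200;                           % number of timesteps
U = rand(3, N);                    % random 3-channel input sequence
Y = zeros(2, N);                   % 2-channel output
for k = 1:N-1
    Y(:, k+1) = A*Y(:, k) + f(U(:, k));   % Y(k+1) = A*Y(k) + f(U(k))
end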
The whole training process goes fine, and I get a very small training error. However, when I use the network for prediction, I run into a problem: the influence from the feedback is too strong.
For example, say the network is net, and I have two different inputs U1 = [0 1 0] and U2 = [1 1 1]. To get a one-step prediction for each of these two inputs, I use
netc = closeloop(net);                          % close the feedback loop
u1 = tonndata(U1,false,false);                  % convert inputs to neural-net cell format
u2 = tonndata(U2,false,false);
[a1,b1,c1,d1] = preparets(netc,u1,{},targets);  % shifted inputs, initial input/layer states, targets
[a2,b2,c2,d2] = preparets(netc,u2,{},targets);
outputs1 = netc(a1,b1,c1);                      % simulate with inputs and initial states
outputs2 = netc(a2,b2,c2);
The result is that outputs1 is exactly identical to outputs2. I tried this many times with all sorts of different inputs: whenever the feedback Y(k) is the same, the predicted output Y(k+1) is always the same. I guessed this might be caused by noise in the training data, but my training set is very clean, with very little noise.
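For reference, the relative strength of the two paths can be checked directly on the trained net; a rough sketch using the standard IW/LW network properties, with netc being the closed-loop net from above:
% Nonempty cells of netc.IW hold weights from the external input U;
% nonempty cells of netc.LW include the closed feedback path from Y.
iw = netc.IW(~cellfun(@isempty, netc.IW));
lw = netc.LW(~cellfun(@isempty, netc.LW));
fprintf('input weight norms:    %s\n', mat2str(cellfun(@norm, iw)));
fprintf('feedback weight norms: %s\n', mat2str(cellfun(@norm, lw)));
If the feedback norms dominate, that would be consistent with the prediction barely reacting to U.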
So now I am wondering whether there is any method to increase the influence of the input U and decrease the influence of Y, while keeping a certain training accuracy.
Thank you very much for the help!

Accepted Answer

Greg Heath on 24 Oct 2013
After you close the loop:
Test the CL net on the original data. It probably will not perform as well as the open-loop (OL) net.
If dissatisfied, train the CL design on the original data, using the final OL weights as the initial weights for the CL training (as sketched below).
Check my recent posts in the NEWSREADER and ANSWERS re closeloop designs.
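A minimal sketch of that workflow, assuming X (inputs) and T (targets) are the original series already in cell (nndata) format:
netc = closeloop(net);                     % CL net starts from the final OL weights
[Xc, Xic, Aic, Tc] = preparets(netc, X, {}, T);
Yc = netc(Xc, Xic, Aic);                   % test the CL net on the original data
perfCL = perform(netc, Tc, Yc)             % compare with the OL performance
netc = train(netc, Xc, Tc, Xic, Aic);      % if dissatisfied, continue training the CL net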
Thank you for formally accepting my answer
Greg
  4 Comments
Yiming on 28 Oct 2013
Hi Greg,
Thanks for the comment.
1. I did simulations with different lags. The auto-/cross-correlation functions show perfect shapes at lag = 1, which also matches my mathematical model.
2. I also tried different scenarios with timedelaynet and narxnet. I manually added the output feedback as part of the input of a timedelaynet, and also tested different lags. In my opinion this should be equivalent to narxnet, and the results agree with that.
3. I read some material about how to choose the number of nodes in the hidden layer. Unfortunately, there is no clear rule for determining it. I did try different numbers of nodes, but there was no big difference in the results.
4. For the time-series data, I actually don't understand why it could not be randomly divided. To my knowledge, after executing 'preparets', the corresponding 'Xs Xi Ai Ts' should automatically be arranged with the correct inputs/feedbacks/states, right? Or do you mean that the randomly divided layer/input states will also influence the performance?
5. Basically, I know a NN with one hidden layer should be able to approximate any linear/nonlinear function, while I also realize that for a complicated MIMO system it may be very difficult to find the global optimum. So I will train the network with random initializations multiple times (50 or 100; see the sketch below). Hopefully that will give better results.
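A rough sketch of that multi-start loop (X and T in cell format, and the 50-trial count, are assumptions):
bestPerf = Inf;
for trial = 1:50
    net = narxnet(inputDelays, feedbackDelays, hiddenLayerSize);  % fresh random init each trial
    [Xs, Xi, Ai, Ts] = preparets(net, X, {}, T);
    net = train(net, Xs, Ts, Xi, Ai);
    perfTrial = perform(net, Ts, net(Xs, Xi, Ai));
    if perfTrial < bestPerf
        bestPerf = perfTrial;   % keep the best of the random starts
        bestNet  = net;         % (ideally judged on nontraining data)
    end
end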
Thank you for your help.
Yiming
Greg Heath on 30 Jul 2014
1. According to the given equation, the correct delay inputs should be
ID = 0, FD = 1 (see the sketch after this list).
2. I cannot comment on your choice of H=15 because I don't know
[ I N ] = size(input)
[ O N ] = size(target)
Hub = -1+ceil( (0.7*N*O-O) / (I + O +1) )
3. The training success is determined by nontraining performance, MSEval and MSEtst, not training performance MSEtrn
4. You should not use 'dividerand' because it destroys the inherent auto- and cross-correlations of the original data. Your net may accommodate the order encountered in scrambled training data; however, don't expect it to work for nontraining data.
5. I suggested the comparison with timedelaynet and narnet to answer your question of input vs feedback importance. That is not what you did.
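A short sketch combining points 1, 2 and 4 (the names input and target, and the cap at H = 15, are assumptions):
[I, N] = size(input);                        % I input channels, N timesteps
[O, ~] = size(target);                       % O output channels
Hub = -1 + ceil((0.7*N*O - O)/(I + O + 1));  % point 2: upper bound on hidden nodes
H = min(15, Hub);                            % keep H at or below the bound
net = narxnet(0, 1, H);                      % point 1: ID = 0, FD = 1
net.divideFcn = 'divideblock';               % point 4: preserve the time order in the division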
Sorry for the delayed response. It was not intentional.
Greg


