How to correctly use adapt with recurrent neural network training?

Hi all,
I am using the Neural Network Toolbox to do system identification for a multiple-input multiple-output (MIMO) system. The MIMO system is represented by a recurrent neural network (a NARX model), and I want to achieve a kind of online training using the function 'adapt'. But I am not exactly sure how to arrange the inputs and targets for adapt.
For example, suppose the system is Y(k+1) = A*Y(k) + B*U(k), where Y(k) and U(k) are the output and input vectors of the system at time k. When I use adapt, should I use
[net,Y,E,Pf,Af,tr] = adapt(net,U(k),Y(k),Pi,Ai)
or
[net,Y,E,Pf,Af,tr] = adapt(net,U(k),Y(k+1),Pi,Ai)
or some other form?
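For concreteness, a minimal simulation of the kind of system I mean (the matrices A and B, the dimensions, and the use of con2seq are just an illustrative example):

```matlab
% Illustrative 2-input 2-output linear system: Y(k+1) = A*Y(k) + B*U(k)
A = [0.8 0.1; 0.0 0.9];        % output dynamics (arbitrary example values)
B = [0.5 0.0; 0.0 0.3];        % input gains (arbitrary example values)
N = 100;                       % number of time steps
U = randn(2,N);                % random input sequence
Y = zeros(2,N+1);              % outputs; Y(:,1) is the initial condition
for k = 1:N
    Y(:,k+1) = A*Y(:,k) + B*U(:,k);
end
% Toolbox time series are cell arrays of column vectors:
Useq = con2seq(U);             % inputs  U(k), k = 1..N
Yseq = con2seq(Y(:,2:end));    % targets Y(k+1), k = 1..N
```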
Thank you very much for the help!

Accepted Answer

Greg Heath
Greg Heath on 8 Oct 2013
close all, clear all, clc, plt=0
tic
[X,T] = simplenarx_dataset;
net = narxnet(1:2,1:2,10)
adaptFcn = net.adaptFcn
adaptParam = net.adaptParam
view(net)
[Xs,Xi,Ai,Ts] = preparets(net,X,{},T);
ts = cell2mat(Ts);
MSE00 = var(ts,1) % 0.099154
whos
rng(0)
Ntrials = 100
for i = 1:Ntrials
% net = configure(net,Xs,Ts);
% [net,tr,Ys,Es,Xf,Yf] = train(net,Xs,Ts,Xi,Ai);
[net,Ys,Es,Xf,Yf,tr] = adapt(net,Xs,Ts,Xi,Ai);
% view(net)
% whos
R2(i,1) = 1-mse(Es)/MSE00 % 1
% plt=plt+1,figure(plt)
% plot(1-tr.perf/MSE00,'LineWidth',2)
% ylim([ -0.1 1.1])
% xlabel( ' EPOCH ')
% title('COEFFICIENT OF DETERMINATION vs EPOCH')
%
% ys = cell2mat(Ys);
% plt=plt+1,figure(plt)
% hold on
% plot(ts,'LineWidth',2)
% plot(ys,'r--','LineWidth',2)
% xlabel( ' DATA INDEX ')
% title( 'TARGET (Blue) AND OUTPUT (Red) ')
end
toc
1. Set Ntrials = 10 and comment the adaptation statement
2. Run to see the results of 10 independent designs in 20 figures
3. Note that all R^2 ~ 1
4. Total Elapsed Time ~ 12.6 sec
5. Uncomment the adaptation statement
6. Comment the configuration and train statements
7. Run to see the results of 10 stages of 1 design in 20 figures
8. Note that R^2 is monotonically increasing.
R2(1:10) = -0.0079 0.6435 0.7230 0.7599 0.7845 0.8027 0.8168 0.8282 0.8374 0.8451
9. Total Elapsed Time ~ 12.5 sec
10. When Ntrials = 100, for the adaptation, R^2 values kept increasing, monotonically, to ~ 0.95. Then my computer ran out of memory.
11. Rerunning without plots takes 79.5 sec with R2(10:10:100) = 0.8451 0.8846 0.9031 0.9160 0.9258 0.9335 0.9395 0.9441 0.9477 0.9505
12. Handwaving conclusion:
Adaptation takes ~50 to 100 times longer to get the same number of good designs.
Hope this helps.
Thank you for formally accepting my answer
Greg
  2 Comments
Yiming
Yiming on 9 Oct 2013
Edited: Yiming on 9 Oct 2013
Hi Greg,
Thank you for your great reply.
I followed your procedure and ran your code. I am not sure if I understand your code correctly.
1. From my point of view, R2 can be interpreted as the mean performance of the whole training iteration, and the plotted '1-tr.perf/MSE00' is the performance at each individual epoch, right? Using exactly the same dataset, after multiple trainings the result keeps getting better until it reaches a limit. Is it working like we give a good initialization to the network, which then leads to a better result?
2. From the results of adaptation, although R2 is getting better and better overall, if we focus on individual samples of the dataset, it's possible that the result could be worse. For example, say I have 50 samples (input-output pairs) in the dataset; I can't determine whether the result after 35 samples is better than the one after 20 samples, is that correct? Then how do we get the best result from adaptation?
3. If I would like to achieve sample-by-sample online training, so that the result keeps getting better, do you think it's possible? Is there any other way to do it?
Thank you so much for your help!
Best regards,
Yiming
Greg Heath
Greg Heath on 9 Oct 2013
1. R2, aka R^2 or R-squared, is the coefficient of determination used in statistics. It is interpreted as the fraction of the target variance "explained" by the model. See Wikipedia.
Obviously, it tells you more than the unscaled value perf. Alternatively, you might want to use the normalized MSE, NMSE = perf/MSE00.
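As a sketch, using ts (row of targets) and Es (cell array of errors) as in the script above:

```matlab
% Coefficient of determination and normalized MSE
MSE00 = var(ts,1);       % reference: (biased) variance of the targets
perf  = mse(Es);         % mean squared error of the model
NMSE  = perf/MSE00;      % normalized MSE; 0 is perfect, 1 is the naive mean model
R2    = 1 - NMSE;        % fraction of target variance "explained"
```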
I never use adaptation because it is so many orders of magnitude slower than batch training.
Since net.adaptParam = 'none' for all the nets I've looked at, it looks like MATLAB is not interested in improving adapt.
If I had to continually update a model with real time data, I would periodically use batch training.
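A rough sketch of that idea (the buffer handling and the retraining interval Nretrain are hypothetical choices, not toolbox requirements):

```matlab
% Periodically re-run batch training as new samples arrive
Xbuf = X;  Tbuf = T;                   % running data buffers (cell arrays)
Nretrain = 50;                         % retrain every 50 new samples (arbitrary)
newX = {}; newT = {};                  % incoming real-time input/target samples
for k = 1:numel(newX)
    Xbuf{end+1} = newX{k};             % append the newest sample
    Tbuf{end+1} = newT{k};
    if mod(k,Nretrain) == 0
        [Xs,Xi,Ai,Ts] = preparets(net,Xbuf,{},Tbuf);
        net = train(net,Xs,Ts,Xi,Ai);  % batch update with all data so far
    end
end
```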
Hope this helps


More Answers (1)

Greg Heath
Greg Heath on 8 Oct 2013
Edited: Greg Heath on 8 Oct 2013
1. Solve the problem using train
help narxnet
help closeloop
2. Make sure you initialize the RNG so that you can repeat the best of multiple random weight initialization designs.
3. When finished, substitute adapt for train and repeat.
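A sketch of steps 1 and 2 above (the trial count and the best-net bookkeeping are illustrative; configure reinitializes the weights each trial, as in my script):

```matlab
[X,T] = simplenarx_dataset;
net0 = narxnet(1:2,1:2,10);               % open-loop NARX for design
[Xs,Xi,Ai,Ts] = preparets(net0,X,{},T);
rng(0)                                    % repeatable weight initializations
Ntrials = 10;  bestperf = Inf;
for i = 1:Ntrials
    net = configure(net0,Xs,Ts);          % new random initial weights
    [net,tr] = train(net,Xs,Ts,Xi,Ai);
    if tr.best_perf < bestperf            % keep the best of the random designs
        bestperf = tr.best_perf;  bestnet = net;
    end
end
```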
Hope this helps.
Thank you for formally accepting my answer.
Greg
  2 Comments
Yiming
Yiming on 8 Oct 2013
Edited: Yiming on 8 Oct 2013
Hi Greg,
Thank you for the answer.
Actually, that's exactly what I did. The whole program is very long, so I will just put some brief code as follows:
1. First, I generate the NARX model and use some earlier experimental data to do batch training:
net = narxnet(inputDelays,feedbackDelays,hiddenLayerSize,feedbackmode);
[net,tr] = train(net,inputs,targets,inputStates,layerStates);
2. Second, I use this trained network 'net' as the initial model and start incremental training using adapt.
in_ad = tonndata(input,false,false);
out_ad = tonndata(output,false,false);
net = net;
[in_adapt,a,b,out_adapt] = preparets(net,in_ad,{},out_ad);
[net_adapt,y_adapt,e_adapt,Pf,Af,tr_online] = adapt(net,in_adapt,out_adapt);
3. The problem is: for this 'input' and 'output', should I put U(k), Y(k), or [U(k-1),U(k)], [Y(k-1),Y(k)]? In fact, I am not sure this online training using adapt will lead to a better result. After several simulations I got really bad results with this online training, meaning the prediction of 'net_adapt' is much worse than that of 'net'. Is that normal? Did I do anything wrong?
4. In your answer, you mentioned closeloop. Do you mean I should generate a closed loop NN or close the loop after batch-training? How could I use adapt in a closed loop NN?
Thank you very much for your help!
Greg Heath
Greg Heath on 17 Oct 2013
The operational configuration of feedback time-series NNs is the closed-loop configuration, which accepts past outputs. In contrast, the open-loop configuration is only used for design, because there the feedback signal is the desired target.
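A minimal sketch of closing the loop after open-loop design, assuming net was designed open-loop on data X, T:

```matlab
netc = closeloop(net);                     % feed outputs back as delayed inputs
[Xc,Xic,Aic,Tc] = preparets(netc,X,{},T);  % re-prepare data for the closed loop
Yc = netc(Xc,Xic,Aic);                     % multistep-ahead prediction
perfc = perform(netc,Tc,Yc);               % closed-loop performance
```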
See my latest posts in the NEWSREADER and ANSWERS re closeloop
neural greg closeloop
Hope this helps.

