How can I do multi-step-ahead prediction without knowing the validation input series?

I am learning neural networks and doing some small exercises, but I have a question that I cannot figure out. I have a time series (input = X, target = T) and I train with input_training = X(1:end-N) and target_train = T(1:end-N). My validation data is input_val = X(end-N+1:end) and target_val = T(end-N+1:end). I am testing a NARX network. In this case:
- input_val (it is available)
- target_val (it is not available)
I get good predictions (error < 3%). How can I get good predictions in this case:
- input_val (it is not available)
- target_val (it is not available)
Thanks for your help.

 Accepted Answer

What data division function are you using?
Training, Validation and Testing are three separate functions. In order to obtain unbiased estimates of performance on nondesign data:
Total = Design + Test
Design = Training + Validation
Training subset:
Used to directly estimate unknown weight values (e.g., via gradient descent).
Validation subset:
Used REPETITIVELY with the training set to determine the best set of training parameters (e.g., number of hidden nodes, stopping epoch, selection of input and feedback delays, etc.) and the best of multiple random weight initialization designs.
Test subset:
Used ONCE and ONLY ONCE on the best design w.r.t. validation subset performance to obtain an UNBIASED estimate of performance on nondesign data (AKA generalization).
If the test set estimate is unsatisfactory, the data set should be randomly divided again and the entire procedure duplicated. Reusing the same data division biases the resulting test subset estimate.
Quite often the unbiased constraint of this procedure is violated by including the test subset in the choice of the best design. If this is done, I recommend, for the sake of caution, that another round with a new random division still be performed.
The above procedure is difficult to implement with time series because uniform spacing should be maintained to preserve output-feedback autocorrelations and input-output cross-correlations.
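One way to keep the spacing uniform is to divide the series into contiguous blocks rather than random samples. The following is only a minimal sketch of the train/validation/test procedure described above, not a definitive implementation: the 70/15/15 split, the number of trials, and the number of hidden nodes are assumptions chosen for illustration, and `magdata` is assumed to be on the path as in the example posted in this thread.

```matlab
% Sketch: pick the best of several random initializations by VALIDATION
% performance, then read the TEST performance once for that design.
% numTrials, hiddenNodes and the split ratios are illustrative assumptions.
S = load('magdata');
X = con2seq(S.u);
T = con2seq(S.y);

delay       = 4;
hiddenNodes = 10;
numTrials   = 5;

bestValPerf = Inf;
for trial = 1:numTrials
    net = narxnet(1:delay, 1:delay, hiddenNodes);

    % Contiguous blocks keep the time steps uniformly spaced in each subset
    net.divideFcn = 'divideblock';
    net.divideParam.trainRatio = 0.70;
    net.divideParam.valRatio   = 0.15;
    net.divideParam.testRatio  = 0.15;
    net.trainParam.showWindow  = false;   % no training GUI

    [Xs, Xi, Ai, Ts] = preparets(net, X, {}, T);
    [net, tr] = train(net, Xs, Ts, Xi, Ai);   % new random weights each trial

    % Design selection uses the validation subset only
    if tr.best_vperf < bestValPerf
        bestValPerf = tr.best_vperf;
        bestNet     = net;
        bestTr      = tr;
    end
end

% Test subset: looked at once, for the selected design only
testPerf = bestTr.best_tperf;
fprintf('validation MSE = %g, test MSE = %g\n', bestValPerf, testPerf);
```

Because the test performance is read only after the design has been chosen, it does not influence the choice.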
Now, I do not understand your problem because I do not understand why you are using a validation subset without a test set to estimate nondesign performance. Posting your code with comments would help immensely.
Hope this helps.
Greg

4 Comments

Below is one of the examples that I am studying. If:
- inputSeriesVal (it is available)
- targetSeriesVal (it is NOT available)
then I get good predictions. But how can I get good predictions if I have neither inputSeriesVal nor targetSeriesVal?
- inputSeriesVal (it is NOT available)
- targetSeriesVal (it is NOT available)
EXAMPLE:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% %%1. Importing data
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
S = load('magdata');
X = con2seq(S.u);
T = con2seq(S.y);
% Multi-step ahead prediction
N = 7;
% Input and target series are divided into two groups of data:
% 1st group: used to train the network
inputSeries = X(1:end-N);
targetSeries = T(1:end-N);
% 2nd group: this is the new data used for simulation. inputSeriesVal will
% be used for predicting new targets. targetSeriesVal will be used for
% network validation after prediction
inputSeriesVal = X(:,end-N+1:end);
targetSeriesVal = T(:,end-N+1:end);
delay = 4;
neuronsHiddenLayer = 50;
%%Network Creation
net = narxnet(1:delay,1:delay,neuronsHiddenLayer);
%%4. Training the network
[Xs,Xi,Ai,Ts] = preparets(net,inputSeries,{},targetSeries);
net = train(net,Xs,Ts,Xi,Ai);
Y = net(Xs,Xi,Ai);
%%5. Multi-step ahead prediction
inputSeriesPred = [inputSeries(end-delay+1:end),inputSeriesVal];
targetSeriesPred = [targetSeries(end-delay+1:end), con2seq(nan(1,N))];
netc = closeloop(net);
[Xs,Xi,Ai,Ts] = preparets(netc,inputSeriesPred,{},targetSeriesPred);
yPred = netc(Xs,Xi,Ai);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%Performance
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
perf = perform(net,yPred,targetSeriesVal);
figure
plot([cell2mat(targetSeries),nan(1,N);
nan(1,length(targetSeries)),cell2mat(yPred);
nan(1,length(targetSeries)),cell2mat(targetSeriesVal)]')
legend('Original Targets','Network Predictions','Expected Outputs')
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Thanks for your help.
I guess you did not understand my first post. Let me try again:
The term validation is used to label non-training DESIGN data that is used to determine good parameters for training. Once the parameters are determined, then the validation data is used to pick the best of multiple (e.g., multiple random weight initialization) designs.
Finally, a NONDESIGN test set is used to evaluate the "best" net chosen by the validation data.
Therefore, your use of the terms "validation" and "val" is very confusing.
If this is not clear, please reread my ANSWER.
Hi Greg, I think Yandy Perez is asking the following: the maglev example has 4001 timesteps of data, so how can we predict the next N steps, i.e., from 4002 to 4001+N?
You can only predict with TIMEDELAYNET and NARXNET beyond time N if you have input values beyond time N.
The alternative is to use NARNET to estimate input and/or target beyond N.
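A minimal sketch of that alternative, under stated assumptions: a NARNET is trained on the input series alone, run in closed loop to forecast N future inputs, and those forecasts are then fed to the closed-loop NARX. The variable names, the 10 hidden nodes, and the choice to train on the full series are illustrative, not taken from the posts above.

```matlab
% Sketch: forecast the unknown future inputs with a NAR net, then use the
% forecasts to drive a closed-loop NARX net N steps beyond the known data.
S = load('magdata');
X = con2seq(S.u);
T = con2seq(S.y);
N     = 7;      % steps to predict beyond the last known sample
delay = 4;

% 1. NARX model of target given input, trained on all known data
narx = narxnet(1:delay, 1:delay, 10);
narx.trainParam.showWindow = false;
[Xs, Xi, Ai, Ts] = preparets(narx, X, {}, T);
narx = train(narx, Xs, Ts, Xi, Ai);

% 2. NAR model of the input series itself (no external input)
nar = narnet(1:delay, 10);
nar.trainParam.showWindow = false;
[Us, Ui, Uai, Ut] = preparets(nar, {}, {}, X);
nar = train(nar, Us, Ut, Ui, Uai);

% Run open loop over the known data to get the final delay states,
% then close the loop and iterate N steps ahead
[~, Uf, Uaf] = nar(Us, Ui, Uai);
[narC, Uic, Uaic] = closeloop(nar, Uf, Uaf);
xFuture = narC(cell(0, N), Uic, Uaic);    % estimated inputs

% 3. Closed-loop NARX driven by the estimated inputs
[~, Xf, Af] = narx(Xs, Xi, Ai);
[narxC, Xic, Aic] = closeloop(narx, Xf, Af);
yFuture = narxC(xFuture, Xic, Aic);       % estimated targets
```

Any error in the forecast inputs propagates into the predicted targets, so this is unlikely to match the accuracy obtained when the true inputs are known, particularly when the input has little autocorrelation. Fitting a NARNET directly to the target series is the other option mentioned above.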
Hope this helps.
Greg
