How to evaluate multi-step forecasting performance for a large dataset?

EanX on 14 Apr 2014

https://www.mathworks.com/matlabcentral/answers/125702-how-evaluate-multi-step-forecasting-performance-for-large-dataset

Edited: Greg Heath on 25 Mar 2016

Accepted Answer: Greg Heath

What I mean is that, to use NARX for forecasting, I have to:

1. create a closed-loop network with
   netc = closeloop(net);
2. construct the target prediction data with
   targetSeriesPred = [targetSeries(end-delay+1:end), con2seq(nan(1,N))];
   i.e. use a number of NaNs equal to the forecasting horizon N
3. prepare the data with
   [Xs,Xi,Ai,Ts] = preparets(netc,inputSeriesPred,{},targetSeriesPred);
4. get the simulation/prediction results with
   yPred = netc(Xs,Xi,Ai);
5. evaluate the forecasting performance (regarding only one step-ahead prediction) with
   perf = perform(netc,targetSeriesVal,yPred);

Now suppose I have a large dataset: 3 years of hourly sampled data, of which I can use only the first year for the training, validation and test phases (with 'divideblock', of course). I don't know how to get a single performance value for the remaining 2 years of data for a forecasting horizon of, for example, only 6 samples. I hope I have been clear enough. Thanks in advance for your replies/suggestions.

Regards.

Sergio

Accepted Answer

Greg Heath on 14 Apr 2014

https://www.mathworks.com/matlabcentral/answers/125702-how-evaluate-multi-step-forecasting-performance-for-large-dataset#answer_133382

Edited: Greg Heath on 25 Mar 2016

How evaluate multi-step forecasting performance for large dataset? Asked by EanX 14Apr2014

 A. Design an openloop net
  1. Use training data and nncorr or fft to estimate
   a. Significant autocorrelation output feedback lags
   b. Significant input/output crosscorrelation input lags
  2. Use trial and error to determine number of hidden layer nodes
  3. Use divideblock in narxnet to obtain multiple designs
  4. Choose the best design based on validation set error.
  5. Use the test set on the best design to obtain an UNBIASED 
     estimate of performance 
 B. Convert to a closeloop design
  1. [netc, Xci, Aci] = closeloop(neto, Xoi, Aoi)
  2. Use preparets and test netc on the original data
  3. If performance is unsatisfactory, train netc initialized with 
     the initial conditions Xoi, Aoi and the final weights of neto.
  4. To predict beyond the original data, use the data at the end of 
     the test set to initialize the delay buffer, and supply a new external input.
  5. Use preparets and netc to predict beyond the test data.
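
Steps A and B above can be sketched roughly as follows. This is only an illustration under stated assumptions: the toolbox example data `simplenarx_dataset` stands in for the real series, the lags `ID`, `FD` and the hidden-layer size `Hid` are arbitrary placeholders rather than values found by the correlation analysis and trial and error of steps A.1 and A.2, and `Xnew` reuses old inputs as a placeholder for genuinely new external input.

```matlab
% Sketch of steps A and B on a toolbox example dataset (assumed stand-in data).
[X, T] = simplenarx_dataset;              % cell arrays of input and target
ID = 1:2; FD = 1:2; Hid = 5;              % assumed lags and hidden nodes (A.1, A.2)

% A. Open-loop design with contiguous train/validation/test blocks
neto = narxnet(ID, FD, Hid);
neto.divideFcn = 'divideblock';
neto.trainParam.showWindow = false;       % suppress the training GUI
[Xo, Xoi, Aoi, To] = preparets(neto, X, {}, T);
rng(0);                                   % reproducible weight initialization
[neto, tr] = train(neto, Xo, To, Xoi, Aoi);
[Yo, Xof, Aof] = neto(Xo, Xoi, Aoi);      % outputs plus final delay states
eo = cell2mat(To) - cell2mat(Yo);
mseTestOpen = mean(eo(tr.testInd).^2);    % open-loop MSE on the test block (A.5)

% B. Close the loop and check the closed-loop net on the original data
netc = closeloop(neto);
[Xc, Xci, Aci, Tc] = preparets(netc, X, {}, T);
Yc = netc(Xc, Xci, Aci);
mseClosed = mean((cell2mat(Tc) - cell2mat(Yc)).^2);

% B.4-B.5: predict beyond the data, starting from the final open-loop states
[netc2, Xci2, Aci2] = closeloop(neto, Xof, Aof);
Xnew = X(end-5:end);                      % placeholder for 6 new external inputs
Ynew = netc2(Xnew, Xci2, Aci2);           % 6-step closed-loop forecast
```

Comparing `mseClosed` with `mseTestOpen` gives a first indication of whether the closed-loop net needs the further training mentioned in step B.3.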

2 Comments
EanX on 15 Apr 2014

Thanks Greg, but what I mean is that to evaluate performance for, for example, six steps ahead I can use:

% prepend the last 'delay' known samples so preparets can fill the delay states
inputSeriesPred = [inputSeries(end-delay+1:end), inputSeriesVal(1:steps_ahead)];
targetSeriesPred = [targetSeries(end-delay+1:end), con2seq(nan(1,steps_ahead))];
[Xs,Xi,Ai,Ts] = preparets(netc,inputSeriesPred,{},targetSeriesPred);
yPred = netc(Xs,Xi,Ai);   % closed-loop prediction, steps_ahead samples
targetsVal = targetSeriesVal(1:steps_ahead);
perfClosedLoop = perform(netc,targetsVal,yPred);   % perform(net,targets,outputs)
fprintf('Performance closed loop (%d steps ahead): %g\n',steps_ahead,perfClosedLoop);
targets = cell2mat(targetsVal);
outputs = cell2mat(yPred);
e = targets-outputs;
MSE = mse(e)

But this covers only a single six-step-ahead forecast. I want a single MSE that accounts for the net's closed-loop performance six steps ahead, not only one step ahead, over the 2 years of data that follow the first year used for training, validation and test. I suppose I could iterate the above code to build a global "outputs" array and then evaluate the error and MSE, but does this require retraining the NARX at each step? Is this correct?
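
One possible reading of the iteration described above is a rolling-origin evaluation: slide the forecast origin through the held-out data, make one closed-loop forecast of `H` steps from each origin, and pool the errors. The sketch below keeps the trained weights fixed, so the loop itself involves no retraining; whether fixed weights remain adequate over a long holdout is a separate question that the sketch does not settle. The example dataset, the split point `Ntrain`, the delay `d` and the hidden-layer size are all assumptions, and the external inputs over each horizon are taken as known.

```matlab
% Rolling-origin, H-step-ahead evaluation with fixed weights (sketch).
[X, T] = simplenarx_dataset;              % stand-in for the real series
Ntrain = 60;                              % assumed: first 60 samples for design
d = 2;                                    % assumed maximum input/feedback delay
H = 6;                                    % forecasting horizon

neto = narxnet(1:d, 1:d, 5);
neto.divideFcn = 'divideblock';
neto.trainParam.showWindow = false;
[Xo, Xoi, Aoi, To] = preparets(neto, X(1:Ntrain), {}, T(1:Ntrain));
rng(0);
neto = train(neto, Xo, To, Xoi, Aoi);
netc = closeloop(neto);                   % same weights, feedback from own output

origins = Ntrain:(numel(T) - H);          % last known sample of each forecast
E = zeros(H, numel(origins));             % row h holds the h-step-ahead errors
for i = 1:numel(origins)
    k = origins(i);
    win = (k-d+1):(k+H);                  % d known samples, then the horizon
    % Only the first d targets seed the delay states; the rest are returned
    % as Ts and used solely for scoring.
    [Xs, Xi, Ai, Ts] = preparets(netc, X(win), {}, T(win));
    Y = netc(Xs, Xi, Ai);
    E(:, i) = (cell2mat(Ts) - cell2mat(Y)).';
end
mseByStep  = mean(E.^2, 2);               % one MSE per step ahead (1..H)
mseOverall = mean(E(:).^2);               % single pooled figure
```

`mseByStep(6)` isolates the six-step-ahead error, while `mseOverall` pools all horizons into one number. Consecutive origins overlap, so the pooled errors are correlated.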

Greg Heath on 21 Apr 2014

Sorry, I don't understand.

If this is a single series you should use narnet, not narxnet.
Are the sample summary statistics (mean, variance, significant correlation coefficients and corresponding lags) time-independent?
What are the significant correlation lags?
How are you designing netc in the first place?
What trn/val/tst ratio are you planning to use? I doubt that Ntst > Ntrn is a reasonable goal.
Try practicing on a MATLAB example dataset
 help nndatasets
 doc nndatasets
See

http://www.mathworks.com/matlabcentral/newsreader/view_thread/332147#912806
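
As a rough illustration of what "significant correlation lags" can mean in practice, the sketch below estimates the target autocorrelation with a plain loop and flags lags whose magnitude exceeds an approximate 95% white-noise bound. The loop is a swapped-in substitute for `nncorr`, and the bound `1.96/sqrt(N)`, the example dataset and `maxLag` are assumptions, not the procedure used in the thread.

```matlab
% Sketch: flag target autocorrelation lags that exceed a white-noise bound.
[~, T] = simplenarx_dataset;              % stand-in data (assumption)
t = cell2mat(T);
N = numel(t);
zt = (t - mean(t)) / std(t, 1);           % zero mean, unit (biased) variance
maxLag = 20;                              % assumed search range
acf = zeros(1, maxLag);
for m = 1:maxLag
    acf(m) = sum(zt(1+m:N) .* zt(1:N-m)) / N;   % biased autocorrelation estimate
end
bound = 1.96 / sqrt(N);                   % approx. 95% bound for white noise
sigLags = find(abs(acf) > bound);         % candidate feedback delays
```

The same loop applied to an input series against the target would give candidate input delays.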
