SUGGESTIONS FOR IMPROVEMENT(S) ON NARNET MULTISTEP AHEAD PREDICTIONS ON THE SOLAR DATASET?

Question

Staffan on 25 May 2016

0
Link

Direct link to this question

https://www.mathworks.com/matlabcentral/answers/285851-suggestions-for-improvement-s-on-narnet-multistep-ahead-predictions-on-the-solar-dataset

Answered: Saira AlZadjali on 14 Oct 2018

Dear Reader,

This is a continuation of the of the following newsgroup thread: http://se.mathworks.com/matlabcentral/newsreader/view_thread/345189?s_tid=srchtitle I think it is easier to use “answers” than “newsgroup” threads, mostly because of the possibility to label code and links…and also because of the possibility to attach figures (images).

INTRODUCTION

This is an analysis of how to perform multistep predictions on the solar datasets (doc nndatasets). Main improvement compared to previous iterations of this code in the added filtering of the signal. However, results indicate that the predictability can be improved. Unfortunately, it is not even possible to generate a prediction using a generated repetitive dataset which is here based on the solar dataset.

METHOD

I use a main program to have the possibility to test the 9 different training algorithms. Here is the code for main:

clear all
close all
% 1:    g_train_method='trainlm';
% 2:    g_train_method='trainbfg';
% 3:    g_train_method='trainrp';
% 4:    g_train_method='trainscg';
% 5:    g_train_method='traincgb';
% 6:    g_train_method='traincgf';
% 7:    g_train_method='traincgp';
% 8:    g_train_method='trainoss';
% 9:    g_train_method='traingdx';
for i=1:9
      current_training_method=i
[NMSEo_1(i),NMSEc(i),NMSEc2(i),toc(i)] = train_diff_method(i)
% test NMSEc - NMSEc2
end
figure;
plot(1:9,NMSEc,1:9,NMSEc2)
xlabel('Method number')
legend('NMSEc, without closed loop training','NMSEc2, with closed loop training')
figure;
plot(toc)
xlabel('Method number')
title('toc')

Then I use a function to evaluate the different training algorithms. One thing that should be noticed is that I used mapminmax to norm the signal. I read somewhere that trainbr requires this. Also, it cant do any harm if all methods use this normalized signal, at least as I see it.

    function [NMSEo_1,NMSEc,NMSEc2,totaltime] = train_diff_method(train_method_no)
    close all
    % train_method_no=1;
    if train_method_no==1
        g_train_method='trainlm'
    end
    if train_method_no==2
        g_train_method='trainbfg'
    end
    if train_method_no==3
        g_train_method='trainrp'
    end
    if train_method_no==4
        g_train_method='trainscg'
    end
    if train_method_no==5
        g_train_method='traincgb'
    end
    if train_method_no==6
        g_train_method='traincgf'
    end
    if train_method_no==7
        g_train_method='traincgp'
    end
    if train_method_no==8
        g_train_method='trainoss'
    end
    if train_method_no==9
        g_train_method='traingdx'
    end
    format shortG
    % GEH1 = [ 'START TIMING' ]
    % STJ1 = [ ' DONE ' ]
    tic
    plt=0;
    T = solar_dataset;
    t = cell2mat(T);
   % t=[t(477:2400) t(477:2400) t(477:2400)];
    % FILTERING, PLOTTING AND MAKING T A CELL ARRAY, THIS TIME BASED  ON FILTERED t 
    [b,a]=butter(2,0.1);
    t_filt=filtfilt(b,a,t);
    plt=plt+1;
    figure(plt);
    subplot(211);
    plot(1:length(t),t,1:length(t),t_filt);
    title('SOLAR DATASET, UNFILTERED AND FILTERED');
    subplot(212);
    plot(1:length(t),t-t_filt);
    title('RESIDUALS; FILTERED SIGNAL SUBRTRACTED FROM RAW SIGNAL');
    T=num2cell(t_filt);
    T_map=mapminmax(cell2mat(T)); 
    T=num2cell(T_map);
    t = cell2mat(T)
    [ O , N ] = size(t); % 'O' FOR '"o"utput
    % GEH2 = [ ' REMOVE SEMICOLON [ O N] ==> [ 1 2899 ] ' ]
    % STJ2 = [ ' DONE ' ]
    % GEH3 = [ ' TO INSURE UNBIASED NONTRAINING PREDICTIONS'...
    %               ' USE DIVIDEBLOCK AND THEN USE Ntrn in NNCORR ' ]
    % STJ3 = [ ' DONE ' ]
    % GEH4 = [ ' Ntrn = N - 2*round(0.15*N) = 2029 ' ]
    % STJ4 = [ ' OK ' ]
    % ### One thing I think of initially is the use of training algorithm,
    % should one always first use trainlm in all steps (OL and CL) or
    % is there a scientific approach that can determine (before training is
    % started) if e.g. trainbr would fit a specific signal better? ###
    % GEH5 = [ ' I DON"T KNOW. BEST TO BEGIN WITH DEFAULTS' ...
    %                 ' THEN SEE MY SPIELS ON OVERTRAINING, Hub AND MSEREG' ]
    % STJ5 = [ ' ON THE SUBJECT OF Hub I FOUND SIGNIFICANT POSTS, I HAVEN'T FOCUSED ON OVERTRAINING YET BUT I WILL LOOG INTO THIS. MSEREG SEEMS TO BE OBSOLETE, ANY SUGGESTIONS WHAT TO USE INSTEAD? ']
    MSE00 = var(t',1)
    % GEH6 = [ ' TAKE MEAN IF if O > 1 ' ] 
    % STJ6 = [ ' GOOD POINT. IN THIS CASE O is equal to 1' ]
    zt = zscore(t',1)';
    plt=plt+1;
    figure(plt);
    subplot(211);
    plot(t);
    title('SOLAR DATASET');
    subplot(212);
    plot(zt);
    title('SOLAR DATASET, STANDARDIZED')
    % [ Xo, Xoi, Aoi, To ] = preparets( neto, {}, {}, T );
    % to = cell2mat( To );
    % zto = zscore(to,1);
    % varto1 = mean(var(to',1));
    % minmaxto = minmax([ to ; zto ])
    % GEH7 = [ ' MINMAX zt CHECK FOR OUTLIERS? ' ]
    % STJ = [ ' FIRST ITERATION I USE FILTFIT, HOPE TO USE WAVELET FILTERING IN THE NEAR FUTURE ']
    % rng('default');
    rng('shuffle');
    L = floor(0.95*(2*N-1)); 
    % GEH8 = [ 'JUST USE Ntrn, Ltrn FOR SIGLAGS ' ]
    % ST8 = [ ' THE  CODE BELOW, DOES IT LOOK OK? ']
    for i = 1:100
        n = randn(1,N);
        autocorrn = nncorr( n,n, N-1, 'biased');
        sortabsautocorrn = sort(abs(autocorrn));
        thresh95(i) = sortabsautocorrn(L);
    end
    sigthresh95 = mean(thresh95); % 0.2194 % tested with * (times) 3 to decrease used fractions (outside dotted line)
    % GEH9 = [ ' UH-OH ...THAT"S NOT WHAT I GET ! ' ]
    % STJ9 = [ ' I UNDERSTAND AND AGREE, 0.2194 WAS FROM AN EARLIER SCRIPT ' ]
    % GEH10 = [' ALSO GOOD TO KNOW OTHER STATS ']
    % STJ10 = [ ' ALWAYS GOOD TO GET A FEEL FOR THE DATA, ANY OTHER PURPOSES? PERHAPS PARAMETERS TO STUDY WHEN THE SIZE OF i IS INCREASED/DECREASED? ' ] 
    minthresh95 = min(thresh95); % 0.024638
    medthresh95 = median(thresh95); % 0.027242
    meanthresh95 = mean(thresh95); %0.0273
    stdthresh95 = std(thresh95); %0.0011573
    maxthresh95 = max(thresh95); %0.030326
    % [minthresh95 medthresh95 meanthresh95 stdthresh95 maxthresh95] 
    % GEH11 = [ 'MIGHT WANT TO KNOW WHAT IF max(i) > 100' ]
    % STJ11 = [ ' I HAVE WORKED WITH DIFFERENT SIZES OF i, INCREASING i SHOULD GIVE BETTER STATISTICAL REPRESENTATION OF THE ABOVE PARAMETERS (minthresh95 etc.), HOWEVER, AN INCREASED i CAUSES ONLY MINOR  CHANGES OF ABOVE PARAMETERS' ]
    autocorrt = nncorr(zt,zt,N-1,'biased');
    siglag95 = -1+ find(abs(autocorrt(N:2*N-1))>=sigthresh95);
    % This three lines below decreases the length of siglab95 and FD
    l_sig=length(siglag95);
    r_l_sig=ceil(l_sig/15); % 15 used when three times the length is not used
    siglag95=siglag95(1:r_l_sig);
    FD=siglag95(2:end);
    FD_length=length(FD);
    d=max(FD);
    % siglag95=0:1:156;
    % siglag95 = -1+ find(abs(autocorrt(N:2*N-1))>=0.6); 
    % ### Memory is maxing out if I use all lag values from the above line
    % probably this is one of the most important improvement areas ###
    % GEH12 = [ '1. ONLY NEED A SUFFICIENT SUBSET ' ...
    %                 ' 2. I THINK YOUR INITIAL CALCULATION IS FLAWED ' ]
    % STJ12 = [ ' THE LENGTH OF siglag95 IS DECREASED, HOWEVER, THE MOST SIGNIFICANT WAVELENGTHS ARE STILL TAKEN INTO ACCOUNT, FOR MORE INFO, SEE: ... 
    % 'https://se.mathworks.com/matlabcentral/answers/284253-it-is-not-possible-to-predict-wavelengths-longer-than-the-maximum-index-value-of-the-fd-in-narxnet-a ' ]
    plt = plt+1; figure(plt);
    hold on
    plot(0:N-1, -sigthresh95*ones(1,N),'b--');
    plot(0:N-1, zeros(1,N),'k');
    plot(0:N-1, sigthresh95*ones(1,N),'b--');
    plot(0:N-1, autocorrt(N:2*N-1));
    plot(siglag95,autocorrt(N+siglag95),'ro');
    title('REDUCED NUMBER OF AUTOCORRELATIONS IN SOLAR DATASET');
    % GEH13 = [ ' NOT SURE WHAT THAT MEANS ' ]
    % STJ13 = [ 'FOR THIS VERSION OF THE CODE I HAVE REDUCED THE LENGTH OF ' ...
    % ' SIGLAG95 TO ONLY INCLUDE ENOUGH DATAPOINTS SO THAT THE MOST SIGNIFICANT WAVELENGTH IS ACCONTED FOR ' ]
    % defining the global parameters
    % g_train_method='trainbr';
    % http://se.mathworks.com/help/nnet/ug/choose-a-multilayer-neural-network-training-function.html
    % Why isn't trainbr included in the above page? 
    % BR seems convincing, see http://www.ncbi.nlm.nih.gov/pubmed/19065804 
    g_no_epochs=5000;
    g_trainRatio=74; % 66.67 74 89 50 61 21_22
    g_valRatio=26; % 33.33 26 11 50 39 21_22
    g_testRatio=0;
    g_min_grad=2e-5;
    g_trainmaxfail=100;
    Ntrials=5;
    % (n)-loop or the for(j)-loop? What difference does these three alternatives give? ###
    % GEH14 = [ ' ALL YOU NEED TO KNOW IS WHAT THE RNG STATE IS' ...
    %                ' FOR A DESIGN YOU WANT TO EPRODUCE, SEE MY CODES' ]
    % STJ14 = [ ' I THINK THAT WITH THE APPROACH (FOR AND WHILE LOOPS) I USE IN THIS CODE THE RNG STATE 'shuffle' GIVES THE CORRECT  USAGE OF THE SCRIPT']
      NFD  = length(FD);   % 147
      LDB  = max(FD);      % 152
      Ns   = N-LDB;        % 2747
      Nseq = Ns*O;         % 2747
     % Nw = (NFD*O+1)*H+(H+1)*O % moved inside the first loop below
      Hub     =  -1+ceil( (Nseq-O) / (NFD*O+O+1));  % 18
      Hmax    =  ceil(Hub/10) %  Hmax = 2 ==>  Nseq >>Nw 
    min_nodes=1;
    max_nodes=Hmax; 
    % GEH15 = [ ' JUSTIFICATION FOR THESE NUMBERS ? ... Hub = ?' ]
    % STJ15 = [ ' Hub DEFINED ABOVE' ]
    % STJX1 = [ ' WHAT WOULD JUSTIFY MORE THAN ONE HIDDEN LAYER?  IT THERE ANY RESULT IN THIS SCRIPT THAT COULD SHOW THIS? E.G. NMSE, PREDICTIVE PEERFORMANCE  '...
    %         [ ' IS THIS A VALID REFERENCE FOR THE QUESTION? ftp://ftp.sas.com/pub/neural/FAQ3.html#A_hl ' ]
    % GEH16 = [ ' Ntrials = Number of trials; NOT trails ']
    % STJ16 = [ ' CORRECTED ' ]
    % GEH17 = [ ' Define narnet( FD, n) in outer loop. Ntrials of initial random weights in inner loop '...
    %                  ' but NOT random datadivision for timeseries .. get nonuniform time spacing' ]
    % 
    % STJ17 = [ ' NARNET IS NOW DEFINED IN OUTER LOOP. WITH THIS COMMENT "I did not try randomizing these states here yet, possible future improvement..." I MEANT THAT I WOULD TRY DIFFERENT LEVELS OF TRN/VAL/TST.  ']
    %           'DOES NOT net.divideFcn='divideblock GUARANTEE THAT THE TIME
    %           SPACING IS UNIFORM?'...
    %           'DEFINING NARNET IN THE OUT LOOP (AS IT LOOKS LIKE NOW DOES...
    %           'NOT GIVE RANDOM WEIGHTS THUS A REPETITION OF Ntrials... 
    %           ' WOULD NOT GIVE ANY ADDITIONAL INFORMATION ']
    % GEH18 = ['MORE LATER']
    % STJ18 = [ ' OK ']
    best_Y=[];
    NMSEin=Inf;
    i=0;
    for n=min_nodes:max_nodes
        Nw = (NFD*O+1)*n+(n+1)*O;
        % Nw when n = 1 : 154
        % Nw when n = 2 : 307
        % (HOWEVER, SIZE OF Nw SEEMS TO BE METHOD DEPENDANT)
        i=i+1; % Counter for indices inside the second loop (do not tie up anything to n).
        for j=1:Ntrials
               %%%%start copy 
                net = narnet(FD, n);
     % GEH19 = [ ' USE THE SHORTEST SUBSET AS POSSIBLE ' ] 
     % STJ19 = [ ' SEE COMMENTS ABOVE ON USING THE SIGNIFICANT WAVELENGTH ' ]
            % Setup Division of Data for Training, Validation, Testing
            net.trainFcn = g_train_method;
            net.divideFcn='divideblock'; % I did not try randomizing these states here yet, possible future improvement...
            net.trainParam.epochs = g_no_epochs;
            net.divideParam.trainRatio = g_trainRatio/100;
            net.divideParam.valRatio = g_valRatio/100;
            net.divideParam.testRatio = g_testRatio/100;
            net.trainParam.min_grad=g_min_grad;
            net.trainParam.max_fail=g_trainmaxfail;
            [Xo,Xoi,Aoi,To] = preparets(net,{},{},T);   
            %%%end copy
    % GEH20 = [ 'FOR TIMESERIES PREDICTION TEST SUBSET'... 
    %                 ' MUST OCCUR AFTER THE TRAINING AND VAL SUBSETS']
    % STJ20 = [ ' IS THIS BECAUSE THE TEST DATA SHOULD HAVE AS SIMILAR CHARATERISTICS AS POSSIBLE TO THE PREDICTED DATA? OR ARE THERE OTHER REASONS? ']
    % GEH21 = [ 'WHY 10 ?' ]
    % STJ21 = [ 'TO INVESTIGATE IF ERROR WOULD DECREASE. I'VE ALSO TRIED DIFFERENT LEVELS HERE, UP TO 100' ]
            [ net ,tr, Yo, Eo, Xof, Aof] = train(net,Xo,To,Xoi,Aoi);
            % view(net);
    %        Yo = net(Xo,Xoi,Aoi);
    %        Eo = gsubtract(To,Yo);
    % GEH22 = [ 'COMMENT PREVIOUS TWO STATEMENTS' ]
    % STJ22 = [ 'DONE' ]
            NMSEo_temp(i,j) = mse(Eo)/MSE00;
            epochs = tr.epoch(end);
    % GEH23 = [ ' BETTER TO JUST USE A REASONABLE NUMBER OFDELAYS ' ]
    % STJ23 = [ ' SEE COMMENT STJ13 ' ]
            [n j NMSEo_temp(i,j) mse(Eo) Nw epochs] % Just to show what step is currently being calculated and to show the important parameters
        end
        NMSEo=min(NMSEo_temp(i,:));
    % GEH24 = [' NO! FIND THE BEST DESIGN, NOT THE BEST AVERAGE!']
    % STJ24 = [ ' DONE, SEE ADDED WHILE LOOP BELOW ' ]
    % GEH25 = [ ' HOWEVER DO TABULATE SUMMARY STATS OF' ...
    %                ' MIN, MEDIAN, MEAN, STD AND MAX']
    % STJ25 = [ ' FOR THIS VERSION OF THE CODE I HAVE FOCUSED ON THE MINIMUM ERROR ']
        if (NMSEo<NMSEin)
            fprintf('Best minimmum NMSE so far: %4.6e (for %d Hidden nodes)\n',NMSEo,n);
            NMSEin=NMSEo;
            best_Y=Yo;
            N_use=n;
        end
    end
    % searching for the smallest NMSEo from initial Ntrials
    mean_NMSEo=mean(NMSEo_temp');
    std_NMSEo=std(NMSEo_temp');
    min_NMSEo=min(NMSEo_temp');

RESULTS

I have used the above code and investigated the following trn/val ratios (first number is trn, second is val): 50-50,61-39,74-26,89-11,93.5-6.5. The training method numbeer refers to the following training method (you will also be able to find this in the code for "main"):

% 1:    g_train_method='trainlm';
% 2:    g_train_method='trainbfg';
% 3:    g_train_method='trainrp';
% 4:    g_train_method='trainscg';
% 5:    g_train_method='traincgb';
% 6:    g_train_method='traincgf';
% 7:    g_train_method='traincgp';
% 8:    g_train_method='trainoss';
% 9:    g_train_method='traingdx';

Here is the "winning" NMSEo from the while loop. This NMSEo is selected on the basis that it is smaller (or equal) than the 10 NMSEo (Ntrials) generated in the above for loop.

Then I close the loop and obtain NMSEc, error from closing the loop with no training of the closed loop performed.

Here I have limited the y axis to 0 - 2. As you can see, no values are below 1.

Then I close the loop and also perform training of the closed loop. With this I obtain the following NMSEc2.

Here I have limited the y axis to 0 - 2. NMSEc2 is decreased compared to NMSEc, however, regardless of method I am still far away from any desired value.

Please let me know if you would like me to submit tables of the data instead of figures..(however, I hope the message that I have not obtained a desired low value of NMSEc or NMSEc2 is still clear). In this result section I could show numerous figures that portray the high values of NMSEc or NMSEc2 but I see little meaning is uploading figures with the following appearance:

METHOD - PART 2

So, back to the drawing board. I thought to myself, perhaps the solar dataset cannot be used for prediction. There might be some truth to this, I don't see any repeating pattern in the longer wavelengths (longer wavelengths are round 130 points, for more info see this post: https://se.mathworks.com/matlabcentral/answers/284253-it-is-not-possible-to-predict-wavelengths-longer-than-the-maximum-index-value-of-the-fd-in-narxnet-a). I decided to create a repeating pattern. I used three repetitions of a part of the solar dataset, namely (you can find this text in the function code, it is commented):

t=[t(477:2400) t(477:2400) t(477:2400)];

Start and end of sampling; 477 and 2400 are selected on the basis that the signal should be as continuous as possible. I used filtering and analysis of the diff to determine these values. The content of FD is similar for "part 1" and this "part 2". To not increase computational time too much I decreased Ntrials from 10 to 5. Here is a figure showing the "3*repeated solar dataset", I've added some black rectangles to make it easier to see reoccurring features.

For this analysis I have used a trn/val ratio of 66,67 - 33.33, I thought the model could generate a small NMSEo is the val set were identical to two reoccuring parts of the trn dataset.

RESULTS - PART 2

Here is the NMSEo for the 9 different methods using the "3*repeated solar dataset":

4.1823e-08    1.785e-06   0.00042742    1.031e-05   0.00013799   0.00011407   0.00013801   2.9156e-05   0.00096596

Here is the NMSEc (no training) for the 9 different methods using the "3*repeated solar dataset":

       2.2568       243.24        6.013       28.174       10.056       9.0195        10.83       10.871       5.2844

Here is the NMSEc2 (training is performed) for the 9 different methods using the "3*repeated solar dataset":

       2.2568       13.607       1.0029       27.614       10.056       8.9929        10.83       1.0002       1.0013

DISCUSSION

I could think of a few things to further develop the approach but, at this stage I would be very thankful for receiving a second opinion. I would be very happy if I've made one or a few simple mistakes that could give me a better NMSEc, I really thought that using a repeated signal as I did in part 2 would give me an improved prediction with a small residual error. If there are no simple fixes here, my next step is to investigate using all of the significant FD obtained from nncorr.

3 Comments
Show 1 older commentHide 1 older comment

Staffan on 27 May 2016

Hello Greg,

Newsgroup discussions are ok but can be difficult to follow due to that its not possible to include images and label code. However, Would you like me to submit this content in the newsgroup thread instead?

I've done some further improvements and the current working hypothesis is that it is not sufficient to include the enough data points in FD to account for the most significant wavelength(s), instead all of the significant FD from nncorr needs to be included. Why? Not sure yet. Hope to submit something during the weekend.

Regards

Staffan

Greg Heath on 27 May 2016

You are trying to analyze an unsolved problem, not giving a tutorial lecture on a proven technique.

Therefore, ANSWERS is the correct forum.

I don't like the idea of simultaneously posting in both forums. Rather, once you have a polished technique, post a tutorial in the NEWSGROUP with references to preliminary ANSWERS threads.

Greg

P.S. Interesting problem.

Sign in to comment.

Sign in to answer this question.

Answer 1

Greg Heath on 30 May 2016

0
Link

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/285851-suggestions-for-improvement-s-on-narnet-multistep-ahead-predictions-on-the-solar-dataset#answer_223825

Open in MATLAB Online

Here is my preliminary result with default 70/15/15 datadivision and H = 10 hidden nodes. However, NONDEFAULT DIVIDEBLOCK was used to impose TRUE FUTURE PREDICTION WITH CONSTANT TIMESTEPS!

 close all, clear all, clc, plt=0, tic
T = solar_dataset ;
t  = cell2mat(T) ;
[ O N ] = size(t)  % [ 1 2899 ]
 zt = zscore(t,1); tzt = [ t ; zt ];
meantzt = mean( tzt' )' %[ 51.727 ; -9.38e-16 ]
vartzt1   = var( tzt', 1)'  % [1937.3 ; 1 ]
minmaxtzt = minmax(tzt)
% minmaxtzt =      0        253.8
%               -1.1752    4.5911
% Possible outliers
Nout3 = numel(find(abs(zt) > 3))  % 27 ( < 0.93%)
Nout4 = numel(find(abs(zt) > 4))  % 4   ( < 0.14%)
% Keep outliers for now
 Ntst = round(0.15*N), Nval = Ntst % 435, 435
Ntrn = N-Nval-Ntst             % 2029
trnind = 1:Ntrn;               %    1 : 2029
valind = Ntrn+1:Ntrn+Nval;     % 2030 : 2464
tstind = Ntrn+Nval+1:N;        % 2465 : 2899
ttrn = t(trnind); tval = t(valind); ttst = t(tstind);
plt = plt+1, figure(plt), hold on
plot( trnind, ttrn,'b' )
plot( valind, tval,'g' )
plot( tstind, ttst, 'r' )
legend('TRAIN','VAL','TEST')
title( ' SOLAR DATASET TIMESERIES ' )
 rng('default');
Ltrn = floor(0.95*(2*Ntrn-1)) % 3854
imax = 200
for i = 1:imax
    n = randn(1,Ntrn);
    autocorrn        = nncorr( n,n, Ntrn-1, 'biased');
    sortabsautocorrn = sort(abs(autocorrn));
    thresh95(i)      = sortabsautocorrn(Ltrn);
end

For imax = 100 & 200:

 minthresh95  = min(thresh95)   % 0.0293 0.0293
medthresh95  = median(thresh95)% 0.0325 0.0325
meanthresh95 = mean(thresh95)  % 0.0325 0.0325
stdthresh95  = std(thresh95)   % 0.0015 0.0014
maxthresh95  = max(thresh95)   % 0.0355 0.0355
 sigthresh95 = meanthresh95 % 0.0325
 autocorrt = nncorr( zt, zt, Ntrn-1, 'biased' );
siglag95  = -1+ find(abs(autocorrt(Ntrn:2*Ntrn-1)) ...
          >=  sigthresh95);
Nsiglag95 = length(siglag95)       % 1591 ~ 0.78*Ntrn
d         = min(find(diff(siglag95)>1)) % 35
proof     = siglag95(d-1:d+1)        %  33  34  38 
 plt = plt+1; figure(plt);
hold on
plot(0:Ntrn-1, -sigthresh95*ones(1,Ntrn),'b--' );
plot(0:Ntrn-1, zeros(1,Ntrn),'k');
plot(0:Ntrn-1, sigthresh95*ones(1,Ntrn),'b--' );
plot(0:Ntrn-1, autocorrt(Ntrn:2*Ntrn-1));
plot( siglag95, autocorrt(Ntrn + siglag95), 'ro' );
title('SIGNIFICANT AUTOCORRELATIONS VALUES')
 to = t(d+1:N); 
No = N-d               %2864
Notst = round(0.15*No) %430
Noval = Notst          %430
Notrn = No-Noval-Notst %2004
 H = 10                           % Default
neto                 = narnet(1:d,H);
[ Xo, Xoi, Aoi, To ] = preparets(neto,{},{},T);
to                   = cell2mat(To);
[ O No ]             = size(to)    % [ 1 2864 ]
varto1               = var(to,1)   %  1948.6
 [neto tro Yo Eo Xof Aof ] = train(neto,Xo,To,Xoi,Aoi);
view(neto)
% Yo = net(Xo,Xoi,Aoi); Eo = gsubtract(To,To);
NMSEo = mse(Eo)/varto1            % 0.11505
NMSEtrno = tro.best_perf/varto1   % 0.10849
NMSEvalo = tro.best_vperf/varto1  % 0.12139
NMSEtsto = tro.best_tperf/varto1  % 0.13927
 Rsqo    = 1-NMSEo                 % 0.88495
Rsqtrno = 1-NMSEtrno              % 0.89151
Rsqvalo = 1-NMSEvalo              % 0.87861
Rsqtsto = 1-NMSEtsto              % 0.86073
 plotresponse(To,Yo)
totaltime = toc                   % 24.0
 % The next step is to use a double loop over hidden 
% nodes 
%
%   Hmax <= Hub = (Ntrn*O-O)/(d+O+1) % 54
%       
%  and initial RNG states (varies initial weights)
 Hope this helps.

Greg

2 Comments
Show NoneHide None

Greg Heath on 31 May 2016

Edited: Greg Heath on 31 May 2016

Open in MATLAB Online

I have found 4 errors in the previous post:

d = 34
DIVIDEBLOCK WAS NOT IMPLEMETED.
Significant lags were obtain via zt instead of ztrn.
The RNG was not initialized before training

I will replace it ASAP.

Greg

Staffan on 31 May 2016

Edited: Staffan on 31 May 2016

Hello Greg,

I am impressed! I have had too little time to work with the code past few days, now I had a few minutes to spare which gave me the opportunity to load your code into a script and try it out. I can't see how you are able to obtain such a low NMSEo using only 35 delay time steps (I needed a few 100 to obtain reasonable NMSEo's) but I will investigate this as soon as I have the possibility to do so (adding the steps your mention above to the code will not increase NMSEo or require more delay time steps as I see it (this does not diminish the importance of e.g. sticking to using divideblock for these types of challenges)). Once I am on top of the code I will try filtering and a few other techniques to work with the outliers.

Again, truly impressed.

Sincerely

Staffan

Sign in to comment.

Answer 2

Greg Heath on 1 Jun 2016

1
Link

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/285851-suggestions-for-improvement-s-on-narnet-multistep-ahead-predictions-on-the-solar-dataset#answer_224085

Edited: Greg Heath on 1 Jun 2016

Open in MATLAB Online

close all, clear all, clc, plt=0, tic

% FOR UNBIASED PREDICTION OF FUTURE SOLAR FLARE EVENTS, 70/15/15 DIVIDEBLOCK DATADIVISION IS USED TO INSURE

1. UNIFORM TIME SAMPLING
2. ONLY TRAINING DATA (THE FIRST 70%) IS EXPLICITLY 
  USED TO CALCULATE NETWORK PARAMETERS
HOWEVER, PERFORMANCE ON VALIDATION DATA (THE FOLLOWING  
15%) IS MONITORED TO STOP TRAINING WHEN GENERALIZATION TO 
GOOD PERFORMANCE ON NONTRAINING DATA BEGINS TO FAIL
 T      = solar_dataset ;    t  = cell2mat(T) ;
[ O N ] = size(t)  % [ 1 2899 ]
 zt        = zscore(t,1);   tzt = [ t ; zt ];
 meantzt   = mean( tzt' )'  %[ 51.727 ; -9.38e-16 ]
 vartzt1   = var( tzt', 1)' % [1937.3 ; 1 ]
 minmaxtzt = minmax(tzt)
 % minmaxtzt =  0       253.8
 %             -1.1752    4.5911
 % Possible outliers
 Nout3 = numel(find(abs(zt) > 3))  % 27 ( < 0.93%)
 Nout4 = numel(find(abs(zt) > 4))  % 4   ( < 0.14%)
 % NOTE: WILL CURRENTLY IGNORE OUTLIER MITIGATION
 Ntst   = round(0.15*N), Nval = Ntst % 435, 435
 Ntrn   = N-Nval-Ntst       % 2029
 trnind = 1:Ntrn;           % 1 : 2029
 valind = Ntrn+1:Ntrn+Nval; % 2030 : 2464
 tstind = Ntrn+Nval+1:N;    % 2465 : 2899
 ttrn = t(trnind); tval = t(valind);ttst = t(tstind);
 plt  = plt+1, figure(plt), hold on
 plot( trnind, ttrn,'b' )
 plot( valind, tval,'g' )
 plot( tstind, ttst, 'r' )
 legend('TRAIN','VAL','TEST')
 title( ' SOLAR DATASET TIMESERIES ' )
 rng('default');
 Ltrn  = floor(0.95*(2*Ntrn-1)) % 3854
 imax = 200
 for i = 1:imax
    n                = randn(1,Ntrn);
    autocorrn        = nncorr( n,n, Ntrn-1, 'biased');
    sortabsautocorrn = sort(abs(autocorrn));
    thresh95(i)      = sortabsautocorrn(Ltrn);
end
                        % imax =  100     200
 minthresh95  = min(thresh95)   % 0.0293  0.0293
 medthresh95  = median(thresh95)% 0.0325  0.0325
 meanthresh95 = mean(thresh95)  % 0.0325  0.0325
 stdthresh95  = std(thresh95)   % 0.0015  0.0014
 maxthresh95  = max(thresh95)   % 0.0355  0.0355
 sigthresh95 = meanthresh95 % 0.0325
 zttrn        = zscore(ttrn',1)';
 autocorrttrn = nncorr( zttrn, zttrn, Ntrn-1, 'biased' );
 siglag95     = find(abs(autocorrttrn(Ntrn+1:2*Ntrn-1)) ...
              >= sigthresh95);                 
 Nsiglag95    = length(siglag95)       % 1327 ~ 0.65*Ntrn
 d            = min(find(diff(siglag95)>1))% 35
 proof        = siglag95(d-1:d+1)          % 34 35 39
 plt = plt+1; figure(plt);  hold on
 plot(0:Ntrn-1, -sigthresh95*ones(1,Ntrn),'b--' );
 plot(0:Ntrn-1, zeros(1,Ntrn),'k');
 plot(0:Ntrn-1, sigthresh95*ones(1,Ntrn),'b--' );
 plot(0:Ntrn-1, autocorrttrn(Ntrn:2*Ntrn-1));
 plot( siglag95, autocorrttrn(Ntrn + siglag95), 'ro' );
 title('SIGNIFICANT AUTOCORRELATIONS VALUES')
 to    = t(d+1:N);
 No    = N-d            % 2864
 Notst = round(0.15*No) % 430
 Noval = Notst          % 430
 Notrn = No-Noval-Notst % 2004
 H  = 10              % Default
 neto                 = narnet(1:d,H);
 neto.divideFcn       = 'divideblock';
 [ Xo, Xoi, Aoi, To ] = preparets(neto,{},{},T);
 to = cell2mat(To);     varto1 = var(to,1)  % 1948.6

% Desire number of unknown weights Nw = (d+1)*H+(H+1)*O to not exceed number of training equations Notrneq = Notrn*O ==> H < = Hub where

 Hub     = (Notrn*O-O)/(d+O+1) % 54.14
 Hmin    = 10, dH = 10, Hmax = 50
 Ntrials = 10
 rng('default')
 j  =  0
 for h = Hmin:dH:Hmax
     j = j+1
    neto.layers{1}.dimensions = h;
    for i = 1:Ntrials
        neto      = configure(neto,Xo,To);
        [neto tro Yo Eo Xof Aof ]=train(neto,Xo,To,Xoi,Aoi);
        %[Yo Xof Aof =net(Xo,Xoi,Aoi);Eo=gsubtract(To,Yo);
        NMSEo(i,j)    = mse(Eo)/varto1;            
        NMSEtrno(i,j) = tro.best_perf/varto1;     
        NMSEvalo(i,j) = tro.best_vperf/varto1;   
        NMSEtsto(i,j) = tro.best_tperf/varto1;    
    end
end
 plotresponse(To,Yo)
 NumH      = Hmin:dH:Hmax
 Rsqo      = 1-NMSEo         
 Rsqtrno   = 1-NMSEtrno       
 Rsqvalo   = 1-NMSEvalo        
 Rsqtsto   = 1-NMSEtsto       
 totaltime = toc 
 %                        Rsqo   
% H=10      20      30      40      50 
% 0.877   0.881*  0.878   0.790   0.775
% 0.880*  0.883*  0.776   0.744   0.794
% 0.858   0.847   0.768   0.650   0.834
% 0.436   0.886*  0.681   0.859   0.762
% 0.879   0.809   0.516   0.803   0.819
% 0.878   0.714   0.883*  0.646   0.880*
% 0.881*  0.881*  0.860   0.841   0.781
% 0.879   0.864   0.810   0.881*  0.778
% 0.887*  0.879   0.641   0.882*  0.549
% 0.875   0.869   0.883*  0.657   0.714
% 
%                       Rsqtrno 
%H=10      20       30      40      50 
% 0.900   0.907   0.915   0.877   0.870
% 0.906   0.906   0.905   0.906   0.948
% 0.901   0.939   0.902   0.892   0.865
% 0.902   0.915   0.933   0.929   0.956
% 0.907   0.871   0.904   0.884   0.902
% 0.905   0.895   0.903   0.968   0.903
% 0.899   0.897   0.929   0.950   0.911
% 0.913   0.928   0.839   0.897   0.953
% 0.916   0.896   0.738   0.900   0.883
% 0.902   0.911   0.929   0.943   0.851
% 
%                   Rsqvalo 
%H=10       20       30      40     50 
% 0.865   0.860    0.833   0.788   0.744
% 0.863   0.860    0.769   0.689   0.627
% 0.835   0.780    0.769   0.742   0.831
% 0.756   0.860    0.780   0.763   0.631
% 0.866   0.811    0.584   0.729   0.748
% 0.864   0.578    0.863   0.553   0.862
% 0.872*  0.875*  0.839   0.771   0.730
% 0.854   0.777    0.797   0.874* 0.770
% 0.855   0.874*   0.553   0.875* 0.687
% 0.865   0.856    0.821   0.730   0.678
% 
%                    Rsqtsto 
%H=10       20      30      40      50 
% 0.785   0.777   0.752   0.388   0.359
% 0.777   0.801*  0.175   0.039   0.243
% 0.679   0.484   0.140  -0.570   0.688
% -2.06    0.778  -0.597   0.627  -0.014
% 0.764   0.510  -1.363   0.497   0.503
% 0.768   0.006   0.806* -0.762   0.786
% 0.807*  0.814*  0.559   0.406   0.223
% 0.745   0.650   0.685   0.814* -0.031
% 0.783   0.805*  0.274   0.805* -1.150
% 0.758   0.689   0.730  -0.752   0.113
% 
%These test Rsquare values are not high enough to obtain 
% a useful closed loop configuration by just closing the loop.
% The next step is to continue training the net after closing 
% the loop and, if unsuccessful, to consider training the CL 
% configuration with a new set of initial random weights.

Hope this helps.

Thank you for formally accepting my answer

Greg

0 Comments
Show -2 older commentsHide -2 older comments

Sign in to comment.

Answer 3

Greg Heath on 27 May 2016

0
Link

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/285851-suggestions-for-improvement-s-on-narnet-multistep-ahead-predictions-on-the-solar-dataset#answer_223589

Edited: Greg Heath on 28 May 2016

Open in MATLAB Online

 For UNBIASED FUTURE PREDICTIONS of the SOLAR_DATASET, I see the 
initial attack structured as follows:
 1. DIVIDEBLOCK DATADIVISION
 2. DEFAULT 0.7/0.15/0.15 TRN/VAL/TST datadivision ratios
 3. TRN subset is known. VAL and TST subsets are unknown.
 4. STATIONARY SUMMARY STATISTICS so that the
    min/median/mean/max/stdv/siglags
    of VAL & TST subsets are ASSUMED to be the same as those of the 
    TRN subset.
 5. Emphasis on estimating a good combination of feedback delays FD = 
    1:d  and number of hidden nodes, H, to obtain Rsqc >= 0.99 
    (i.e., NMSEc <= 0.01 ).
 6. TRAINBR is only considered when the number of training equations, 
    Ntrneq ~ 0.7*Ntrn*O, is not sufficiently larger than the number of 
    unknown weights Nw = (d+1)*H+(H+1)*O. 
 7. Although performance function MSEREG is obsolete and has been 
    replaced by the REGULARIZATION OPTION in MSE, it is STILL 
    AVAILABLE! Unfortunately, there is a BUG in TRAINBR that does not 
    allow the use of a VAL subset. I say unfortunate because recent 
    investigations have shown that the combination can achieve 
    superior results!
 8. My initial investigations have shown that
    a. The summary stats are not stationary. 
    b. There are outliers which may be significantly detrimental.
 9. The next steps are
    a. Estimate the significant lags of the TRN subset
    b. Try to find combinations of d and H to minimize Nw subject to 
    the OL (i.e., OPEN LOOP) performance constraint Rsqo >= 0.999 
    ( i.e., NMSEo <= 0.001 ).
    c. Determine what has to be done to the OL & CL (CLOSELOOP)
    configurations to optimize the CL  performance. If necessary, 
    consider outlier modifications.

Hope this helps.

Thank you for formally accepting my answer

Greg

0 Comments
Show -2 older commentsHide -2 older comments

Sign in to comment.

Answer 4

Staffan on 11 Aug 2016

0
Link

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/285851-suggestions-for-improvement-s-on-narnet-multistep-ahead-predictions-on-the-solar-dataset#answer_231461

Edited: Staffan on 12 Aug 2016

Dear readers,

I worked many hours on the following, the steps below describe how I generated a prediction model that could generate a decent prediction. I’ve chosen to describe the most important features I’ve used with a list, starting with the most significant item. It should also be mentioned that my findings are built on working with the solar dataset, for other datasets different methods might be more applicable. There number in the list below describes a feature in the code, the the after the line of minus signs (--------------) describes what parameters I’ve used for the specific feature.

It is important to select a feedback delay (FD) that can generate convergence in closed loop training. Convergence in this is that the amplitude and the frequency of the prediction vs. the historical data should at least be similar. I’ve discovered that it is NOT sufficient to use a FD that covers the most significant wavelengths ( https://se.mathworks.com/matlabcentral/answers/284253-it-is-not-possible-to-predict-wavelengths-longer-than-the-maximum-index-value-of-the-fd-in-narxnet-a ). I’ve done quite a few investigations using a FD with a smaller number of elements but have not been successful with a converging prediction for such cases. Basically, it does not matter if a large number of neurons are used, if the FD is too short non-convergence will occur. Perhaps (“perhaps” means that I’m not done evaluating this) a good indicator of that enough number of elements has been included in the FD is that MSE continues to decrease in the closed loop training. ----------------------------------------------------------------------------------------------------- In generating the result below a FD with a length of 76 % of siglags has been used. This means that FD will have a length of a bit more than 1000 elements.
The number of nodes is a difficult parameter, perhaps the most difficult parameter to define and understand. I have found that the statistical spread of predictability is larger for re-runs using the same amount of nodes compared to the mean value (15 predictions per x amount of nodes was used) of the predictability for different nodes. I think I see that a very small amount of nodes does not allow for the complexity required to generate a prediction signal that consists of several wavelengths. A too large amount of nodes might allow a prediction signal that is “too” complex. In my early work on this I was expecting a nice 2 degree polynomial showing a minimum error on the amount of nodes that would generate the smallest error, however, this is one example of what I got (X is the number of nodes, Y is an amplitude error): -----------------------------------------So, there is no simple answer to most optimal amount of nodes but with a significant amount of repetitions I have made the conclusion that 6 nodes (again, with a FD of 76 % of the siglags) does at least not give the worst result.
I would like to think that ‘divideint’ should work better in most cases compared to ‘divideblock’. -----------------I’ve used divideint with 50 % training data + 50 % validation data. With this, all one have to be certain of is that the sampling frequency is higher enough to cover the wavelengths that are of interest to predict. If I would go with ‘divideblock’ I don’t think I can say for sure that I have enclosed the interesting repeating pattern, regardless of the sizes of trn and val in ‘divideblock’.
For evaluating the predictability of different methods I don’t use tst data at all (most commonly used is 70 % trn, 15 % val and 15 % tst). I find it more pedagogic to discard the part of the original data I want to perform prediction on at an early stage and then use this data to compare with the generated prediction at a later stage. ---------------------------------------------------------------------------------------------------------------------------------------------- In the calculation I have performed I use the last 10 % meaning that if the total dataset is 3000 data points I use 2700 data points as trn and val data, use the generated model to predict 300 point into the future and compare this data with the real 300 datapoints. Perhaps this is in a sense the same as comparing R2 for different tst data, however, when comparing with unseen data the way I do it I can more easily understand the different between the prediction and the real tst data.
I’ve evaluated all available training methods, both for open and closed training (but ). I can’t use trainlm or trainbr due to the required memory for the large FD I use (also, I’ve found a bug associated to this matter that exists in R2016 that does not exist in R2012). Using different training methods will still generate the same amount of weight so a training method that is fast and generates a small MSE should be OK I guess. Well, I did not look at MSE when selecting the training method but prediction capabilities. -------------------------I’ve found that trainscg preforms well. Also, this method is quite fast. Other methods have a difficult time with the long FD.
I’ve, in all cases trained the closed loop after the open loop. This does not give a significant improvement to the prediction but there is an improvement.
Quantification of the accuracy of predictions ------------------------------------------------------------------------------------------I’ve not only used the distance between the predicted signal and the real tst data as a quantifier, I’ve also looked at how often the sign of the derivative is matching between the predicted signal and the real tst data. The reason for this is that in some cases it might be ok to have over- or undershoot if the frequency and datum content of the two signals is similar.
Rgn setup ------------------------------------------------------------------------------------------------------------------------------------Rgn(‘shuffle’) was used.
This is not related to the code but I’ve worked with GPU acceleration and I haven’t seen any speedup compared to only using the CPU. I think GPU computing are better using with CNN and image processing. More info here: https://se.mathworks.com/matlabcentral/answers/291744-time-series-prediction-using-neural-networks-narnet-narxnet-is-it-at-all-possible-to-train-a-ne

If you have questions on the above, please add the number in the list in your question.

So, to the results. As said, there is a large statistical spread in the predicted, even of the amount of nodes are constant. Consider the following scenario; let’s say I produce 10 predictions (like those seen in the figures below). I might be able to say that some predictions are less representative, even without knowledge of the real tst data. If e.g. the one prediction fluctuates between -2 and +2 and the trn and val data all lie between -1 and +1, then it might be possible that this prediction should be discarded. The same goes for frequency, I might be able to discard prediction based on frequency content that does not match the original (referring to trn and val data) signal. I’ve made 15 predictions, using 6 nodes and 76 % of siglags.

Figure below illustrates a prediction that is a bit small error than the average error calculated based on predicted vs. real tst data (based on the supporting claims in the section above that I might be able to discard some predictions that are “flying through the roof”) than the average of these 15 predictions:

Figure below illustrates the best result of the 15 predictions:

Figure below illustrates the worst result of the 15 predictions:

Please notice the large difference between the best and the worst prediction. The numerical error for the worst prediction is roughly three times larger compared to the best prediction.

In future development I am planning to look further into data division, it would be quite interesting to design other types of data divition setups that are not part of the standard (= existing) selection.ii. If the frequency of a dataset is towards the lower range of would would be desired, don’t you think it could be possible to oversample the signal with a factor of 2 using interp (linear) and then using added data, every second point as validation data (= still using 50 % trn and 50 % val)? By doing this, little of no information should be lost, this could perhaps be regarded as “getting the validation data for free”.

4 Comments
Show 2 older commentsHide 2 older comments

Staffan on 23 Aug 2016

Open in MATLAB Online

Dividetrain --> I’ve put some hours on this; This method does not increase the accuracy of the prediction. I don’t think it is impossible to see improvement using dividetrain, however, I see three detriments: (1) increased likelihood of over fitting, (2) increased computational time and (3) only marginal room for improvement compared to divideint.

Divideint 34/33/33 --> At this early stage I still like produce a prediction that I can visualise. I think that parameters such as NMSE or R2 for tst values are applicable but I think such parameters could be more usable in an optimization (at a later stage, so to say). If I can visualise a prediction in a graph I analyse both the accuracy of wavelengths and amplitudes in one go.

DIVIDE into stationary subsections --> You are here thinking about e.g. using divideblock 44/26/30 and to be sure that the data is (more) stationary at each of these divisions? Regarding stationary data, do you have a suggestion that you think I should read on this subject? I get the concept of that it’s not that easy to perform predictions on data that contains larger trends and I’ve brushed up on Dickey–Fuller test (from wiki, mostly) but I think there is a better reference more applicable to neural networks than the ones I’ve come across.

Difference the series to reduce non-stationarity --> Are you here suggesting that I should perform the NN calculation of the diff of the dataset?...simply :

T = solar_dataset;
t = cell2mat(T); 
t_diff=diff(t);
T_diff=num2cell(t); 
[ O , N ] = size(t); 
   …  and so forth using t_diff and T_diff instead of t and T

Greg Heath on 23 Aug 2016

Open in MATLAB Online

% Dividetrain --> I’ve put some hours on this; This method does not increase the accuracy of the prediction. I don’t think it is impossible to see improvement using dividetrain, however, I see three detriments: (1) increased likelihood of over fitting, (2) increased computational time and (3) only marginal room for improvement compared to divideint.

(1) I have been operating under the edict: For unbiased timeseries prediction

max(indtrn,indval) < min(indtst)

So far, no success. Frustrated, I have cheated and looked at DIVIDETRAIN mainly to find out why everthing is failing. Surprisingly, even this has failed.

Now I'm pretty sure that the basic problems are

   a. Nonstationarity
   b. Insufficient use of significant delays.
   c. I'm on vacation and my wife wants to denerd me for a week.

(2) Overfitting is never a problem for me because I always

a. Compare the number of training equations Ntrneq with the number of 
 unknown weights Nw (Search NEWSREADER and ANSWERS using Ntrneq Nw).
b. Try to minimize the number of hidden nodes subject to a maximum allowed 
 error

(3). So far training time has been no problem at all.

(4). With default delays , nothing works for me. I think skipping the estimation of significant delays is my biggest mistake.

% Divideint 34/33/33 --> At this early stage I still like produce a prediction that I can visualise. I think that parameters such as NMSE or R2 for tst values are applicable but I think such parameters could be more usable in an optimization (at a later stage, so to say). If I can visualise a prediction in a graph I analyse both the accuracy of wavelengths and amplitudes in one go.

% DIVIDE into stationary subsections --> You are here thinking about e.g. % using divideblock 44/26/30 and to be sure that the data is (more) stationary at each of these divisions? Regarding stationary data, do you have a suggestion that you think I should read on this subject? I get the concept of that it’s not that easy to perform predictions on data that contains larger trends and I’ve brushed up on Dickey–Fuller test (from wiki, mostly) but I think there is a better reference more applicable to neural networks than the ones I’ve come across.

My experience with trend removal has been limited to polynomials and the equivalent of replacing variables with differences.

% Difference the series to reduce non-stationarity --> Are you here suggesting that I should perform the NN calculation of the diff of the dataset?...simply : % % T = solar_dataset; % t = cell2mat(T); % t_diff=diff(t); % T_diff=num2cell(t); % … and so forth using t_diff and T_diff instead of t and T

Yes. This is standard procedure. In fact even higher order diferences are used (differences remove linear trends/2nd differences remove quadratic trends, etc).

Frustrating problem.

More later

Greg

Shashank Bhatia on 12 Aug 2018

I think this post has most from Greg and Staffan. Thank you both for this deeper discussion into the matter on which Mathworks have failed to give appropriate examples.

Staffan it would be best if you can post your final code for this example.

Greg, I request you to please add if you have any further developments on this example.

I am trying to compile all the relevant information regarding NARNET from various threads by Greg to make 1 post which covers the most.

Thanks once again.

Sincerely, Shashank

Sign in to comment.

Answer 5

Saira AlZadjali on 14 Oct 2018

0
Link

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/285851-suggestions-for-improvement-s-on-narnet-multistep-ahead-predictions-on-the-solar-dataset#answer_341334

Open in MATLAB Online

data.xlsx

Dear Greg,

i am working on almost same, predicting wind speed using narnet it have been long time, did lots of changes in network parameters, but all giving almost same accuracy or worse. so read more in deep, in ur answers here. and got to know that we need to do autocorrelation to get the FD. thus i followed the method above for my dataset. but i am getting d value very large :( can you help to find suitable delay and hidden layer neurons, i am getting open loop MSE = range (0.34xx to 0.38xx) for all tries which i did with different Delay and Hidden neurons, with different training function, and algorithms) finally i came to do this autocorrelation method, but here again stuck, d is "8445" which is very huge number.. please advise //

    data_1= xlsread('G:\SQU2\Solar_Wind_Project\MAtlabCodes\data');
T = data_1 (1:79146,2); 
T=T';
t  = T ;
[ O N ] = size(t)  % [ 1 79146 ]
 zt        = zscore(t,1);   tzt = [ t ; zt ];
 meantzt   = mean( tzt' )'  %[ 6.2702 ; 0.0000 ]
 vartzt1   = var( tzt', 1)' % [10.4444 ; 1.0000 ]
 minmaxtzt = minmax(tzt)
 % minmaxtzt =  0.2100   18.1810
 %  -1.8752    3.6855
 % Possible outliers
Nout3 = numel(find(abs(zt) > 2))  % 3305 
Nout4 = numel(find(abs(zt) > 3))  % 221   
% NOTE: WILL CURRENTLY IGNORE OUTLIER MITIGATION
Ntst   = round(0.15*N), Nval = Ntst % 11872, 11872
Ntrn   = N-Nval-Ntst       % 55402
trnind = 1:Ntrn  ;         % 1 : 55402
valind = Ntrn+1:Ntrn+Nval; % 55403 : 67274
tstind = Ntrn+Nval+1:N;    % 67275 : 79146
ttrn = t(trnind); tval = t(valind);ttst = t(tstind);
 plt  = plt+1, figure(plt), hold on
 plot( trnind, ttrn,'b' )
 plot( valind, tval,'g' )
 plot( tstind, ttst, 'r' )
 legend('TRAIN','VAL','TEST')
 title( ' WIND SPEED DATASET TIMESERIES ' )
 rng('default');
 Ltrn  = floor(0.95*(2*Ntrn-1)) % 105262
imax = 100
 for i = 1:imax
    n                = randn(1,Ntrn);
    autocorrn        = nncorr( n,n, Ntrn-1, 'biased');
    sortabsautocorrn = sort(abs(autocorrn));
    thresh95(i)      = sortabsautocorrn(Ltrn);
end
  % imax =  100     
 minthresh95  = min(thresh95)   %  0.0061  
 medthresh95  = median(thresh95)% 0.0062  
 meanthresh95 = mean(thresh95)  % 0.0062  
 stdthresh95  = std(thresh95)   % 5.1053e-05  
 maxthresh95  = max(thresh95)   % 0.0064 
 sigthresh95 = meanthresh95 %   0.0062
 zttrn        = zscore(ttrn',1)';
 autocorrttrn = nncorr( zttrn, zttrn, Ntrn-1, 'biased' );
 siglag95     = find(abs(autocorrttrn(Ntrn+1:2*Ntrn-1)) ...
              >= sigthresh95);                 
 Nsiglag95    = length(siglag95)       % 52291 ~ 0.9438*Ntrn
 d            = min(find(diff(siglag95)>1))% 8445
 proof        = siglag95(d-1:d+1)          % 8444 8445 8470
plt = plt+1; figure(plt);  hold on
 plot(0:Ntrn-1, -sigthresh95*ones(1,Ntrn),'b--' );
 plot(0:Ntrn-1, zeros(1,Ntrn),'k');
 plot(0:Ntrn-1, sigthresh95*ones(1,Ntrn),'b--' );
 plot(0:Ntrn-1, autocorrttrn(Ntrn:2*Ntrn-1));
 plot( siglag95, autocorrttrn(Ntrn + siglag95), 'ro' );
 title('SIGNIFICANT AUTOCORRELATIONS VALUES')
 to    = t(d+1:N);
 No    = N-d            % 70701
 Notst = round(0.15*No) % 10605
 Noval = Notst          % 10605
 Notrn = No-Noval-Notst % 49491

0 Comments
Show -2 older commentsHide -2 older comments

Sign in to comment.

SUGGESTIONS FOR IMPROVEMENT(S) ON NARNET MULTISTEP AHEAD PREDICTIONS ON THE SOLAR DATASET?

3 Comments
Show 1 older commentHide 1 older comment

Accepted Answer

2 Comments
Show NoneHide None

More Answers (4)

0 Comments
Show -2 older commentsHide -2 older comments

0 Comments
Show -2 older commentsHide -2 older comments

4 Comments
Show 2 older commentsHide 2 older comments

0 Comments
Show -2 older commentsHide -2 older comments

See Also

Categories

Tags

Products

Community Treasure Hunt

SUGGESTIONS FOR IMPROVEMENT(S) ON NARNET MULTISTEP AHEAD PREDICTIONS ON THE SOLAR DATASET?

3 Comments Show 1 older commentHide 1 older comment

Accepted Answer

2 Comments Show NoneHide None

More Answers (4)

0 Comments Show -2 older commentsHide -2 older comments

0 Comments Show -2 older commentsHide -2 older comments

4 Comments Show 2 older commentsHide 2 older comments

0 Comments Show -2 older commentsHide -2 older comments

See Also

Categories

Tags

Products

Community Treasure Hunt

3 Comments
Show 1 older commentHide 1 older comment

2 Comments
Show NoneHide None

0 Comments
Show -2 older commentsHide -2 older comments

0 Comments
Show -2 older commentsHide -2 older comments

4 Comments
Show 2 older commentsHide 2 older comments

0 Comments
Show -2 older commentsHide -2 older comments