Asked by Teck Kong Chong
on 13 Sep 2016

Dear all, I am trying to find ID, FD and hidden nodes for my Narx network with multiple input. In this case I am using the PH Data set which has 2 inputs and one output. I have search pretty much the entire ANSWER and NEWSROOM to see how this can be done but it appears that the questions and answers are around single input data set such as SIMPLENARX data set. I have modified the code posted earlier at: https://au.mathworks.com/matlabcentral/answers/296646-finding-optimal-id-fd-and-hidden-nodes-for-narxnet

my codes are as follow:

close all, clear all, clc,

tic

plt=0;

%[X,T] = PH_dataset;

[X,T] = ph_dataset;

x = cell2mat(X);

t = cell2mat(T);

[ I N ] = size(x); % [ 2 2001 ]

[ O N ] = size(t); % [ 1 2001 ]

% Define data for training

Ntrn = N-2*round(0.15*N) % 0.7/0.15/0.15 trn/val/tst ratios

trnind = 1:Ntrn;

Ttrn = T(trnind);

Ntrneq = prod(size(Ttrn)) % 1401

MSE00 = var(t',1) % 6.86

% Calculate Z-Score for input (x) and target (t), split input series and

% calculate z-Score separately

x1 = x(1,:);

x2 = x(2,:);

zx1 = zscore(x1,1);

zx2 = zscore(x2,1);

zt = zscore(t, 1);

zx1trn = zscore(x1(trnind), 1);

zx2trn = zscore(x2(trnind), 1);

zttrn = zscore(t(trnind), 1);

% Plot Input & Output for both original and transformed (Z-scored)

plt = plt+1,figure(plt);

subplot(221)

plot(x1)

hold on

plot(x2)

title('PH DATASET INPUT SERIES')

subplot(222)

plot(zx1)

hold on

plot(zx2)

title('STANDARDIZED INPUT SERIES')

subplot(223)

plot(t)

title('PH DATASET OUTPUT SERIES')

subplot(224)

plot(zt)

title('STANDARDIZED OUTPUT SERIES')

rng('default')

L = floor(0.95*(2*N-1)) % 189

for i = 1:1000 % Number of repetations to use for estimating summary statistics

% This is for Target (T) Autocorrelation

n = zscore(randn(1,N),1);

autocorrn = nncorr( n,n, N-1, 'biased');

sortabsautocorrn = sort(abs(autocorrn));

thresh95T(i) = sortabsautocorrn(L);

% This is for Input-Target (IT) Cross-correlation

nx = zscore(randn(1,N),1);

nt = zscore(randn(1,N),1);

autocorrnIT = nncorr( nx,nt, N-1, 'biased');

sortabsautocorrnIT = sort(abs(autocorrnIT));

thresh95IT(i) = sortabsautocorrnIT(L);

end

% For Target Autocorrelation

sigTthresh95 = median(thresh95T)% 0.0327

meanTthresh95 = mean(thresh95T) % 0.0327

minTthresh95 = min(thresh95T) % 0.0291

medTthresh95 = median(thresh95T) % 0.0327

stdTthresh95 = std(thresh95T) % 0.0011

maxTthresh95 = max(thresh95T) % 0.0358

% For Input-Target Autocorrelation

sigITthresh95 = median(thresh95IT)% 0.0326

meanITthresh95 = mean(thresh95IT) % 0.0327

mintIThresh95 = min(thresh95IT) % 0.0286

medtIThresh95 = median(thresh95IT) % 0.0326

stdtIThresh95 = std(thresh95IT) % 9.1674e-4

maxtIThresh95 = max(thresh95IT) % 0.0351

%%CORRELATIONS

%%%%%TARGET AUTOCORRELATION %%%%%%%

%

autocorrt = nncorr(zttrn,zttrn,Ntrn-1,'biased');

sigflag95 = -1+ find(abs(autocorrt(Ntrn:2*Ntrn-1))>=sigTthresh95);

sigflag95(sigflag95==0)=[]; % Remove 0 from FD

%

plt = plt+1, figure(plt);

hold on

plot(0:Ntrn-1, -sigTthresh95*ones(1,Ntrn),'b--')

plot(0:Ntrn-1, zeros(1,Ntrn),'k')

plot(0:Ntrn-1, sigTthresh95*ones(1,Ntrn),'b--')

plot(0:Ntrn-1, autocorrt(Ntrn:2*Ntrn-1))

plot(sigflag95,autocorrt(Ntrn+sigflag95),'ro')

title('SIGNIFICANT TARGET AUTOCORRELATIONS (FD)')

%

%%%%%%INPUT-TARGET CROSSCORRELATION %%%%%%

% Data Row 1

crosscorrxt_zx1 = nncorr(zx1trn,zttrn,Ntrn-1,'biased');

sigilag95_x1 = -1 + find(abs(crosscorrxt_zx1(Ntrn:2*Ntrn-1))>=sigITthresh95)

%

plt = plt+1, figure(plt);

hold on

plot(0:Ntrn-1, -sigITthresh95*ones(1,Ntrn),'b--')

plot(0:Ntrn-1, zeros(1,Ntrn),'k')

plot(0:Ntrn-1, sigITthresh95*ones(1,Ntrn),'b--')

plot(0:Ntrn-1, crosscorrxt_zx1(Ntrn:2*Ntrn-1))

plot(sigilag95_x1,crosscorrxt_zx1(Ntrn+sigilag95_x1),'ro')

title('SIGNIFICANT INPUT(X1)-TARGET CROSSCORRELATIONS (ID)')

% Data Row 2

crosscorrxt_zx2 = nncorr(zx2trn,zttrn,Ntrn-1,'biased');

sigilag95_x2 = -1 + find(abs(crosscorrxt_zx2(Ntrn:2*Ntrn-1))>=sigITthresh95)

%

plt = plt+1, figure(plt);

hold on

plot(0:Ntrn-1, -sigITthresh95*ones(1,Ntrn),'b--')

plot(0:Ntrn-1, zeros(1,Ntrn),'k')

plot(0:Ntrn-1, sigITthresh95*ones(1,Ntrn),'b--')

plot(0:Ntrn-1, crosscorrxt_zx2(Ntrn:2*Ntrn-1))

plot(sigilag95_x2,crosscorrxt_zx2(Ntrn+sigilag95_x2),'ro')

title('SIGNIFICANT INPUT(X2)-TARGET CROSSCORRELATIONS (ID)')

My first question is, am I do it right in finding the ID and FD for multi-input? IF I have 10 input then I will have to do the similar things for 10 times? Are there a better way of doing this? Would greatly appreciate any comments!

Secondly, there are so many statistically significant for the data, how do I select what to include?There are a few hundreds of them (for e.g X1 and X2 has more than 500 significant)! I don't quite understand how to do, any advice would be much appreciated.

After finding the statistically significant, I then select the minimal subset of delay and use it to calculate the size of the hidden nodes. I remember Greg said that if more than 1 input, some equations need to be modified and I am pretty sure this is the reason why my results are poor but I am not sure which equation need to be modified, even if so, I may not understand why I need to change it since the numbers of data point for both inputs are the same and that the NID is fixed for both input, so are NFD, O, I etc. The subsequent codes as below:

ID = 1:26; % I have chosen 26 and 20 based on the smallest subset of ID/FD calculated above.

FD = 1:20;

NID = length(ID); % 26

NFD = length(FD); % 21

LDB = max([ID,FD]) % Length of the delay buffer = 26

Hub = (Ntrneq-O)/(NID*I+NFD*O+O+1) % 18.67

Hmax = floor(Hub); % 18

dH =2;

Hmin = 0;

Ntrials = 10;

rng('default')

trainFcn = 'trainlm';

j=0;

for h = Hmin:dH:Hmax

j=j+1

if h==0

neto = narxnet(ID,FD,[],'open',trainFcn);

Nw = (NID*I+NFD*O+1)*O;

else

neto = narxnet(ID,FD,h);

Nw = (NID*I+NFD*O+1)*h+(h+1)*O;

end

Ndof = Ntrneq-Nw

neto.divideFcn = 'divideblock';

neto.performFcn = 'mse'; % This is default since 'msereg' is obsolete.

[Xo Xoi Aoi To] = preparets(neto,X,{},T);

to = cell2mat(To);

MSE00o = var(to,1)

MSE00oa = var(to,0)

MSEgoal = 0.005*max(Ndof,0)*MSE00oa/Ntrneq

MinGrad = MSEgoal/100

neto.trainParam.goal = MSEgoal;

neto.trainParam.min_grad = MinGrad;

for i= 1:Ntrials

% Save state of RNG for duplication

s(i) = rng;

neto = configure(neto,Xo,To);

[neto tro Yo Eo Xof Aof ] = train(neto,Xo, To, Xoi, Aoi);

% Eo = gsubtract(To,Yo);

stopcrit{i,j} = tro.stop;

R2o(i,j) = 1 - mse(Eo)/MSE00o;

end

end

result = [ (Hmin:dH:Hmax); R2o ]

% stopcrit = stopcrit

elapsedtime = toc %

Thank you very much!

Answer by Greg Heath
on 16 Sep 2016

Accepted Answer

> My first question is, am I do it right in finding the ID and FD for multi-input?

In general yes. However there are corrections to be made.

>IF I have 10 input then I will have to do the similar things for 10 times?

You don't HAVE to do anything. However, since the MATLAB documentation is useless w.r.t. advice, I've come up with this idea which makes sense to me (and I hope, others).

>> Are there a better way of doing this? Would greatly appreciate any commments!

I have heard of partial correlations where correlations are chosen taking into account the correlations that have been chosen before them.

I have not pursued this approach due to time constraints AND the success I have had with OL solutions. (According to my recent post in the NEWSGROUP, I am still confused and frustrated without a straightforward CL solution).

>> Secondly, there are so many statistically significant for the data, how do I select what to include? There are a few hundreds of them (for e.g X1 and X2 has more than 500 significant)! I don't quite understand how to do, any advice would be much appreciated.

I try to use the smallest subset that will yield a sufficiently small NMSE. So far, it is just a trial and error process.

> After finding the statistically significant, I then select the minimal subset of delay and use it to calculate the size of the hidden nodes. I remember Greg said that if more than 1 input, some equations need to be modified and I am pretty sure this is the reason why my results are poor but I am not sure which equation need to be modified,

The ones that quickly come to mind are VAR and ZSCORE. Maybe NNCORR ...

> even if so, I may not understand why I need to change it since the numbers of data point for both inputs are the same and that the NID is any references for both input, so are NFD, O, I etc.

I agree. But don't forget, this is just something that I came up with because I could not find a relevant reference.

Later I became aware of partial correlations. However, I haven't had the time. If you have nothing to do the next few weeks you might want to pursue that (;>}.

More later,

Greg

Greg Heath
on 18 Sep 2016

Teck Kong Chong
on 19 Sep 2016

Hi Greg, Thank you so much for pointing out things that I don't quite understand, I really appreciated your effort in this! I will be posting my further questions in regards to this in the NEWSGROUP (if I have any). Thanks!

Teck

Greg Heath
on 19 Sep 2016

For unbiased prediction:

1. Use Ntrn for significant correlation thresholds

2. Nrep = 200 is suffient for sigthresh95

3. Since based on random inputs, use same threshold for auto and cross

4. Can include 0 lag in ID (BUT NOT FD !)

5 Replace s(i) with s(i,j)

6. Include numepochs(i,j)= tro.num_epochs;

7. Tabulate 100*NMSE (per cent error) instead of Rsq

HTH,

Greg

Sign in to comment.

Opportunities for recent engineering grads.

Apply Today
## 5 Comments

## Greg Heath (view profile)

Direct link to this comment:https://www.mathworks.com/matlabcentral/answers/302908-narxnet-with-multi-input#comment_391115

## Greg Heath (view profile)

Direct link to this comment:https://www.mathworks.com/matlabcentral/answers/302908-narxnet-with-multi-input#comment_391171

## Greg Heath (view profile)

Direct link to this comment:https://www.mathworks.com/matlabcentral/answers/302908-narxnet-with-multi-input#comment_391191

## Greg Heath (view profile)

Direct link to this comment:https://www.mathworks.com/matlabcentral/answers/302908-narxnet-with-multi-input#comment_391276

## Greg Heath (view profile)

Direct link to this comment:https://www.mathworks.com/matlabcentral/answers/302908-narxnet-with-multi-input#comment_391401

Sign in to comment.