Selection of input data set for artificial neural network

Hi. I am new to using ANNs. My objective is to use one as a surrogate for the actual model. I am confused about two things: 1. How should the input point set for the ANN be selected? 2. How should the number of input samples needed to train the ANN be decided for a given number of input variables? I would really appreciate any help in this regard. Thanks!

Accepted Answer

Greg Heath on 13 Jan 2015
Edited: Greg Heath on 13 Jan 2015
The general regression scenario is that you want a deterministic multivariate function that yields an O-dimensional output vector given an I-dimensional input vector. Although the function is deterministic, it works well for nondeterministic inputs and outputs with moderate to high signal-to-noise ratios.
Typically, the O-dimensional output function is approximated by a weighted sum of H tanh functions whose arguments are linear combinations of the I-dimensional input vector components. The node topology of the net is I-H-O.
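As a rough illustration of the I-H-O topology described above, the following sketch evaluates such a net for a single input vector. The layer sizes and the random weight values are placeholders chosen only for this example, bias terms are assumed on both layers, and the sketch leaves out anything a toolbox function might add (such as input/output normalization).

```matlab
% Sketch of an I-H-O network output for one input vector.
% Sizes and weights below are hypothetical placeholders, not trained values.
I = 3; H = 4; O = 2;

x  = rand(I, 1);      % one I-dimensional input (column vector)
W1 = randn(H, I);     % input-to-hidden weights
b1 = randn(H, 1);     % hidden-layer bias weights (assumed present)
W2 = randn(O, H);     % hidden-to-output weights
b2 = randn(O, 1);     % output-layer bias weights (assumed present)

h = tanh(W1*x + b1);  % H tanh functions of linear combinations of the inputs
y = W2*h + b2;        % O-dimensional output: weighted sum of the H tanh terms

disp(size(y))         % O-by-1
```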
The weights are estimated iteratively from N pairs of input/output-target examples:
[ I N ] = size(input)
[ O N ] = size(target)
The number of scalar equations obtained from the examples is
Neq = N*O
The number of unknown weights is
Nw = (I+1)*H+(H+1)*O
The 1s account for the bias weights, which are connected to a constant unit source.
To mitigate noise, interference and measurement errors, it is desired that
Neq >> Nw
and the approximation is obtained by minimizing the sum of the squared differences between the approximation output and the design target.
Since Nw = (I+O+1)*H + O, the strict inequality Neq > Nw holds exactly when H <= Hub (upper bound), and the requirement Neq >> Nw corresponds to H << Hub, where
Hub = -1 + ceil( (N*O-O) / (I+O+1))
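To make the counting concrete, here is a small worked example of the formulas above. The values of I, O, N and H are made up for illustration only; substitute the sizes of your own data.

```matlab
% Worked example of Neq, Nw and Hub for hypothetical sizes.
I = 5;      % number of input variables (assumed)
O = 1;      % number of outputs (assumed)
N = 200;    % number of input/target examples (assumed)
H = 10;     % candidate number of hidden nodes (assumed)

Neq = N*O;                                % 200 scalar equations
Nw  = (I+1)*H + (H+1)*O;                  % 6*10 + 11*1 = 71 unknown weights
Hub = -1 + ceil((N*O - O)/(I + O + 1));   % -1 + ceil(199/7) = 28

fprintf('Neq = %d, Nw = %d, Hub = %d\n', Neq, Nw, Hub);

% Hub is the largest H for which there are still more equations than weights.
NwOf = @(h) (I+1)*h + (h+1)*O;
fprintf('Nw at Hub = %d, Nw at Hub+1 = %d\n', NwOf(Hub), NwOf(Hub+1));
```

With these numbers H = 10 sits well below Hub = 28, so there are roughly 2.8 equations per unknown weight. Turning the relation around, for a fixed I, O and H you can read off how large N has to be before Neq comfortably exceeds Nw.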
There are a myriad of details and exceptions that you can worry about after you have tried the examples in the documentation
help fitnet
doc fitnet
for regression and
help patternnet
doc patternnet
for classification.
To practice with more data try
help nndatasets
doc nndatasets
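A minimal sketch of such a first experiment is shown below. It assumes the Deep Learning Toolbox (formerly Neural Network Toolbox) is installed, uses simplefit_dataset as one of the example data sets, and picks H = 10 purely as a starting value; none of these choices is a recommendation.

```matlab
% Minimal fitnet regression sketch (requires Deep Learning Toolbox).
[x, t] = simplefit_dataset;          % example data: inputs x, targets t
[I, N] = size(x);                    % I input variables, N examples
[O, ~] = size(t);                    % O outputs

Hub = -1 + ceil((N*O - O)/(I + O + 1));   % upper bound on hidden nodes
H   = 10;                                 % assumed starting value; keep H well below Hub

net = fitnet(H);                     % I-H-O regression net with tanh hidden nodes
net.trainParam.showWindow = false;   % suppress the training GUI
[net, tr] = train(net, x, t);        % iterative weight estimation

y      = net(x);                     % network output for all inputs
mseAll = perform(net, y, t);         % mean squared error over all examples

fprintf('N = %d, Hub = %d, H = %d, MSE = %g\n', N, Hub, H, mseAll);
```

Note that train holds out part of the data for validation and testing by default, so fewer than N examples are actually used to fit the weights.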
Hope this helps.
Greg
  1 Comment
tanmoy on 13 Jan 2015
Edited: tanmoy on 13 Jan 2015
Thanks for the reply! On what basis should the input sample set be determined? Say we have n input parameters and we take p samples to train the ANN. How is this (n x p) input data set constructed? For example, I know that in some surrogate modeling techniques, such as kriging, it can be determined using Latin hypercube sampling. What sampling method should I use to construct the (n x p) matrix of inputs for the ANN?
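For reference, an (n x p) Latin hypercube design of the kind mentioned in this comment can be sketched by hand as follows. This only illustrates the sampling scheme itself, it is not a statement from the answer above that it is the right choice for an ANN, and the values of n, p and the parameter bounds lb and ub are hypothetical. If the Statistics and Machine Learning Toolbox is available, lhsdesign provides a ready-made alternative.

```matlab
% Hand-rolled Latin hypercube sample: n parameters (rows), p samples (columns).
n  = 3;                  % number of input parameters (assumed)
p  = 10;                 % number of samples (assumed)
lb = [0; -1; 10];        % hypothetical lower bounds, one per parameter
ub = [1;  1; 20];        % hypothetical upper bounds, one per parameter

X = zeros(n, p);
for k = 1:n
    strata = randperm(p) - 1;           % strata 0..p-1 in random order
    u = (strata + rand(1, p)) / p;      % one random point inside each stratum of [0,1]
    X(k, :) = lb(k) + (ub(k) - lb(k)) * u;   % rescale to the parameter range
end

disp(size(X))            % n-by-p, the same layout as [I N] = size(input)
```

Each row of X then contains exactly one point in each of the p equal-width intervals of that parameter's range.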
