Setting up Inputs and Targets for neural networks

I need a little advice on setting up a neural network. I have looked through the MATLAB help files and have used simple networks for other problems. However, I have a spreadsheet with 500,000 rows and 20 columns. I have imported the data and started to set up a BP network. From this I have set up 20 values (1 for each column) that identify a class from the 500,000 rows. This class of data represents 240,000 rows of the total dataset. Each column of data should represent an input to the neural network, so the network should be something like 20-10-1 (20 inputs, 10 hidden neurons and 1 output). Ideally the output should assign a value of 12 if the row of data is identified as belonging to the class I'm trying to identify. I've set up some networks but I'm struggling to set up the inputs and outputs correctly. I either get 500,000 inputs in the network topology view or, if I transpose the rows and columns, I get 20 inputs but it errors with a conflict of outputs. I just need a little clarification on how to set up this network in MATLAB.
Thanks
RIKW

 Accepted Answer

For N pairs of I-dimensional inputs and corresponding O-dimensional outputs:
[ I N ] = size(input) % [ 20 5e5 ]
[ O N ] = size(target) % [ 1 5e5 ]
%target = 1 for desired class
%target = 0 otherwise
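
To make the orientation above concrete, here is a minimal sketch that uses synthetic data in place of the spreadsheet. The names data, isClass, x and t are hypothetical, and the random class labels are only a stand-in for however the class is actually identified. The point it illustrates is that each sample occupies one column, so the target has to be transposed along with the input. A mismatch in the number of columns between the two is one plausible cause of an output conflict error.

```matlab
% Synthetic stand-in for the imported spreadsheet (rows = samples, columns = variables).
rng(0);                            % reproducible random numbers
nRows   = 1000;                    % kept small here; the real data has 500,000 rows
data    = rand(nRows, 20);         % hypothetical 20-column data matrix
isClass = rand(nRows, 1) < 0.48;   % hypothetical logical class label, one per row

% One COLUMN per sample: transpose both the inputs and the targets.
x = data';                         % 20 x nRows input matrix
t = double(isClass');              % 1 x nRows target row: 1 for the class, 0 otherwise

[I, N]  = size(x);                 % I = 20, N = nRows
[O, Nt] = size(t);                 % O = 1,  Nt = nRows
assert(N == Nt, 'Inputs and targets must have the same number of columns.');

% With the Deep Learning Toolbox installed, training would then look like:
% net = patternnet(10);            % 10 hidden nodes; input and output sizes come from x and t
% net = train(net, x, t);
```
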
1. You can probably reduce both N and I.
2. With this much data you can afford to use equal-sized train, val and test subsets: Ntst ~ Nval ~ Ntrn ~ N/3
3. The number of training equations is Neq = Ntrn*O
4. You will probably have to find the number of hidden nodes, H and a practical value for Ntrn by trial and error.
5. For a net with I-H-O node topology, the number of estimated weights is Nw = (I+1)*H+(H+1)*O
6. To mitigate noise and measurement error, it is desirable to have an overdetermined system: Neq >> Nw which,
a. given Ntrn, yields an upper bound for H : H << Hub = (Neq-O)/(I+O+1)
b. given H, yields a lower bound for Ntrn : Ntrn >> Nw/O
c. for the MATLAB default of H = 10, Nw = 210+11 = 221 and Ntrn ~ 5000 should be more than sufficient.
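
The counting in items 3 to 6 can be checked with a few lines of arithmetic. This is only a sketch using the sizes from the question (I = 20, O = 1), the default H = 10 and the suggested Ntrn = 5000. The variable ratio is a name introduced here for illustration, and rounding Hub down with floor is a choice made for this sketch.

```matlab
I    = 20;      % input dimension
O    = 1;       % output dimension
H    = 10;      % hidden nodes (MATLAB default)
Ntrn = 5000;    % number of training samples

Neq = Ntrn*O;                          % number of training equations
Nw  = (I+1)*H + (H+1)*O;               % number of weights: 21*10 + 11*1
Hub = floor((Neq - O)/(I + O + 1));    % largest H for which Neq >= Nw still holds
ratio = Neq/Nw;                        % equations per weight; should be well above 1

fprintf('Neq = %d, Nw = %d, Hub = %d, Neq/Nw = %.1f\n', Neq, Nw, Hub, ratio);
```

Since the aim is Neq >> Nw rather than just Neq >= Nw, a practical H would sit well below Hub.
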
Hope this helps.
Greg
