How to train a neural network with an input data set that comprises numeric values as well as nominal variables (such as the base fluid used, which could be 'Water', 'Oil' or 'Slickwater')?

Kushagra Kakar on 18 Feb 2014
Commented: Greg Heath on 21 Feb 2014
Hi, I am trying to model a production data set using a neural network. I am using a static 2-layer feedforward network (10 neurons in the hidden layer, 1 output neuron, dividerand, trainlm, log-sigmoid transfer function for the hidden layer and linear for the output).
My input data set comprises 125 production wells. Each well has 9 variables: 6 are numeric and the other 3 are nominal (for example, the orientation of the well, which could be north, south, east or west, or the base fluid used, which could be water, oil or slickwater).
When I load this data set into MATLAB to train the network, I get an error saying that the input data should be either 'numeric' or 'logical'.
So, firstly, is there any way I could train my network using a combination of numeric and nominal variables? Secondly, if not, is there a rational way to code the nominal variables into corresponding numeric values?
Greg Heath on 19 Feb 2014
Well, what were you putting in? Example?
Kushagra Kakar on 20 Feb 2014
I imported the whole data set, including the non-numeric nominal variables, as a cell array and tried to train the network. It gave an error that Input{7,1} should be either numeric or logical. When I checked the data set, that particular cell held the string 'Water'. I hope this helps.
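A quick way to see which cells trigger this kind of error is to test every cell for a numeric or logical type. The following is only a minimal sketch; the cell array `data` and its values are made up for illustration and are not the poster's actual data set.

```matlab
% Hypothetical imported data: a 3-by-2 cell array mixing numbers and strings.
data = {1200, 'Water'; 950, 'Oil'; 1430, 'Slickwater'};

% True for every cell that is numeric or logical; the rest need encoding.
isNum = cellfun(@(c) isnumeric(c) || islogical(c), data);

% Indices of columns that contain at least one non-numeric cell.
nominalCols = find(any(~isNum, 1));
```

Here `nominalCols` identifies the columns that would have to be converted to numbers before the data can be passed to the network.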

Ahmed on 19 Feb 2014
Try to code your nominal variables as dummy binary variables, then input that into your neural network.
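nomvar = nominal(randi(3,10,1));  % example nominal variable: 10 observations, up to 3 levels
dumvar = dummyvar(nomvar)         % binary indicator matrix, one column per level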
Kushagra Kakar on 20 Feb 2014
Could you please elaborate a bit more on how I would use 'dummyvar' to replace the nominal variables in the imported data set? One more thing: how should I import my data set, as a matrix, a cell array, etc.?
Thanks, KK
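One possible way to go from a cell array of strings to dummy variables, and then to a single numeric input matrix, is sketched below. It uses only base MATLAB functions (`unique`, `sub2ind`); the variable names and the numeric values are hypothetical. With the Statistics Toolbox, `dummyvar(idx)` should give the same indicator matrix.

```matlab
% Hypothetical example: 5 wells, 2 numeric columns and 1 nominal column.
numericData = [1200 0.8; 950 1.1; 1430 0.7; 1010 0.9; 1300 1.0];  % made-up values
baseFluid   = {'Water'; 'Oil'; 'Slickwater'; 'Water'; 'Oil'};     % cell array of strings

% Map each string to a level index (unique sorts the levels alphabetically).
[levels, ~, idx] = unique(baseFluid);
idx = idx(:);

% Binary indicator (dummy) matrix: one row per well, one column per level.
dummyFluid = zeros(numel(idx), numel(levels));
dummyFluid(sub2ind(size(dummyFluid), (1:numel(idx))', idx)) = 1;

% Combine numeric and dummy columns, then transpose so that each column
% is one sample and each row is one variable.
X = [numericData, dummyFluid]';
```

The transpose at the end reflects the toolbox convention of one column per sample, so `X` would be passed to `train` as a plain numeric matrix rather than a cell array of mixed types.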

Iain on 19 Feb 2014
You could make your non-numeric values into numeric ones by using enumerations.
E.g.
Oil = 1;
Water = 2;
Slickwater = 3;
Thebloodoftheinnocent = 4;
Fluid = Oil;
Also, you could replace north/south etc. with compass headings: 0 = north, pi = south, etc.
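The compass-heading idea could be sketched with a simple lookup, as below. Only 0 = north and pi = south come from the suggestion above; the values for east and west assume headings measured clockwise from north, and the variable names are hypothetical.

```matlab
% Hypothetical orientation column for four wells.
orientation = {'North'; 'East'; 'South'; 'West'};

% Lookup from direction name to heading in radians
% (0 = north, pi = south; east and west assume a clockwise convention).
names    = {'North', 'East', 'South', 'West'};
headings = [0, pi/2, pi, 3*pi/2];

% Find each label in the lookup and fail loudly on anything unexpected.
[found, loc] = ismember(orientation, names);
assert(all(found), 'Unknown orientation label');

% Numeric heading for each well, as a column vector.
headingRad = reshape(headings(loc), [], 1);
```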
Kushagra Kakar on 20 Feb 2014
Hi Iain, if I use these enumerations, wouldn't I be implying that the difference in production between water and oil (2-1) is the same as the difference in production between slickwater and water (3-2)? That is not true, right?
Greg Heath on 21 Feb 2014
No. You only assume a numerical order.
Training will find the correct weights.
For example, if you used 1, 2, 3, 10 you should get the same answer, because during training the net will learn to decrease the input weights for variable four by a factor of 2.5.