How to create a dataset as input to a neural network for a character recognition system?

I'm breaking my head on neural networks. I've got the general idea, but I am unable to work it out with the required tools in MATLAB (nntool, nprtool). For training the network, the input has to be 250 images of each Kannada character, and the target vector for each character should indicate which character it is... maybe 1 for the first character, 2 for the second character, and so on...
Does the following make any sense? :) Should the input have a size of 250x75x75, where each image is of size 75x75 and there are 250 samples of each character? How do I do this? How should the target vector look? Maybe the first place of the target vector should have 1 and the rest all zeros for the first character. Similarly, let it be a column matrix having 1 in the second place and the rest zeros for the second character. Is this right? I have no clue. Also, am I right in saying that the input and target vectors need to be MATLAB variables (with .mat extension) to be fed into the network? Otherwise I will be unable to load the workspace. So, do I have to write another script separately to read the input and target variables?
A basic question - For training the network, the input needs to be normalized right? Does this mean that each image has to go through the previous steps of binarization and resizing before being fed into the neural network for training?
I don't know if I'm making things complex but I am unable to move forward. Please give me a hint :( I've exhausted all resources or maybe I'm not searching efficiently...
Thank You.

Accepted Answer

Greg Heath on 7 Jul 2012
Edited: Walter Roberson on 9 Jul 2012
>How to create a dataset as input to a neural network for a character
>recognition system?
>Asked by Monisha on 4 Jul 2012 at 15:11
>I'm breaking my head on neural networks. I've got the general idea but I
>am unable to work it out with the required tools in matlab (nntool,nprtool).
>The thing is, for training the network, the input has to be 250 images of
>each kannada character and the target vector for each character should be
>such that it is able to indicate which character it is... maybe 1-first
>character, 2-second character and so on...
>Does the following make any sense? :) Should the input have a size of
>250x75x75? where each image is of size 75x75 and there are 250 samples of
>each character. How do I do this?
1. Check the literature to see if there is a common method of feature extraction for Kannada characters that substantially reduces the image size (say 25 X 25 or smaller), or that represents each image by a column vector of extracted features.
2. Each reduced image should be converted to a column vector of length 625 = 25*25 using the (:) operator.
3. The resulting size of the input matrix is 625 X 250 for each character (625 X 250*c for c characters), which can be partitioned into training, validation and test subsets.
4. The corresponding character indices (1 to c) can be converted to columns of the unit matrix using the function ind2vec.
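Steps 2 to 4 could be sketched roughly as follows. This is only an illustration under assumptions: the class count c = 3 is arbitrary, the images are random placeholders standing in for real preprocessed characters, and ind2vec requires the Neural Network Toolbox.

```matlab
% Assumed sizes for illustration only.
c         = 3;          % number of character classes (hypothetical)
nPerClass = 250;        % samples per character
imgSize   = [25 25];    % reduced image size
N         = c * nPerClass;

x      = zeros(prod(imgSize), N);   % 625 x N input matrix, one image per column
labels = zeros(1, N);               % class index (1..c) of each column

col = 0;
for k = 1:c
    for s = 1:nPerClass
        img = rand(imgSize);        % placeholder for a real, reduced image
        col = col + 1;
        x(:, col)   = img(:);       % (:) stacks the image into a 625 x 1 column
        labels(col) = k;
    end
end

% Each column of t is a column of the c x c unit matrix.
t = full(ind2vec(labels));          % c x N target matrix
```

With these sizes x is 625 X 750 and t is 3 X 750, and column j of t has its single 1 in row labels(j).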
>How should the target vector look? Maybe the first place of the target
>vector should have 1 and the rest all zeros for the first character.
>Similarly, let it be a column matrix having 1 again in the second place
>and the rest zeros for the second character. Is this right? I have no
>clue.
Correct
> Also, am I right in saying that the input and target vectors need to be
>MATLAB variables (with .mat extension) for it to be fed into the
>network accordingly?
No. Text and Microsoft Excel files can also be read by MATLAB.
>otherwise I will be unable to load the workspace. So, do I have to write
>another script separately to read the input and target variables?
Not necessarily.
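As a small illustration of the text-file route, the sketch below writes a matrix to a plain text file and reads it back with load. The matrix and the temporary file name are stand-ins, not part of the original problem. For Excel files, xlsread is one function that could be used instead.

```matlab
x = magic(4);                            % stand-in for a real input matrix

fname = [tempname '.txt'];               % temporary file name (illustrative)
save(fname, 'x', '-ascii', '-double');   % write as plain text, 16-digit precision

xRead = load(fname);                     % a non-.mat file is read as a numeric matrix
delete(fname);                           % clean up the temporary file
```

The same load call works for a text file produced outside MATLAB, as long as every row has the same number of numeric entries.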
>A basic question - For training the network, the input needs to be
>normalized right? Does this mean that each image has to go through the
>previous steps of binarization and resizing before being fed into the
>neural network for training?
Highly recommended.
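A minimal sketch of that preprocessing for a single image, assuming the Image Processing Toolbox is available. The random image stands in for one read with imread, and the 25 X 25 target size is only an example. In newer releases imbinarize is the suggested replacement for im2bw.

```matlab
% Stand-in grayscale image; in practice this would come from imread.
img = uint8(255 * rand(120, 90));

% Binarize with Otsu's threshold.
bw = im2bw(img, graythresh(img));

% Resize to the common input size. Interpolation can produce fractional
% values, so threshold again to get a clean 0/1 image.
small = imresize(double(bw), [25 25]);
small = small > 0.5;

% One 625 x 1 column of the network input matrix.
xcol = double(small(:));
```

Every image, for training and later for recognition, should go through the same steps so that all columns have the same length and the same value range.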
>I don't know if I'm making things complex but I am unable to move forward.
>Please give me a hint :( I've exhausted all resources or maybe I'm not
>searching efficiently...
Did the exhausted resources include a Google search of kannada character recognition?
Hope this helps.
Greg
  4 Comments
Monisha on 9 Jul 2012
I'm sorry I didn't quite understand your solution. The thing is I am training each kannada character at one time with a target vector so how does my input have a size of [64 250*c] where c is the no. of characters? I'd appreciate the help. Thanks!
Greg Heath on 1 Aug 2012
You have to train with all characters at once. By default the training algorithm will randomly choose 70% for training and 15% each for validation and testing.
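A rough sketch of training on all characters at once, assuming the Neural Network Toolbox. The data are random placeholders, and the class count (3), the hidden layer size (10) and the epoch limit (20) are arbitrary choices for illustration. The division ratios are written out explicitly, although they match the defaults.

```matlab
% Placeholder data: 3 classes, 250 samples each, 625 features per sample.
c         = 3;
nPerClass = 250;
x      = rand(625, c * nPerClass);
labels = kron(1:c, ones(1, nPerClass));   % [1 ... 1 2 ... 2 3 ... 3]
t      = full(ind2vec(labels));           % c x 750 target matrix

net = patternnet(10);                     % 10 hidden nodes (arbitrary)

% Random division into training, validation and test subsets.
net.divideFcn              = 'dividerand';
net.divideParam.trainRatio = 0.70;
net.divideParam.valRatio   = 0.15;
net.divideParam.testRatio  = 0.15;

net.trainParam.epochs     = 20;           % kept short for the sketch
net.trainParam.showWindow = false;        % no training GUI

[net, tr] = train(net, x, t);

y = net(x);                               % c x 750 network outputs
[~, predicted] = max(y, [], 1);           % predicted class index per column
```

The indices of the three subsets are returned in tr.trainInd, tr.valInd and tr.testInd, so the error on each subset can be computed separately.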
