
Character extraction/segmentation in an image

I know this question has been asked plenty of times, but I am working on the IAM (Bern) handwriting dataset, which has a set of wonderful images.
For example, as shown below.
The process is to extract lines, then words, and finally characters. So far I have looked at this example,
but it does not generalize to new images. Moreover, for the words shown in the image below, there is not much space between the characters, so it would be nice to know if there are methods that can handle this.
There is also this method in Python, but again, it suffers from oversegmentation.
Note that I am currently relying on regionprops and manually cropping the words. Although the results are good, I would like to know if there are any other existing methods that handle cursive characters.
Also, I am aware that I can use Tesseract or more powerful deep learning approaches such as CRNN, but unfortunately in my environment we prefer traditional methods.
  1 Comment
Image Analyst on 28 Aug 2021
"there is not much space, so it would be nice to know if there are some methods that can handle this." <== Try the padarray() function to add space around a matrix.
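As a minimal sketch of the padarray() suggestion (Image Processing Toolbox), the snippet below pads a small synthetic crop with a white border. The variable names and the 3-pixel margin are assumptions made for illustration, not values taken from the IAM images.

```matlab
% Hypothetical stand-in for a tightly cropped word: a 5-by-8 block of dark ink.
wordImage = zeros(5, 8, 'uint8');

% Add 3 rows/columns of white background (255) on every side.
% The margin of 3 pixels is an arbitrary choice for this sketch.
margin = [3 3];
padded = padarray(wordImage, margin, 255, 'both');

disp(size(padded))   % prints 11 14
```

Padding with 255 assumes dark ink on a light background; for a binary mask you would pad with 0 (false) instead.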
For the word cropping you asked about, I would use imclose() with a mostly horizontal structuring element to connect the letters, then use regionprops() to find the bounding box of each word, then use imcrop():
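mask = grayImage < 128;  % Dark ink becomes foreground (true).
se = true(2, 7);  % Mostly horizontal structuring element.
mask2 = imclose(mask, se);  % Bridge the small gaps between letters.
props = regionprops(mask2, 'BoundingBox');
% Crop each word and keep every crop, not just the last one.
subImages = cell(1, numel(props));
for k = 1 : numel(props)
    subImages{k} = imcrop(grayImage, props(k).BoundingBox);
end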
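
To see the effect of the closing step without the dataset at hand, here is a self-contained sketch on a synthetic image. The image size, the block positions, and all variable names are made up for this example: five dark blocks act as "letters", grouped into two "words" by a wide gap. It needs the Image Processing Toolbox and is only an illustration of the idea above, not a tuned solution for cursive text.

```matlab
% Synthetic page: white background with five dark "letters" (assumed layout).
grayImage = 255 * ones(40, 120, 'uint8');
grayImage(15:25, 10:15) = 0;   % word 1, letter 1
grayImage(15:25, 19:24) = 0;   % word 1, letter 2 (3-pixel gap)
grayImage(15:25, 28:33) = 0;   % word 1, letter 3 (3-pixel gap)
grayImage(15:25, 60:65) = 0;   % word 2, letter 1 (26-pixel gap from word 1)
grayImage(15:25, 69:74) = 0;   % word 2, letter 2 (3-pixel gap)

mask = grayImage < 128;        % ink is foreground

% Without closing, every letter is its own connected component.
letterProps = regionprops(mask, 'BoundingBox');

% Closing with a 2-by-7 element fills horizontal gaps narrower than 7 pixels,
% so letters of the same word merge while the wide gap between words survives.
se = true(2, 7);
mask2 = imclose(mask, se);
wordProps = regionprops(mask2, 'BoundingBox');

% Crop each word from the original grayscale image.
wordImages = cell(1, numel(wordProps));
for k = 1 : numel(wordProps)
    wordImages{k} = imcrop(grayImage, wordProps(k).BoundingBox);
end

fprintf('%d letters, %d words\n', numel(letterProps), numel(wordProps));
```

The width of the structuring element is the knob to tune: it has to be wider than the largest gap between letters but narrower than the smallest gap between words, which is why a fixed 2-by-7 element may not carry over to every handwriting style.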


Answers (0)
