Thread Subject:
Text extraction from images - Filling outer edges only!

Subject: Text extraction from images - Filling outer edges only!

From: Mariam

Date: 21 Sep, 2011 12:31:29

Message: 1 of 6

Hey everyone.
I have a Canny edge image of some text, with the edges thickened so that they are continuous. The edge image has some extra open edges from noise in the background, but after I use imfill with 'holes' only the text is filled. However, I would like to fill only the outer edges of the characters, leaving the enclosed interiors empty, before feeding them into an OCR engine: for example, an 'O' with a filled interior is often misread as an '@'. How do I fill only the outer edges of letters such as 'O', 'a' or 'p'?

Any help would be highly appreciated.
I will try to upload an image to make things clearer but since I am new here I don't really know how to do that yet :)

Thanks in advance!

Subject: Text extraction from images - Filling outer edges only!

From: ImageAnalyst

Date: 21 Sep, 2011 12:35:57

Message: 2 of 6

Where did you post your example images?

Subject: Text extraction from images - Filling outer edges only!

From: Mariam

Date: 21 Sep, 2011 13:08:27

Message: 3 of 6

I posted it here: http://imageshack.us/photo/my-images/833/theproblem.jpg/

I hope you can help me out :)

Subject: Text extraction from images - Filling outer edges only!

From: Mariam

Date: 21 Sep, 2011 13:14:28

Message: 4 of 6

To make things clearer too, this is my code:

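 % gray is the grayscale input image.
 se = strel('disk', 1);          % Assumed: the structuring element was not shown in the post
 BW = edge(gray, 'canny', 0.6);
 BW = imdilate(BW, se);          % Thicken the edges to close small gaps (image 1 in my example)
 BWfill = imfill(BW, 'holes');   % Fill everything enclosed by an edge (image 2)
 holes = BWfill & ~BW;           % Filled pixels that are not edge pixels (image 3)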

Subject: Text extraction from images - Filling outer edges only!

From: Abbas Cheddad

Date: 21 Sep, 2011 13:27:10

Message: 5 of 6

You probably need to pre-process the input letters before feeding them into the OCR.
Hopefully, you will get a better solution than what I am proposing.

Take the inverse of the first image using imcomplement and then use bwlabel to separate the objects (in this case you don't need to use filling).

Abbas
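
A minimal sketch of the imcomplement + bwlabel suggestion above, not a definitive implementation. The synthetic 'O' outline, the variable names, the choice of 4-connectivity, and the rule for picking the background are all assumptions made for illustration; they are not from the thread. It requires the Image Processing Toolbox.

```matlab
% Synthetic stand-in for the thickened edge image of an 'O' (assumed data):
% two concentric contours, each about 3 pixels thick.
[x, y] = meshgrid(1:101, 1:101);
r = hypot(x - 51, y - 51);
BW = (abs(r - 40) < 1.5) | (abs(r - 25) < 1.5);

BWinv = imcomplement(BW);       % edges become 0, everything else becomes 1
[L, n] = bwlabel(BWinv, 4);     % 4-connectivity: regions cannot join diagonally across an edge

% Assumption: the component touching the top-left corner is the page background.
bgLabel = L(1, 1);
regions = (L > 0) & (L ~= bgLabel);   % enclosed regions only (stroke and interior)
fprintf('%d regions found, %d after dropping the background\n', n, n - 1);
```

On this synthetic outline the labelling separates the background, the stroke between the two contours, and the enclosed interior. Labelling alone does not say which enclosed region is the stroke and which is the interior, so a further criterion is still needed to discard the interiors.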

Subject: Text extraction from images - Filling outer edges only!

From: Florin Neacsu

Date: 21 Sep, 2011 18:01:28

Message: 6 of 6

Hi,

You could just eliminate the inside part of your Os and As. Create a metric to estimate discs. You can do that by labeling the connected components, computing the area and perimeter of each, and then taking a weighted ratio of the two. A filled circle will have a ratio close to 1, while an empty circle (or any other shape) will be closer to 0.3 or less. It's up to you to decide on the threshold.
This way, you could eliminate the "inside" of "O"s, "d"s, "g"s and even "a"s.

HTH. Regards,
Florin
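
A minimal sketch of the disc metric described above, under stated assumptions. It takes the "weighted ratio" to be the circularity 4*pi*Area/Perimeter^2, which is one common choice and may not be exactly what was meant. The synthetic test image, the variable names, and the threshold of 0.8 are hypothetical. It requires the Image Processing Toolbox.

```matlab
% Synthetic stand-in for image 3 (holes = BWfill & ~BW) of an 'O' (assumed data):
% a ring (the stroke) and a separate disc (the enclosed interior).
[x, y] = meshgrid(1:101, 1:101);
r = hypot(x - 51, y - 51);
holes = (r > 27 & r < 38) | (r < 23);

CC = bwconncomp(holes);                          % label the connected components
stats = regionprops(CC, 'Area', 'Perimeter');

% Circularity: close to 1 for a filled disc, clearly lower for a ring.
circ = 4 * pi * [stats.Area] ./ ([stats.Perimeter] .^ 2);

thresh = 0.8;                                    % hypothetical threshold, tune on real data
keep = find(circ < thresh);                      % components that are NOT disc-like
strokes = ismember(labelmatrix(CC), keep);       % mask with the disc-like interiors removed
```

Here `strokes` keeps the ring and drops the disc. Real character interiors are rarely perfect discs, so the threshold would have to be chosen by looking at the circularity values of actual letters.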
