How to detect text regions in a document image?

Leung on 14 Nov 2014
Commented: Leung on 17 Nov 2014
I have a document image, such as a scanned newspaper or magazine page. I want to remove all or most of the text and keep the images in the document. Does anyone know how to detect the text regions in the document? Below is an example. Thanks in advance!

Answers (2)

Dima Lisin on 15 Nov 2014
Try using the ocr function in the Computer Vision System Toolbox.
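As a rough sketch of how that could look for this task: the ocr function returns an ocrText object whose WordBoundingBoxes property gives the location of each recognized word, so the text itself never needs to be read. The snippet below paints those boxes white. It assumes a uint8 image, uses businessCard.png (a sample image shipped with the toolbox) as a stand-in for your own scan, and the 0.5 confidence cutoff is an arbitrary choice you would need to tune.

```matlab
% Sketch: locate words with ocr and paint their bounding boxes white.
% Assumes the Computer Vision System Toolbox and a uint8 image.
% 'businessCard.png' is a toolbox sample image standing in for your scan.
I = imread('businessCard.png');

results = ocr(I);                     % ocrText object
boxes = results.WordBoundingBoxes;    % M-by-4, each row is [x y width height]
conf = results.WordConfidences;       % M-by-1, values in [0, 1]
boxes = boxes(conf > 0.5, :);         % 0.5 is an arbitrary cutoff, tune as needed

cleaned = I;
for k = 1:size(boxes, 1)
    b = round(boxes(k, :));
    % Clamp the box to the image so indexing never goes out of bounds.
    rows = max(b(2), 1) : min(b(2) + b(4) - 1, size(I, 1));
    cols = max(b(1), 1) : min(b(1) + b(3) - 1, size(I, 2));
    cleaned(rows, cols, :) = 255;     % white; works for grayscale and RGB
end
% View the result with imshow(cleaned) or save it with imwrite.
```

Pictures in the page may still produce a few spurious "words", which is why the confidence filter is there. Expect to adjust it for newspaper scans.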
  2 Comments
Leung on 16 Nov 2014
It needs the Computer Vision System Toolbox, and it only supports English. I want to detect text regions in languages other than English.
Any other ideas? Thanks!
Dima Lisin on 16 Nov 2014
Actually, ocr supports many languages. But it does require the Computer Vision System Toolbox.
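For illustration, the language is selected with the 'Language' name-value argument of ocr. The sketch below uses 'Japanese' only as an example value; which languages are available depends on your release and on any extra language data you have installed, so treat the specific value as an assumption and check the ocr documentation for your version. businessCard.png is again just a toolbox sample image standing in for your own scan.

```matlab
% Sketch: run ocr with a non-default language and keep only the word locations.
% 'Japanese' is an example value; availability depends on your MATLAB release.
I = imread('businessCard.png');       % stand-in image, replace with your scan

results = ocr(I, 'Language', 'Japanese');
boxes = results.WordBoundingBoxes;    % M-by-4, each row is [x y width height]
```

Since only the locations are needed, the recognized text in results.Text can simply be ignored.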



Image Analyst on 16 Nov 2014
If you don't want to use the Computer Vision System Toolbox, see this: http://www.visionbib.com/bibliography/contentschar.html#OCR,%20Document%20Analysis%20and%20Character%20Recognition%20Systems for a bunch of algorithms that can handle it in many languages. You'd have to write the code for those papers yourself; we don't have code for any of them.
  1 Comment
Leung on 17 Nov 2014
Thanks! I will take a look at it. Are there other simple methods to do this? I don't need to know the content of the text; I only need to know the location of the text regions. :-)

