I am currently developing the following method:
Automated OCR Text extraction from technical drawings
and after checking the literature, I would like to ask for feedback on my current process, especially on the segmentation of elements in the drawing.
My current process is as follows:
- Open the image [imread()]
- Convert it to a binary image [rgb2gray() -> imbinarize()]
- Segment/cluster the image to define Regions Of Interest (Title Block - usually lower right, x# Part projections - middle, frame - around the drawing) [I tried superpixels() but it seems to be insufficient]
- Get rid of the frame [no idea how to do this yet]
- Run OCR on the Title Block [ocr()] and look for specific text strings
- Run OCR on the X/Y/Z projections [ocr()] and look for dimensions etc.
- Store the data in a predefined xls [xlswrite()]
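The steps above can be sketched roughly as follows. The file name and the ROI coordinates are placeholders (in practice the ROIs should come from the segmentation step, which is exactly the part I am unsure about):

```matlab
% Minimal sketch of the current pipeline (placeholder file names / ROIs)
I  = imread('drawing.png');          % open the image
G  = rgb2gray(I);                    % drop color (assumes an RGB input)
BW = imbinarize(G);                  % threshold to a binary image

% Assumed title-block ROI [x y w h], lower-right corner -- placeholder
titleBlockROI = [2000 1400 800 400];

% OCR restricted to the title block, then look for specific strings
resultsTitle = ocr(BW, titleBlockROI);
disp(resultsTitle.Text)

% Store the extracted text in a spreadsheet
xlswrite('output.xls', {resultsTitle.Text});
```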
The task seems quite easy, since technical drawings should follow standards and the images are usually black and white. However, my approach looks insufficient, especially at detecting the shape of the title block and the more irregular parts.
Do you think it would be worth exploring functions like imfill() or regionfill(), or would it be better to create a heatmap to segment the image?
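For the frame-removal step, one idea I am considering (an assumption, not something I have validated): since the frame touches the image border, imclearborder() on the inverted binary image should remove it along with anything connected to it. The downside would be that dimension lines touching the frame get removed too. regionprops() could then propose ROI candidates for the remaining clusters:

```matlab
% Sketch: remove the frame, then find ROI candidates (placeholder file name)
BW  = imbinarize(rgb2gray(imread('drawing.png')));  % lines = 0 (black)
inv = ~BW;                       % invert so the drawing lines are foreground
noFrame = imclearborder(inv);    % drop components touching the image border

% Bounding boxes of the largest remaining blobs as ROI candidates
% (projections, title block); keeping up to 5 is an arbitrary choice
stats = regionprops(noFrame, 'BoundingBox', 'Area');
[~, idx] = sort([stats.Area], 'descend');
candidateROIs = vertcat(stats(idx(1:min(5, numel(idx)))).BoundingBox);
```

Would this be a reasonable direction, or is a heatmap-based segmentation more robust for irregular layouts?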