ocrText class

Object for storing OCR results

Description

The ocr function returns an ocrText object. The object contains the recognized text and the metadata collected during optical character recognition (OCR). You can access this information through the ocrText object properties. You can also locate text that matches a specific pattern with the object's locateText method.
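
As a minimal sketch of reading the object properties, assuming businessCard.png (the sample image used in the examples on this page) is on the MATLAB path:

```matlab
% Run OCR on the sample image; ocr returns an ocrText object.
businessCard = imread('businessCard.png');
ocrResults   = ocr(businessCard);

% Read the recognized text and the per-word results from the object.
recognizedText = ocrResults.Text;
words          = ocrResults.Words;
wordBoxes      = ocrResults.WordBoundingBoxes;   % one [x y width height] row per word

% Report how many words were found, then show the full text.
fprintf('Recognized %d words.\n', numel(words));
disp(recognizedText);
```

The variable names here are illustrative only; any of the properties listed below can be read the same way.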

Code Generation Support
Compile-time constant input: No restrictions.
Supports MATLAB® Function block: No
For details, see Code Generation Support, Usage Notes, and Limitations.

Properties

Text — Text recognized by OCR, array of characters

Text recognized by OCR, returned as an array of characters. The text includes whitespace and newline characters.

CharacterBoundingBoxes — Bounding box locations, M-by-4 matrix

Bounding box locations, stored as an M-by-4 matrix. Each row of the matrix contains a four-element vector, [x y width height]. The [x y] elements correspond to the upper-left corner of the bounding box. The [width height] elements correspond to the size of the rectangular region in pixels. The bounding boxes enclose text found in an image by the ocr function. The width and height of bounding boxes that correspond to newline characters are set to zero. Character modifiers found in languages such as Hindi, Tamil, and Bengali are also contained in bounding boxes with zero width and height.
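
The zero-size convention above matters when you draw or measure character boxes. The following sketch, assuming businessCard.png is on the MATLAB path, drops zero-size boxes before outlining the remaining characters. The variable names are illustrative.

```matlab
businessCard = imread('businessCard.png');
ocrResults   = ocr(businessCard);

% M-by-4 matrix; each row is [x y width height].
charBoxes = ocrResults.CharacterBoundingBoxes;

% Keep only boxes with nonzero width and height. Zero-size boxes mark
% newline characters and character modifiers.
hasArea      = charBoxes(:, 3) > 0 & charBoxes(:, 4) > 0;
visibleBoxes = charBoxes(hasArea, :);

% Outline each remaining character on the image.
Ichars = insertShape(businessCard, 'Rectangle', visibleBoxes);
figure; imshow(Ichars);
```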

CharacterConfidences — Character recognition confidence, array

Character recognition confidence, returned as an array. The confidence values are in the range [0, 1]. A confidence value, set by the ocr function, should be interpreted as a probability. The ocr function sets the confidence values for spaces between words and for newline characters to NaN, because spaces and newline characters are not explicitly recognized during OCR. You can use the confidence values to locate text in the image that is likely misclassified, and to eliminate characters with low confidence.
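
One way to use these values is to highlight characters whose confidence falls below a cutoff. This is a sketch only: the 0.5 threshold is an assumed value, not a recommended one, and businessCard.png is assumed to be on the MATLAB path.

```matlab
businessCard = imread('businessCard.png');
ocrResults   = ocr(businessCard);

threshold = 0.5;   % assumed cutoff; tune it for your own images

% NaN confidences compare as false, so entries set to NaN are skipped.
lowIdx   = ocrResults.CharacterConfidences < threshold;
lowBoxes = ocrResults.CharacterBoundingBoxes(lowIdx, :);
lowConf  = ocrResults.CharacterConfidences(lowIdx);

% Annotate each low-confidence character with its confidence value.
if any(lowIdx)
    Ilow = insertObjectAnnotation(businessCard, 'rectangle', lowBoxes, lowConf);
    figure; imshow(Ilow);
end
```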

Words — Recognized words, cell array

Recognized words, returned as a cell array.

WordBoundingBoxes — Bounding box location and size, M-by-4 matrix

Bounding box location and size, stored as an M-by-4 matrix. Each row of the matrix contains a four-element vector, [x y width height], that specifies the upper-left corner and size of a rectangular region in pixels.

WordConfidences — Recognition confidence, array

Recognition confidence, returned as an array. The confidence values are in the range [0, 1]. A confidence value, set by the ocr function, should be interpreted as a probability. The ocr function sets the confidence values for spaces between words and for newline characters to NaN, because spaces and newline characters are not explicitly recognized during OCR. You can use the confidence values to locate text in the image that is likely misclassified, and to eliminate words with low confidence.
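
The same filtering idea applies at the word level. The sketch below keeps only words at or above a cutoff. The 0.8 threshold is an assumed value chosen for illustration, and businessCard.png is assumed to be on the MATLAB path.

```matlab
businessCard = imread('businessCard.png');
ocrResults   = ocr(businessCard);

threshold = 0.8;   % assumed cutoff; tune it for your own images

% Select the words (and their boxes) recognized with high confidence.
keep          = ocrResults.WordConfidences >= threshold;
reliableWords = ocrResults.Words(keep);
reliableBoxes = ocrResults.WordBoundingBoxes(keep, :);

disp(reliableWords);
```

Because Words, WordBoundingBoxes, and WordConfidences describe the same words, one logical index can select matching entries from all three.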

Methods

locateText — Locate string pattern

Examples

Find and Highlight Text in an Image

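     % Read the image and run OCR on it.
     businessCard = imread('businessCard.png');
     ocrResults   = ocr(businessCard);
     % Find bounding boxes of text matching 'MathWorks', ignoring case.
     bboxes = locateText(ocrResults, 'MathWorks', 'IgnoreCase', true);
     % Draw filled rectangles over the matches and display the result.
     Iocr   = insertShape(businessCard, 'FilledRectangle', bboxes);
     figure; imshow(Iocr);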

Find Text Using Regular Expressions

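     % Read the image and run OCR on it.
     businessCard = imread('businessCard.png');
     ocrResults   = ocr(businessCard);
     % Treat the pattern as a regular expression: 'www', any characters, then 'com'.
     bboxes = locateText(ocrResults, 'www.*com', 'UseRegexp', true);
     % Draw filled rectangles over the matches and display the result.
     img    = insertShape(businessCard, 'FilledRectangle', bboxes);
     figure; imshow(img);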
