
ocrText class

Object for storing OCR results

Description

ocrText contains recognized text and metadata collected during optical character recognition (OCR). The ocr function returns the ocrText object. You can access the information contained in the object with the ocrText properties. You can also locate text that matches a specific pattern with the object's locateText method.

Code Generation Support
Compile-time constant input: No restrictions.
Supports MATLAB® Function block: No

Properties


Text - Text recognized by OCR (array of characters)

Text recognized by OCR, specified as an array of characters. The text includes white space and new line characters.

CharacterBoundingBoxes - Bounding box locations (M-by-4 matrix)

Bounding box locations, stored as an M-by-4 matrix. Each row of the matrix contains a four-element vector, [x y width height]. The [x y] elements correspond to the upper-left corner of the bounding box. The [width height] elements correspond to the size of the rectangular region in pixels. The bounding boxes enclose text found in an image using the ocr function. The width and height of bounding boxes that correspond to new line characters are set to zero. Character modifiers found in languages such as Hindi, Tamil, and Bengali are also contained in bounding boxes with zero width and height.
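As a rough illustration of the layout described above, the following sketch separates the zero-size boxes from the boxes that enclose visible text and draws only the latter. It assumes the businessCard.png image used in the examples on this page; the variable names (boxes, hasArea, visibleBoxes, Ichars) are illustrative only.

```matlab
% Sketch: skip zero-size character boxes before drawing.
businessCard = imread('businessCard.png');
ocrResults   = ocr(businessCard);

% M-by-4 matrix, one row per character: [x y width height].
boxes = ocrResults.CharacterBoundingBoxes;

% New line characters and character modifiers have zero width and height,
% so keep only the rows with a positive size.
hasArea      = boxes(:,3) > 0 & boxes(:,4) > 0;
visibleBoxes = boxes(hasArea, :);

% Draw the remaining boxes, if there are any.
if ~isempty(visibleBoxes)
    Ichars = insertShape(businessCard, 'Rectangle', visibleBoxes);
    figure; imshow(Ichars);
end
```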

CharacterConfidences - Character recognition confidence (array)

Character recognition confidence, specified as an array. The confidence values are in the range [0, 1]. A confidence value, set by the ocr function, should be interpreted as a probability. The ocr function sets the confidence values for spaces between words and for new line characters to NaN, because spaces and new line characters are not explicitly recognized during OCR. You can use the confidence values to find likely misclassified text in the image, for example by flagging or discarding characters with low confidence.
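A minimal sketch of using these values to flag low-confidence characters follows. The 0.5 cutoff is a hypothetical value chosen for illustration, not a recommended setting, and the sketch assumes that the confidence array and the rows of CharacterBoundingBoxes follow the same character order.

```matlab
% Sketch: outline characters whose confidence is below a chosen cutoff.
businessCard = imread('businessCard.png');
ocrResults   = ocr(businessCard);

threshold = 0.5;                        % hypothetical cutoff
conf      = ocrResults.CharacterConfidences;

% A comparison with NaN is false, so NaN entries are never flagged.
isLow    = conf < threshold;
lowBoxes = ocrResults.CharacterBoundingBoxes(isLow, :);

fprintf('%d of %d characters are below %.2f\n', ...
    nnz(isLow), numel(conf), threshold);

% Outline the flagged characters, if there are any.
if ~isempty(lowBoxes)
    Ilow = insertShape(businessCard, 'Rectangle', lowBoxes);
    figure; imshow(Ilow);
end
```

Because NaN entries never satisfy the comparison, they are left out of the flagged set without any extra handling.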

Words - Recognized words (cell array)

Recognized words, specified as a cell array.

WordBoundingBoxes - Bounding box location and size (M-by-4 matrix)

Bounding box location and size, stored as an M-by-4 matrix. Each row of the matrix contains a four-element vector, [x y width height], that specifies the upper-left corner and size of a rectangular region in pixels.

WordConfidences - Recognition confidence (array)

Recognition confidence, specified as an array. The confidence values are in the range [0, 1]. A confidence value, set by the ocr function, should be interpreted as a probability. The ocr function sets the confidence values for spaces between words and for new line characters to NaN, because spaces and new line characters are not explicitly recognized during OCR. You can use the confidence values to find likely misclassified text in the image, for example by flagging or discarding words with low confidence.
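The word-level values can be used the same way. The sketch below keeps only the words at or above a cutoff, together with their bounding boxes. The 0.8 cutoff is a hypothetical value, and the sketch assumes that Words, WordBoundingBoxes, and WordConfidences each hold one entry per recognized word, in the same order.

```matlab
% Sketch: keep only the words recognized with high confidence.
businessCard = imread('businessCard.png');
ocrResults   = ocr(businessCard);

threshold = 0.8;                        % hypothetical cutoff

% Logical index of the words at or above the cutoff.
keep = ocrResults.WordConfidences >= threshold;

confidentWords = ocrResults.Words(keep);
confidentBoxes = ocrResults.WordBoundingBoxes(keep, :);

% Show the retained words.
disp(confidentWords);
```

A single logical index applied to all three properties keeps the words, boxes, and confidences aligned after filtering.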

Methods

locateText - Locate string pattern

Examples


Find and Highlight Text in an Image

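% Read the image and run OCR on it.
businessCard = imread('businessCard.png');
ocrResults = ocr(businessCard);
% Locate every occurrence of 'MathWorks', ignoring case.
bboxes = locateText(ocrResults, 'MathWorks', 'IgnoreCase', true);
% Highlight the matches with filled rectangles and display the result.
Iocr = insertShape(businessCard, 'FilledRectangle', bboxes);
figure; imshow(Iocr);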

Find Text Using Regular Expressions

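% Read the image and run OCR on it.
businessCard = imread('businessCard.png');
ocrResults = ocr(businessCard);
% Locate text that starts with 'www' and ends with 'com'.
bboxes = locateText(ocrResults, 'www.*com', 'UseRegexp', true);
% Highlight the matches with filled rectangles and display the result.
img = insertShape(businessCard, 'FilledRectangle', bboxes);
figure; imshow(img);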
