Extraction of text from document
I want to extract the keywords from a document in order to find the term frequency. Can you help me by providing the code?
Answers (2)
Walter Roberson
on 17 Jan 2016
There is no universally defined set of keywords. You will need to define more clearly what needs to be extracted from the document, and you will need to describe what "document" means to you.
Image Analyst
on 17 Jan 2016
Have you done OCR yet, so that you have a list of strings in a cell array? If so, I think you could construct a histogram of word frequency using unique() and ismember(). I don't have a demo, so try it yourself.
You might also like allwords, which splits a long string of many words into a cell array of individual words that you can then use with unique() and ismember().
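For what it's worth, here is a minimal sketch of the unique() and ismember() idea, not a definitive implementation. It assumes the document text is already in a single character vector; the sample sentence and the variable names (txt, words, uniqueWords, counts, tf) are made up for illustration. It also assumes a "word" is just a run of letters, which may or may not match what you mean by a keyword. regexp() does the splitting here, so no File Exchange download is needed.

```matlab
% Hypothetical sample text; replace with the text read from your document.
txt = 'the cat sat on the mat and the dog sat too';

% Split into lowercase words. Assumption: a word is a run of letters a-z.
words = regexp(lower(txt), '[a-z]+', 'match');

% Distinct words, sorted alphabetically.
uniqueWords = unique(words);

% For every word, find its position in the list of distinct words.
% (The third output of unique() would give the same indices.)
[~, idx] = ismember(words, uniqueWords);

% Count how many times each distinct word occurs.
counts = accumarray(idx(:), 1);

% Term frequency: each count divided by the total number of words.
tf = counts / numel(words);

% Sort by descending count and print word, count and term frequency.
[counts, order] = sort(counts, 'descend');
uniqueWords = uniqueWords(order);
tf = tf(order);
for k = 1:numel(uniqueWords)
    fprintf('%-10s %3d  %.3f\n', uniqueWords{k}, counts(k), tf(k));
end
```

If only some of the words count as keywords, you would still need to filter uniqueWords against your own keyword list or remove the words you do not care about before counting.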