How to extract data from a PDF file in MATLAB?
I am looking for a way to extract data from a PDF file. For example, the PDF contains the sentence "Account# 29", and I want to extract 29 from it. If this is possible with the fopen() function, please share how. I have tried pdftotext but did not succeed. I also tried fopen(), but that failed too. If it can be done with fopen(), that would be better. Please share your experience with me. Thanks.
6 Comments
fopen() will not automagically read PDF data.
As a philosophical aside, PDFs are not meant to be edited. They are intended to be read-only. I am afraid whatever method you come up with will be something of a kludge.
Technically, it should be possible to make sense of a PDF file with fread. In practice, you would need to interpret the PDF format, and that is a tall order: a PDF is not plain text but a binary page-description format in which the text is typically stored in compressed streams, and a scanned document contains only page images. That doesn't stop tools like pdftotext from attempting it, but you'll probably get mixed results, depending on what's in your file.
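For reference, the pdftotext route mentioned above can be sketched roughly as follows. This is only a sketch under assumptions: pdftotext (from Xpdf or Poppler) must be installed and on the system path, 'statement.pdf' is a hypothetical file name, and the PDF must contain real text rather than a scanned image. When the file is absent, a stand-in string is used so the pattern can still be tried.

```matlab
% Sketch: convert the PDF to text with pdftotext, then pull out the number.
% Assumptions: pdftotext is on the system path; 'statement.pdf' is a
% hypothetical file name used only for illustration.
pdfFile = 'statement.pdf';

if exist(pdfFile, 'file') == 2
    % The trailing "-" asks pdftotext to write the text to standard output.
    [status, txt] = system(sprintf('pdftotext "%s" -', pdfFile));
    if status ~= 0
        error('pdftotext failed: %s', txt);
    end
else
    % Stand-in text so the pattern below can be tried without a PDF.
    txt = sprintf('Statement\nAccount# 29\nBalance: 100');
end

% Capture the digits that follow "Account#", allowing optional whitespace.
tok = regexp(txt, 'Account#\s*(\d+)', 'tokens', 'once');
if isempty(tok)
    accountNumber = NaN;   % pattern not found in the extracted text
else
    accountNumber = str2double(tok{1});
end
disp(accountNumber)
```

Whether the pattern matches depends on how pdftotext lays out the text for a given file, so the regular expression may need adjusting.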
azizullah khan
on 19 Sep 2014
José-Luis
on 19 Sep 2014
No, I would be very surprised if you could do it like that. Opening a binary file, which is what fopen() would get you, is a long way from actually transforming that file into text.
There is no easy way to extract text from PDFs.
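To illustrate the gap between opening the file and getting text out of it, here is a small sketch of what fopen() and fread() actually return. It writes a tiny stand-in file that merely starts like a PDF (it is not a valid PDF; the bytes are made up for illustration) and reads it back as raw bytes.

```matlab
% Write a tiny stand-in file that begins like a PDF. This is NOT a valid
% PDF; the "stream" bytes are arbitrary values chosen for illustration.
fname = [tempname '.pdf'];
fid = fopen(fname, 'w');
fwrite(fid, uint8(['%PDF-1.4' 10 'stream' 10 0 255 17 200 10 'endstream']));
fclose(fid);

% Read it back the way fopen/fread reads any binary file: as raw bytes.
fid = fopen(fname, 'r');
bytes = fread(fid, Inf, 'uint8=>uint8')';
fclose(fid);
delete(fname);

% The header is recognisable, but that is about all you get for free.
header = char(bytes(1:8));
isPdf = strncmp(header, '%PDF-', 5);
fprintf('Header: %s, looks like a PDF: %d\n', header, isPdf);

% Everything else is just numbers until something decodes the format.
disp(bytes(9:end));
```

In a real PDF the page text usually sits inside compressed streams, so searching the raw bytes for a string such as "Account#" will generally find nothing.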
azizullah khan
on 19 Sep 2014
José-Luis
on 19 Sep 2014
Yes, I have seen it and it doesn't work. In principle, it might work for trivial purposes like changing the font type, but I have no idea what kind of data you are trying to extract.
Writing a robust algorithm is a tall order.
azizullah khan
on 20 Sep 2014
Edited: azizullah khan
on 20 Sep 2014
Accepted Answer
More Answers (2)
mizuki
on 25 Apr 2018
1 vote
Walter Roberson
on 25 May 2015
0 votes
Have you looked at http://www.mathworks.com/matlabcentral/answers/151092-how-to-read-pdf-file-in-matlab ?