Rating: 4.0 (2 ratings) | 128 downloads (last 30 days) | File Size: 4.13 KB | File ID: #17839

PDF Reader

by Tom Gaudette

 

30 Nov 2007 (Updated 03 Dec 2007)

Code covered by BSD License  

Read in the text from PDF files

File Information
Description

This code reads the text from a PDF file and places it into a MATLAB variable. Each page of the PDF goes into its own cell, so that you can later remove headers and footers.

You will need the full (professional) version of Adobe Acrobat rather than the free Reader alone, because the code drives Acrobat through its COM server interface.

2 Files:

readPDF - Reads in the text and returns a cell array.

findmatchtext - Tries to find the header/footer and returns a struct with the header/footer removed from the pages.
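
The header/footer idea can be illustrated without Acrobat. What follows is only a minimal sketch, not the code in findmatchtext.m: it assumes the pages arrive as a cell array with one char vector per page and newline-separated lines, and it treats a line as a header (or footer) only when every page starts (or ends) with exactly the same line. The sample data and all variable names are hypothetical, and strsplit/strjoin require a MATLAB release newer than R2007b.

```matlab
% Hypothetical sample data standing in for the per-page cell array
% (assumed layout: one char vector per page, lines separated by newlines).
nl = sprintf('\n');
pages = { ...
    sprintf('ACME Report\nFirst page body\nConfidential'), ...
    sprintf('ACME Report\nSecond page body\nConfidential')};

% Split every page into a cell array of its lines.
pageLines = cellfun(@(p) strsplit(p, nl), pages, 'UniformOutput', false);

% Collect the first and last line of each page.
firstLines = cellfun(@(c) c{1},   pageLines, 'UniformOutput', false);
lastLines  = cellfun(@(c) c{end}, pageLines, 'UniformOutput', false);

% A shared first (last) line on every page counts as a header (footer).
% With a single page there is nothing to compare, so nothing is removed.
multiPage = numel(pages) > 1;
hasHeader = multiPage && numel(unique(firstLines)) == 1;
hasFooter = multiPage && numel(unique(lastLines))  == 1;

% Drop the shared lines and join what remains back into page text.
first  = 1 + hasHeader;
bodies = cellfun(@(c) strjoin(c(first:end-hasFooter), nl), ...
                 pageLines, 'UniformOutput', false);
```

Exact matching is the simplest possible rule. Headers that carry a page number differ from page to page, so they would need a looser comparison than the one sketched here.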

MATLAB release: MATLAB 7.5 (R2007b)
Other requirements: Adobe Acrobat Reader full version http://acrobat.softwarecenterz.com/
Zip File Content (Other Files): readPDF.m, findmatchtext.m
Comments and Ratings (4)
09 Jan 2008 s lee

Works fine with English text.

06 Apr 2008 Saad Mansoor

When I try to run it in MATLAB, it gives the following error:

??? No appropriate method or public field Add for class COM.AcroExch_HiliteList.

Error in ==> readPDF at 48
hilite.Add(0,wordsPerPage); %capture 1000 words off page

Can you please tell me what is the problem?

26 May 2008 Saad Mansoor

Now it's working for me. After I read the file, I want to rename it using some text extracted from it. MATLAB does not do this and gives an error saying that the file is being used by another process. What should I do?

13 Jul 2008 pino ang

Doesn't work...

pdDoc=actxserver('AcroExch.PDDoc');

filename = 'sdarticle.pdf';

pdDoc.Open(filename);

??? Error using ==> open
Too many output arguments.

------
Any solutions?

Tag Activity for this File
Tag | Applied By | Date/Time
data import | Tom Gaudette | 22 Oct 2008 09:37:29
data export | Tom Gaudette | 22 Oct 2008 09:37:29
pdf reader adobe acrobat | Tom Gaudette | 22 Oct 2008 09:37:29
