How to scan PDF documents in a folder for a string and output the file names

Hi.
I'm working on code that scans the PDF files in a folder for a string and outputs the names of the files that contain it. But my code doesn't quite work yet. I would be really grateful if someone could check it for me and suggest something. Would it also be possible to output the page number, not just the file names?
filepath = input('Please specify the file path to scan as a string:\n');
files = dir(fullfile(filepath,'*.pdf'));
search = input('\nYour search term (as string):\n');
Filenames = {};
n = 1;
for k = 1:numel(files)
fid = fopen(fullfile(filepath,files(k).name),'r');
fgetl(fid); % skip first row
fullstr = fscanf(fid,'%s'); % NOT SURE IF THIS IS CORRECT
fclose(fid);
found = ismember(fullstr,search);
found = find(found);
%This part is supposed to find out if all characters are next to each other and not just spread around the document randomly
ct = 0; %counter
for i = 2:length(found)
if found(i) == found(i-1)+1
ct = ct + 1;
end
end
%If all characters are next to each other, the counter should be the same length as "found"
if ct == length(found)
Filenames{n,1} = files(k).name;
n = n+1;
end
end
n = 2;
if length(Filenames) == 0
Filenames{1,1} = 'Could not find any files.';
end
%Create string containing file names:
FilenamesSTR = [];
for k = 1:length(Filenames)
FilenamesSTR = [FilenamesSTR, Filenames{k,1}, '\n' ];
end
fprintf(FilenamesSTR)

Accepted Answer

Walter Roberson on 11 Jan 2014
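Use strfind() to search for the term. ismember() only tests whether each character of the file occurs somewhere in the search term, so it cannot tell you whether the whole term appears as one contiguous substring.
And I suspect you want to replace
fullstr = fscanf(fid,'%s');
with
fullstr = fread(fid);
Note that fread(fid) returns a numeric column vector, so you would likely need to read characters and transpose the result, for example fread(fid,'*char')', before passing it to strfind().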
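To make the suggestion concrete, here is a minimal sketch of the loop rewritten around fread() and strfind(). It is only an illustration under some assumptions: the demo folder, the file demo.pdf, and the names searchTerm, raw and matches are made up for the example, and the demo file is plain text rather than a real PDF. Text inside real PDFs is often stored in compressed streams, so a raw byte search like this one may miss terms that are visibly present in the document, and it does not attempt to report page numbers.

```matlab
% Sketch: list the files in a folder whose raw bytes contain a search term.
% Assumption: the term is stored uncompressed in the file (often not true for PDFs).

% Hypothetical demo data: a temporary folder with one small text file named *.pdf.
folder = tempname;
mkdir(folder);
fid = fopen(fullfile(folder, 'demo.pdf'), 'w');
fwrite(fid, 'dummy header line, then a needle in the body');
fclose(fid);

searchTerm = 'needle';                      % hypothetical search term
files = dir(fullfile(folder, '*.pdf'));
matches = {};

for k = 1:numel(files)
    fid = fopen(fullfile(folder, files(k).name), 'r');
    if fid == -1
        continue                            % skip files that cannot be opened
    end
    raw = fread(fid, '*char')';             % whole file as a char row vector
    fclose(fid);

    % strfind returns the start index of every contiguous occurrence,
    % so a non-empty result means the term was found as a whole.
    if ~isempty(strfind(raw, searchTerm))
        matches{end+1, 1} = files(k).name;  %#ok<AGROW>
    end
end

if isempty(matches)
    fprintf('Could not find any files.\n');
else
    fprintf('%s\n', matches{:});
end
```

Because strfind() already checks for a contiguous match, the counter loop over neighbouring indices from the question is no longer needed.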
