pull out number after specific string in txt file

I'm struggling to find a suitable way to pull a specific value from a txt file.
I need the number after the wording "image quality score", so in the text below it would be the number 33.44468. There are many thousands of lines of text and about 100 lines containing this image quality score number.
14-12-15 14:33:18.962 +00 000000060 INF: Camera 1 Reg1 1 (E41G51_1) tile 45603 image quality score (red image): 33.44468
I have used the following, but I'm not sure how to then extract an array of numbers that I can plot.
wordString='image quality score';
A=fileread(file);
B=strfind(A, wordString)
Thanks for any help, Jason
  3 Comments
Jason on 16 Dec 2014
I've nearly got this approach to work, but I'm not sure it's the most elegant way:
fid = fopen(file,'r');
while true
    tline = fgetl(fid);
    if ~ischar(tline)              % fgetl returns -1 at end of file
        break
    end
    U = strfind(tline, 'image quality score');
    if ~isempty(U)
        tline(U:end)               % display the text from the match onwards
        sprintf('%s %f',tline)     % echoes the line; does not yet extract the number
    end
end
fclose(fid);
Jason on 16 Dec 2014
Thanks Adam. How do I extract the last floating-point number?


Accepted Answer

Guillaume on 16 Dec 2014
As Adam said, a regular expression is probably the easiest approach (as long as you understand the regular expression language, that is). There are many ways to construct the regular expression, depending on the allowed formatting for the number, etc. This will probably work for you:
filetext = fileread('somefile.txt');
numbers = str2double(regexp(filetext, '(?<=image quality score[^0-9]*)[0-9]*\.?[0-9]+', 'match'))
This will extract the first number after each occurrence of the string 'image quality score'. A number is defined here as one or more digits (0 to 9) with an optional dot somewhere in there (but not at the end).
Complete explanation of the regex:
  • look behind for (i.e. start matching after) the string 'image quality score' followed by any number of characters other than '0' to '9': the (?<=image quality score[^0-9]*)
  • once the preceding is found, match 0 or more characters from '0' to '9': the [0-9]*
  • followed by an optional dot: the \.?
  • followed by one or more characters from '0' to '9': the [0-9]+
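As a quick check of the explanation above, here is a minimal sketch that runs the same regular expression on a small in-memory sample instead of a file. Only the first sample line comes from the question; the other lines and the variable names sampleText, pattern and numbers are made up for illustration.

```matlab
% Hypothetical stand-in for the file contents. Only the first line is from
% the question; the remaining lines are invented to exercise the pattern.
sampleText = sprintf('%s\n', ...
    '14-12-15 14:33:18.962 +00 000000060 INF: Camera 1 Reg1 1 (E41G51_1) tile 45603 image quality score (red image): 33.44468', ...
    '14-12-15 14:33:19.101 +00 000000061 INF: some unrelated message 12345', ...
    '14-12-15 14:33:19.350 +00 000000062 INF: Camera 1 Reg1 1 (E41G51_1) tile 45604 image quality score (green image): 7', ...
    '14-12-15 14:33:19.720 +00 000000063 INF: Camera 1 Reg1 1 (E41G51_1) tile 45605 image quality score (red image): .5');

% Same pattern as in the answer: a lookbehind for the label, then the number.
pattern = '(?<=image quality score[^0-9]*)[0-9]*\.?[0-9]+';

% regexp returns a cell array of matched strings; str2double converts
% the whole cell array to a numeric row vector in one call.
numbers = str2double(regexp(sampleText, pattern, 'match'));
disp(numbers)
```

The timestamps, the tile number and the 12345 on the unrelated line are skipped because none of them is preceded by the label. With a real file, sampleText would be replaced by the output of fileread, and plot(numbers) would then chart the scores in file order.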
  4 Comments
Guillaume on 16 Dec 2014
You can only accept one answer, but you can vote for the answer by clicking on the grey triangle
Bret Kenny on 11 Jul 2017
Hi Guillaume, I was wondering if it is possible to match negative numbers as well in this case? I am searching for xyz location coordinates in a text file and require the sign of the number. I'd appreciate any help you can offer.
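One possible adaptation for signed values, offered only as a sketch: put an optional sign, [-+]?, in front of the digit part of the accepted answer's pattern. The label 'position' and the sample string below are hypothetical, since the actual file format is not shown in the thread.

```matlab
% Hypothetical sample; the real coordinate file format is not known.
sampleText = 'x position: -12.5, y position: 3, z position: -.25';

% Same structure as the accepted answer's pattern, with an optional
% leading sign ([-+]?) added before the digits.
pattern = '(?<=position[^0-9]*)[-+]?[0-9]*\.?[0-9]+';

coords = str2double(regexp(sampleText, pattern, 'match'));
disp(coords)
```

Because regexp scans from left to right, the match starts at the sign character when one is present, so the sign ends up in the matched text that str2double converts.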


More Answers (1)

Adam on 16 Dec 2014
Edited: Adam on 16 Dec 2014
If all lines containing the image quality score have the same format, you could use textscan on the tline that you extract (or in fact just tline(U:end) as you have it) with the correct format specifiers to extract the tokens, and then take the last one extracted.
Maybe easier, again if the format is consistent, is:
tokens = strsplit( tline(U:end), ':' );
score = str2double( tokens{end} );
It isn't an elegant general solution, but you don't need a general solution if all cases have the same format.
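Both suggestions can be tried on the sample line from the question. This is a rough sketch that assumes the text after the label contains exactly one ':' directly before the number; the names tail, scoreA and scoreB and the '%s %f' format are illustrative choices, not something specified in the thread.

```matlab
% Sample line taken from the question.
tline = '14-12-15 14:33:18.962 +00 000000060 INF: Camera 1 Reg1 1 (E41G51_1) tile 45603 image quality score (red image): 33.44468';

% Keep only the text from the label onwards, which drops the colons
% in the timestamp and after 'INF'.
U = strfind(tline, 'image quality score');
tail = tline(U:end);

% Option 1: textscan with ':' as the delimiter. The first field is read
% as text, the second as a floating-point number.
C = textscan(tail, '%s %f', 'Delimiter', ':');
scoreA = C{2};

% Option 2: strsplit on ':' and convert the last token.
tokens = strsplit(tail, ':');
scoreB = str2double(tokens{end});

fprintf('textscan: %g, strsplit: %g\n', scoreA, scoreB);
```

Inside a line-reading loop, either value could be appended to an array (for example scores(end+1) = scoreB) to collect all the scores for plotting.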
