sscanf to extract numbers from string

Hi everyone,
I have a data file like this:
1 .00 80.00 160.00 240.00 320.00 400.00 480.00 560.00 640.00 720.00
1 -250.00-250.00-250.00-255.00-255.00-260.00-265.00-270.00-265.60-260.00
1 800.00 880.00 960.001040.001120.001200.001280.001360.001440.001520.00
1 -255.00-255.00-263.30-286.70-310.60-320.00-313.90-290.00-267.80-260.00
....
The format doesn't change.
I tried to read each line using sscanf, but when there is no space between the numbers, I can't read them.
In other words, when I use:
ff = fgetl(fid)
aff = sscanf(ff,'%f')
This works fine for the first line, because the numbers have spaces between them.
But it doesn't work for the rest of the lines.
I also tried the command:
ff = fgetl(fid)
aff = sscanf(ff,'%2f %7.2f%7.2f%7.2f%7.2f%7.2f%7.2f%7.2f%7.2f%7.2f%7.2f',[1 11])
But without success.
Can someone help me?
Best regards.

Accepted Answer

Walter Roberson on 28 Mar 2019
textscan(), fscanf() and sscanf() all have the same problem: the width count for formats such as %7f starts only after leading whitespace has been skipped. You can see this on the third line, where 960.00 (preceded by a space) is immediately followed by 1040.00 with no space: the count starts after the space, so 960.001 is what gets parsed.
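A small sketch of the effect described above, using just the two adjacent fields from the third line. The string `s` is an assumption about how those 14 characters look in the file (one leading space, then the digits). Per the explanation above, the first %7f is expected to skip the space and then consume seven characters, which pulls in the leading 1 of the next field. Slicing by column position does not have that problem.

```matlab
% Two 7-character fields; the second one touches the first.
s = ' 960.001040.00';

% Width counting starts after the leading space is skipped,
% so the first conversion does not stop at column 7.
a = sscanf(s, '%7f%7f');
disp(a.')

% Slicing by column position keeps each field intact.
b = [str2double(s(1:7)), str2double(s(8:14))];
disp(b)
```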
To handle fixed-width inputs, you have a small number of choices:
  1. There is a File Exchange contribution to handle fixed-width input
  2. if you bash your head against the problem for long enough you can convince textscan() to work fixed width; this is not easy
  3. you can use array indexing to break the text into fields that you convert to numeric such as with str2double
  4. you can use regexp to break the text into fields that you convert to numeric such as with str2double
  5. with R2017a or later, you can use readtable() in a fixed-width mode. See https://www.mathworks.com/help/matlab/ref/matlab.io.text.fixedwidthimportoptions.html
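For the last option, a minimal sketch might look like the following. It assumes each line is a 3-character leading field followed by ten 7-character fields, and it writes its own small sample file so it can be run as is. `tmpfile` and the sample values are made up for illustration.

```matlab
% Write a two-line sample file in the assumed layout (3 + 10*7 characters).
tmpfile = [tempname '.txt'];
fid = fopen(tmpfile, 'wt');
fprintf(fid, '%2d %s\n', 1, sprintf('%7.2f', 800:80:1520));
fprintf(fid, '%2d %s\n', 1, sprintf('%7.2f', 0:80:720));
fclose(fid);

% Describe the columns by width instead of by delimiter.
opts = fixedWidthImportOptions('NumVariables', 11);
opts.VariableWidths = [3, repmat(7, 1, 10)];
opts = setvartype(opts, 'double');      % import every column as a number

T = readtable(tmpfile, opts);
data = T{:, 2:end};                     % the ten values on each line
delete(tmpfile);
```

This reads the whole file at once, so it fits best when the lines do not need to be processed one at a time.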
regexp() can be pretty useful for a purpose such as this.
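As one possible regexp approach (a sketch, not necessarily what was meant above): every value in the posted sample has exactly two decimals, so a pattern that stops after two decimal digits splits the values even when they touch. The pattern relies on that two-decimal assumption. The leading integer has no decimal point, so it is skipped and would need to be read separately.

```matlab
% Third line as shown in the question.
ln = '1 800.00 880.00 960.001040.001120.001200.001280.001360.001440.001520.00';

% Optional sign, optional integer digits, a point, exactly two decimals.
tokens = regexp(ln, '[-+]?\d*\.\d{2}', 'match');
nums = str2double(tokens);
disp(nums)
```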
  4 Comments
Bruno Goncalves on 4 Apr 2019
Error using reshape
Product of known dimensions, 7, not divisible into total number of elements, 72.
Walter Roberson on 4 Apr 2019
Use fgetl() rather than fgets(), and use 4:end instead of 3:end .
Your code used one more character for the second entry on the line (first floating point entry), but there is no reason to expect that the extra character is used; it is more likely that there is just an extra blank in the format at that point.
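A sketch of the indexing approach with those two corrections, assuming each line is a 2-character integer, one blank, then ten 7-character fields (73 characters as returned by fgetl). Under that assumption, fgets would add a newline and 3:end would then give 72 characters, which is consistent with the reshape error above. The sample line is built with sprintf rather than read from the real file.

```matlab
% Build a sample line in the assumed layout; this stands in for fgetl(fid).
vals = [800 880 960 1040 1120 1200 1280 1360 1440 1520];
ln = sprintf('%2d %s', 1, sprintf('%7.2f', vals));

flag = str2double(ln(1:2));             % leading integer
body = ln(4:end);                       % 70 characters, divisible by 7
fields = reshape(body, 7, []).';        % one 7-character field per row
nums = str2double(cellstr(fields)).';   % 1-by-10 numeric row
disp(nums)
```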


More Answers (1)

Guillaume on 28 Mar 2019
Edited: Guillaume on 28 Mar 2019
sscanf format is different from sprintf. In particular there's no .2 notation for %f, so your .2 is interpreted as a literal .2 and of course does not match anything.
There's no need to fgetl and then sscanf. You can read the whole file in one go with fscanf instead, so:
fid = fopen(somefile, 'rt');
assert(fid > 0, 'Failed to open file');
aff = fscanf(fid, '%2f %7f%7f%7f%7f%7f%7f%7f%7f%7f%7f', [11, Inf])'
fclose(fid);
  2 Comments
Walter Roberson on 28 Mar 2019
This turns out to fail on the third line, where 960.00 (preceded by a space) is followed by 1040.00 with no space between them.
Bruno Goncalves on 28 Mar 2019
Edited: Stephen23 on 28 Mar 2019
Hi Guillaume,
Thank you, but I have to apologize.
The format does change.
I attached a file containing the correct format.
I have to read line by line because, depending on the value being read, another variable may or may not change.
Initially, I used the code below to read the values.
This works fine when there are spaces between the numbers.
But it doesn't work when the numbers run into each other, as in this case.
filename='v2.txt';
fid=fopen(filename,'rt');
ppcntr=300;
ppvel=300;
nrzmax=ppcntr/10;
nrvmax=ppvel/10;
icnt=1;
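player=12; % maximum number of layers
ncont=1; % current number of layers
% Read all numbers once through a temporary handle and close it,
% instead of calling fopen twice and leaving both handles open.
fidall = fopen(filename,'rt');
allvals = fscanf(fidall,'%f',Inf);
fclose(fidall);
ntotal = allvals(end-3)-1;
zmaximo = allvals(end);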
% velocity model
for icont=1:ntotal
nrz=1;
while ( icnt == 1 )
if ( nrz > nrzmax ); break; end
line1 = fgets(fid);
line2 = fgets(fid);
line3 = fgets(fid);
tmp1 = sscanf(line1,'%f');
tmp2 = sscanf(line2,'%f');
tmp3 = sscanf(line3,'%f');
ntmp1 = length(tmp1);
ntmp2 = length(tmp2);
ntmp3 = length(tmp3);
if ( nrz == 1 )
j1=1; j2=ntmp1-1;
elseif ( nrz > 1 )
j1=j2+1; j2=j2+(ntmp1-1);
end
ilyr = fix(tmp1(1));
xm(icont,j1:j2) = tmp1(2:ntmp1);
icnt = fix(tmp2(1));
zm(icont,j1:j2) = tmp2(2:ntmp2);
ivarz(icont,j1:j2) = tmp3(1:ntmp3);
nrz = nrz + 1;
xm_pts(icont) = j2;
zm_pts(icont) = j2;
ivarz_pts(icont) = j2;
if ( icnt == 0 ); break ; end
%disp("stopped..."); pause
end
% sprintf('Interface %d: first group read',icont)
% pause
nrv = 1;
icnt = 1;
while ( icnt == 1 )
if ( nrv > nrvmax ); break ; end
line1 = fgets(fid);
line2 = fgets(fid);
line3 = fgets(fid);
tmp1 = sscanf(line1,'%f');
tmp2 = sscanf(line2,'%f');
tmp3 = sscanf(line3,'%f');
ntmp1 = length(tmp1); ntmp2 = length(tmp2); ntmp3 = length(tmp3);
if ( nrv == 1 )
j1=1; j2=ntmp1-1;
elseif ( nrv > 1 )
j1=j2+1; j2=j2+(ntmp1-1);
end
ilyr = fix(tmp1(1));
xvel(icont,1,j1:j2) = tmp1(2:ntmp1);
icnt = fix(tmp2(1));
vf(icont,1,j1:j2) = tmp2(2:ntmp2);
ivarv(icont,1,j1:j2) = tmp3(1:ntmp3);
nrv = nrv + 1;
xvel_pts(icont,1) = j2;
vf_pts(icont,1) = j2;
ivarv_pts(icont,1) = j2;
if ( icnt == 0 ); break ; end
end
% sprintf('Interface %d: second group read',icont)
% pause
nrv = 1;
icnt = 1;
while ( icnt == 1 )
if ( nrv > nrvmax ); break ; end
line1 = fgets(fid);
line2 = fgets(fid);
line3 = fgets(fid);
tmp1 = sscanf(line1,'%f');
tmp2 = sscanf(line2,'%f');
tmp3 = sscanf(line3,'%f');
ntmp1 = length(tmp1); ntmp2 = length(tmp2); ntmp3 = length(tmp3);
if ( nrv == 1 )
j1=1; j2=ntmp1-1;
elseif ( nrv > 1 )
j1=j2+1; j2=j2+(ntmp1-1);
end
ilyr = fix(tmp1(1));
xvel(icont,2,j1:j2) = tmp1(2:ntmp1);
icnt = fix(tmp2(1));
vf(icont,2,j1:j2) = tmp2(2:ntmp2);
ivarv(icont,2,j1:j2) = tmp3(1:ntmp3);
nrv = nrv + 1;
xvel_pts(icont,2) = j2;
vf_pts(icont,2) = j2;
ivarv_pts(icont,2) = j2;
if ( icnt == 0 ); break ; end
end
% sprintf('Interface %d: third group read',icont)
% pause
ncont=ncont+1;
icnt=1;
% disp('Press enter to read interface ' + icont)
% pause
end
sprintf('Total interfaces lidas %d ',ncont-1)
disp('Fim da leitura...')
fclose(fid)
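Since the file has to be processed line by line, one option is to keep the loop and swap the `sscanf(line,'%f')` calls for a column-based parser. The sketch below assumes a 2-character integer, one blank, then 7-character fields on each line. `parseFields`, `tmpfile` and the sample values are hypothetical and only there to make the example run on its own.

```matlab
% Write two sample lines in the assumed layout.
tmpfile = [tempname '.txt'];
fid = fopen(tmpfile, 'wt');
fprintf(fid, '%2d %s\n', 1, sprintf('%7.2f', 800:80:1520));
fprintf(fid, '%2d %s\n', 0, sprintf('%7.2f', [-255 -255 -263.3 -286.7 -310.6 -320 -313.9 -290 -267.8 -260]));
fclose(fid);

% Hypothetical helper: cut everything after column 3 into 7-character fields.
parseFields = @(ln) str2double(regexp(ln(4:end), '.{7}', 'match'));

fid = fopen(tmpfile, 'rt');
assert(fid > 0, 'Failed to open file');
rows = {};
while true
    ln = fgetl(fid);                    % fgetl drops the newline
    if ~ischar(ln), break; end          % end of file
    flag = str2double(ln(1:2));         % leading integer on the line
    rows{end+1} = parseFields(ln);      %#ok<SAGROW>
    if flag == 0, break; end            % stop on a zero flag, like icnt above
end
fclose(fid);
delete(tmpfile);
```

Because the fields are cut by position, it makes no difference whether neighbouring values are separated by a space.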
