Read row x to row y in a text file

Hi,
I have a very long text file, and I want to read only lines 32781 to 32894, excluding the lines above and below, and store them in a new text file. I wrote this code, but it does not work.
The text file has some lines with two columns and others with three or four. The lines I am interested in have only two columns.
[T1,PSA1]=textread('FFC_M7_1.txt', '%f %f','headerlines',32781);
[T,PSA] = [T1(32782:32894), PSA1(32782:32894)]
f1=fopen('FFC_M7_1_test.txt','w');
for i= 1:length(T)
fprintf(f1,'%11.4f %11.4f\n',[T(i) PSA(i)]);
end
fclose(f1);
This message appears when I try to run the code:
Error using dataread
Trouble reading floating point number from file (row 114, field 1) ==> Frequency (Hz) A
Error in textread (line 174)
[varargout{1:nlhs}]=dataread('file',varargin{:}); %#ok<REMFF1>
Error in WritePSA (line 1)
[T1,PSA1]=textread('FFC_M7_1.txt', '%f %f','headerlines',32781);
Does anyone know how to do this?
Thanks, Marthe

 Accepted Answer

Cedric on 27 Oct 2017
Edited: Cedric on 27 Oct 2017
Does the following work?
[T1,PSA1]=textread('FFC_M7_1.txt', '%f %f %*s %*s','headerlines',32781);
or
content = fileread( 'FFC_M7_1.txt' ) ;
lineStarts = [0, strfind( content, sprintf('\n') )] + 1 ;
block = content(lineStarts(32782) : (lineStarts(32895)-1)) ;
data = reshape( str2double( regexp( block, '[\d\.\-]+', 'match' )), 2, [] ).' ;
I cannot test right now though.
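As a side note, the error message suggests that the read ran past the end of the block: with 32781 header lines skipped, row 114 is the first row after the 113 lines from 32782 to 32894, and it starts with header text. A minimal sketch of one way around that, assuming the wanted lines are purely numeric with two columns: textscan accepts a repeat count and a 'HeaderLines' option, so the read can stop before the next header. The file contents and the names inFile, outFile, firstLine and lastLine below are made up for illustration.

```matlab
% Build a small stand-in file (hypothetical content, not the real data file).
inFile  = [tempname, '.txt'] ;
outFile = [tempname, '.txt'] ;
fid = fopen( inFile, 'w' ) ;
fprintf( fid, 'Some header A B C\n' ) ;                          % line 1
fprintf( fid, '1 2 3\n' ) ;                                      % line 2
fprintf( fid, 'Period (sec) PSA (g)\n' ) ;                       % line 3
fprintf( fid, '%.2f %.4f\n', [0.01 0.02 0.03; 0.5 0.6 0.7] ) ;   % lines 4 to 6
fprintf( fid, 'Next block header\n' ) ;                          % line 7
fclose( fid ) ;

firstLine = 4 ;   % first data line wanted (assumed known)
lastLine  = 6 ;   % last data line wanted (assumed known)

% Skip everything above firstLine, then apply the format exactly once per wanted line.
fid = fopen( inFile, 'r' ) ;
C = textscan( fid, '%f %f', lastLine - firstLine + 1, 'HeaderLines', firstLine - 1 ) ;
fclose( fid ) ;
T   = C{1} ;
PSA = C{2} ;

% Write the two columns to a new file; fprintf consumes the matrix column by column,
% hence the transpose.
fid = fopen( outFile, 'w' ) ;
fprintf( fid, '%11.4f %11.4f\n', [T, PSA].' ) ;
fclose( fid ) ;
```

For the real file, firstLine and lastLine would presumably be 32782 and 32894, but that depends on where the header line sits.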

14 Comments

The first does not work; I get the same message as before.
The second gives "Unexpected MATLAB expression" in
block = content(lineStarts(32782) : (lineStarts(:32895)-1)) ;
Cedric on 27 Oct 2017
Edited: Cedric on 27 Oct 2017
See my edited answer (without the :). It seems that your data block does not contain only two numeric columns but also occasional text, and that is probably why the first approach doesn't work.
The second approach is less sensitive to that, as it extracts only the numbers. Yet if the extra text itself contains numbers, the reshape will fail. In that case, you should extract the block that you are interested in and attach it to your question (as a text file) so we can have a look at how to deal with the content properly.
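To illustrate the failure mode, here is a small sketch on a made-up block (the string below is hypothetical, not taken from your file) that checks the number of extracted tokens before calling reshape:

```matlab
% Hypothetical block: three clean rows plus one line of text that contains a number.
block  = sprintf( '0.01 0.5\n0.02 0.6\nnote 3\n0.03 0.7\n' ) ;
tokens = regexp( block, '[\d\.\-]+', 'match' ) ;

% Two columns require an even number of tokens; an odd count means stray numbers.
if mod( numel( tokens ), 2 ) ~= 0
    warning( 'Found %d numeric tokens; cannot reshape into two columns.', numel( tokens ) ) ;
    data = [] ;
else
    data = reshape( str2double( tokens ), 2, [] ).' ;
end
```

This check is only a partial safeguard: an even number of stray numbers would pass it and silently shift the columns.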
Statement is incomplete for
data = reshape( str2double( regexp( block, '[\d\.\-]+', 'match' )), 2, []
I have attached the part of the file (including one header line) that I want to extract.
I am back at a computer with MATLAB. I tested the code above with an updated version of your file (attached), where I added random content before and after the block that I am interested in, going from line 73 to line 186:
content = fileread( 'PSA2.txt' ) ;
lineStarts = [0, strfind( content, sprintf('\n') )] + 1 ;
block = content(lineStarts(73) : (lineStarts(186)-1)) ;
data = reshape( str2double( regexp( block, '[\d\.\-]+', 'match' )), 2, [] ).' ;
and it works. Array data contains the data that you sent me. If the full data file is not too confidential, feel free to email it to me at matlab@elitemail.org, and I can have a look.
Unfortunately, it did not work on the full text file. I have emailed you the full file.
Here is an approach based on pattern matching that first finds the relevant block(s) and then parses it/them. There are in fact six blocks in the full file. If you only need the first, you can do this:
content = fileread( 'FFC_M7_1.txt' ) ;
data = regexp( content, 'Period .sec. PSA .g.\s*(.*?)\s*F', 'tokens', 'once' ) ;
data = reshape( sscanf( data{1}, '%f' ), 2, [] ).' ;
and you get data as a numeric array with the values. If you need all of them:
content = fileread( 'FFC_M7_1.txt' ) ;
data = regexp( content, 'Period .sec. PSA .g.\s*(.*?)\s*F', 'tokens' ) ;
for k = 1 : numel( data )
data{k} = reshape( sscanf( data{k}{1}, '%f' ), 2, [] ).' ;
end
Here data is a cell array that you can index with data{1}, data{2}, etc. Each cell (each data{k}) contains a numeric array with the values.
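If you want to see how the pattern behaves without the full file, here is a self-contained sketch on a made-up string (the content and the 'Frequency (Hz) X' header text are invented for illustration). Each match starts at a 'Period (sec) PSA (g)' header and lazily captures everything up to the next F, which works here because the numeric rows contain no letter F:

```matlab
% Hypothetical file content with two numeric blocks separated by other material.
content = sprintf( [ ...
    'Period (sec) PSA (g)\n0.01 0.5\n0.02 0.6\n', ...
    'Frequency (Hz) X\n1 2 3\n', ...
    'Period (sec) PSA (g)\n0.10 1.5\n0.20 1.6\n0.30 1.7\n', ...
    'Frequency (Hz) X\n' ] ) ;

% One token per block: the text between the header and the next 'F'.
blocks = regexp( content, 'Period .sec. PSA .g.\s*(.*?)\s*F', 'tokens' ) ;

% Convert each captured block of text into an n-by-2 numeric array.
for k = 1 : numel( blocks )
    blocks{k} = reshape( sscanf( blocks{k}{1}, '%f' ), 2, [] ).' ;
end
```

With this input, blocks{1} is a 2-by-2 array and blocks{2} is a 3-by-2 array.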
This worked, thank you so much.
Cedric on 27 Oct 2017
Edited: Cedric on 27 Oct 2017
My pleasure! Please [Accept] the answer if I solved your problem.
One more thing. I want to read the lines below (295656 to 295661), but only columns 1 (depth) and 2 (PGA). Is there a way to modify the program above to do this?
Depth (m) PGA (g) Min. Displacement (m) Max. Displacement (m)
0 0.010817 -0.0775801344 0.0564587136
29.99994 0.00762791 -0.0540203136 0.0464908392
64.9998192 0.00844498 -0.034562796 0.0401955
122.9999064 0.00618368 -0.01831372512 0.02322947856
151.9997976 0.00420221 -0.01291852128 0.01819089072
269.9997648 0.0146314 -0.00774551664 0.00782610576
Yes, you can do it this way:
tokens = regexp( content, 'Depth \(m\)\s+PGA.*?[\r\n]+([^D]+)', 'tokens', 'once' ) ;
profileData = reshape( sscanf( tokens{1}, '%f' ), 4, [] ).'
With that you get:
profileData =
0 0.0077 -0.0340 0.0366
29.9999 0.0059 -0.0245 0.0274
64.9998 0.0066 -0.0191 0.0171
122.9999 0.0049 -0.0114 0.0139
151.9998 0.0037 -0.0093 0.0109
269.9998 0.0101 -0.0041 0.0057
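The code above returns all four columns. To keep only depth and PGA, you can index the first two columns afterwards. A small sketch on a made-up string that reuses the header and the first two rows you posted (depthPGA is just an illustrative variable name):

```matlab
% Stand-in for the file content: the header line plus two of the posted rows.
content = sprintf( [ ...
    'Depth (m) PGA (g) Min. Displacement (m) Max. Displacement (m)\n', ...
    '0 0.010817 -0.0775801344 0.0564587136\n', ...
    '29.99994 0.00762791 -0.0540203136 0.0464908392\n' ] ) ;

% Capture everything after the header line up to the next 'D' (or the end of the text).
tokens = regexp( content, 'Depth \(m\)\s+PGA.*?[\r\n]+([^D]+)', 'tokens', 'once' ) ;
profileData = reshape( sscanf( tokens{1}, '%f' ), 4, [] ).' ;

% Keep column 1 (depth) and column 2 (PGA) only.
depthPGA = profileData(:, 1:2) ;
```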
I think the code extracts the wrong PGA values. The depths (column 1) are correct, but I can't figure out where the PGA values come from.
Cedric on 2 Nov 2017
Edited: Cedric on 2 Nov 2017
We are probably not working with the same file. This is what I have:
You are right, I was comparing two different files. Sorry about that, and thanks a lot.
My pleasure!
