How can I read a specific range of lines from a text file without using a for loop?
I need to read data from text files (a lot of them), and they are formatted as follows:
.
.
.
internalField nonuniform List<scalar>
241920
(
0
0
0
0
0
.
.
.
0
0
0
)
;
.
.
.
The data I'm interested in are the zeros after the "(". The 241920 is the number of lines of data. The numbers are not necessarily 0 (if it matters, their values are constrained as
).
I want to get these numbers into an array. So far, I first read the line containing the number of data points (line 22 of the text file) to initialize the array, then I use textscan in a for loop to read the text file line by line starting from line 24. The problem is that this process is VERY slow, and I need to read hundreds of these files (the number of rows can vary, but is always specified on line 22).
Here is the for loop I'm using (I copied the textscan function from another post; I'm honestly not sure how it works):
for i=1:fieldSize
    alpha(i) = str2double(string(textscan(fileID,'%s',1,'delimiter','\n', 'headerlines',(linenum+i-1)-1)));
    fseek(fileID, 0, 'bof');
end
% fieldSize is the number found on line 22, as previously mentioned
% linenum is the line where the data starts (24 for these text files). The net offset of -2 applied to linenum is just there to match the code I got it from
% alpha is the array the data is read into
What I want to do, then, is make this code much more efficient. To do that, I believe I need to eliminate the for loop and use a function that can read a range of lines, not necessarily starting at the beginning of the text file.
EDIT:
I attached a sample text file. The first 22 lines are constant; they are just file info from the program that produced the text file.
4 Comments
Bob Thompson
on 26 Jun 2019
Edited: Bob Thompson
on 26 Jun 2019
Are those the only times that parentheses come up in the file?
If so, you might be able to get away with some regexp stuff.
text = fileread('mytextfile.txt');
numbers = regexp(text,'[(]\s(.*)\s[)]','tokens');
From there you should end up with a cell that contains the stuff. You can split it into different elements by using regexp again.
numbers = regexp(numbers{1}{1},'\s+','split'); % unwrap the nested token cell before splitting
Then you should be able to use some cellfun, str2num, and cell2mat to turn things into an array. You may have some trouble with cellfun because regexp tends to bury results a couple of levels deep in cells, but there are workarounds for that if needed.
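For reference, here is a minimal sketch of the regexp route from start to finish. It assumes the data list is the only parenthesised block in the file (the pattern is greedy, so any later ")" would be swallowed), and it swaps in str2double on the split pieces instead of cellfun/str2num. The variable raw is a small stand-in for the file contents; in practice it would come from fileread('mytextfile.txt').

```matlab
% Stand-in for: raw = fileread('mytextfile.txt');
raw = sprintf('internalField nonuniform List<scalar>\n5\n(\n0\n0.25\n0.5\n1\n0\n)\n;\n');

% Capture everything between the first "(" and the last ")".
% With 'once', regexp returns a 1x1 cell holding the captured char vector.
tok = regexp(raw, '\(\s*(.*)\s*\)', 'tokens', 'once');

% Split on runs of whitespace (strsplit's default) and convert in one call.
parts = strsplit(strtrim(tok{1}));
alpha = str2double(parts).';   % column vector of doubles
```

This avoids the loop entirely, but the whole file is still handled as text before being converted, so it is unlikely to be the fastest option for very long lists.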
Guillaume
on 26 Jun 2019
sscanf is probably going to be the fastest way to read your file. Can you post an example text file so we can give you the correct code?
Certainly, the code you're using is not going to be efficient. For a start, you pointlessly convert a cell array of char vectors to a string array, which you then convert to an array of double. You could convert the cell array directly to double, but it is best to read the numbers directly as numbers rather than as text.
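As a rough sketch of what reading the numbers directly as numbers could look like with sscanf: the code below assumes the list of interest is the first "(" after the internalField keyword shown in the question, and it uses a small stand-in for the file contents (raw would normally come from fileread). It is not tuned against the actual attached file.

```matlab
% Stand-in for: raw = fileread('mytextfile.txt');
raw = sprintf('some header\ninternalField nonuniform List<scalar>\n4\n(\n0\n0.5\n1\n0\n)\n;\n');

% Locate the opening parenthesis that follows the internalField keyword.
k = strfind(raw, 'internalField');
openIdx = k(1) + find(raw(k(1):end) == '(', 1) - 1;

% '%f' is applied repeatedly and skips whitespace (including newlines);
% parsing stops at the first non-numeric text, which here is the ")".
alpha = sscanf(raw(openIdx+1:end), '%f');
```

The result is a column vector of doubles, with no intermediate cell or string arrays.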
Dennis Bonilla
on 26 Jun 2019
Edited: Dennis Bonilla
on 26 Jun 2019
Walter Roberson
on 27 Jun 2019
Some tests we did about a week and a half ago showed that textscan is faster than fscanf.
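For comparison, a minimal textscan sketch that reads the whole block in a single call instead of once per line. It relies on the layout described in the question (count on line 22, data starting on line 24) and builds a small temporary file with that layout so it can be run on its own; the file name and filler header lines are made up for illustration.

```matlab
% Build a stand-in file: 20 filler lines, keyword on line 21,
% count on line 22, "(" on line 23, data from line 24.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'header line %d\n', 1:20);
fprintf(fid, 'internalField nonuniform List<scalar>\n4\n(\n');
fprintf(fid, '%g\n', [0 0.5 1 0]);
fprintf(fid, ')\n;\n');
fclose(fid);

fid = fopen(fname, 'r');
% Skip 21 lines and read one number: the count on line 22.
c = textscan(fid, '%f', 1, 'HeaderLines', 21);
fieldSize = c{1};
% Rewind, skip 23 lines, and read fieldSize numbers in one call.
frewind(fid);
c = textscan(fid, '%f', fieldSize, 'HeaderLines', 23);
alpha = c{1};
fclose(fid);
delete(fname);
```

Which of textscan, fscanf, or sscanf is fastest probably depends on the MATLAB release and the file size, so it is worth timing them on one of the real files.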