Parsing .txt File with Unique Format
Hi,
I am struggling to parse a .txt document; a sample of the format is attached. I need to extract the first column, L, and the third-to-last column, MCA, into separate MATLAB tables, one table per sample. The real data runs to several thousand lines, all in this format. The attached sample file has four samples (#S 1, #S 2, etc.).
I suspect my current approach is inefficient. I used the pound symbol as the delimiter and loaded everything into a cell array with the following code. I was even able to find the lines that contain the "headers". There must be a more straightforward way.
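filename = 'sample.txt';
fid = fopen(filename);
% Split the file on '#' so each header line lands in its own cell entry
c = textscan(fid, '%s', 'delimiter', '#');
fclose(fid);
% Indices of the entries that hold the column-header line
index = find(contains(c{1,1}, 'L X Scan H K L V Epoch Monitor Voltage Ion 2 Trans Ni Cu MCA Seconds Detector'));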
I am a novice in MATLAB, and would really appreciate any help regarding this.

More Answers (1)
Walter Roberson
on 12 May 2022
2 votes
Cases like this are often most easily processed by reading the entire file as text and then using regexp() to extract information.
You might, however, be able to use textscan in a loop, making use of the CommentStyle option to skip the headers, and probably using a format repeat count of 1.
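A minimal sketch of the first suggestion (read the whole file as text, then use regexp()) might look like the function below. The attached file is not reproduced here, so the layout is assumed rather than confirmed: each sample is taken to start with a line beginning "#S", every other header line is taken to begin with "#", and every data row within a sample is taken to be purely numeric with the same number of columns. The name parseSamples is hypothetical.

```matlab
function samples = parseSamples(filename)
% parseSamples  Hypothetical helper: one table (L, MCA) per "#S" block.
% Assumed layout (not confirmed by the post): samples start with "#S",
% other header lines start with "#", data rows are numeric and have the
% same number of columns within a sample.
txt    = fileread(filename);                              % whole file as one char vector
blocks = regexp(txt, '^#S[ \t]', 'split', 'lineanchors'); % cut at each "#S" line
blocks = blocks(2:end);                                   % drop text before the first "#S"
samples = cell(numel(blocks), 1);
for k = 1:numel(blocks)
    rows = strtrim(regexp(blocks{k}, '\r?\n', 'split'));  % one cell entry per line
    rows = rows(2:end);                                   % skip the rest of the "#S" line
    rows = rows(~startsWith(rows, '#') & ~cellfun(@isempty, rows)); % keep data rows only
    vals = cellfun(@(r) sscanf(r, '%f').', rows, 'UniformOutput', false);
    data = vertcat(vals{:});                              % numeric matrix for this sample
    if isempty(data)
        data = zeros(0, 3);                               % sample without data rows
    end
    % First column is L; MCA is the third-to-last column, as stated in the question
    samples{k} = table(data(:,1), data(:,end-2), 'VariableNames', {'L', 'MCA'});
end
end
```

Calling samples = parseSamples('sample.txt') would then return a cell array in which samples{1} is the table for #S 1, samples{2} the table for #S 2, and so on. If any data row contains non-numeric text, sscanf stops early on that row and the vertcat call fails, so the numeric-rows assumption matters.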
1 Comment
Nigel Caprotti
on 12 May 2022