How can I split a text file into many files?

Addo on 19 Mar 2018
Commented: Walter Roberson on 2 Nov 2019
Hi,
I have some data in a text file that I need to save in separate files. The file pattern is repeated several times (but with different data, of course) and is similar to the one shown here:
STARTSEC MYNAME
Some information text that
may vary from block to block
A 12
B 8
% some more data/text
1.0 27.5 123.1 13.1
1.1 17.4 121.1 12.7
1.2 21.7 131.9 11.6
1.3 31.4 142.6 11.1
... more data ...
ENDSECTION
STARTSEC MYNEXTNAME
Some other information text that
may vary from block to block
A 15
B 6
% some more data/text
1.0 28.1 123.1 21.7
1.1 26.5 124.9 22.5
1.2 22.0 131.4 14.8
1.3 21.8 140.5 16.7
... more data ...
ENDSECTION
% and so on ...
All "blocks" start and end with two keywords (in my example STARTSEC and ENDSECTION).
I would like to create a small script that writes all blocks in the original file into separate output files that contain only one block. Ideally these files would be named as the name after the STARTSEC command (i.e. first file = "MYNAME.txt", second file = "MYNEXTNAME.txt" and so on).
I've had a look at the MATLAB Answers thread "How to Split a Text File into Many Text Files?", but it doesn't work properly if I change "<" to "STARTSEC": all output files are empty (and there are too many of them). Could someone help me?
Thanks and best regards!
  2 Comments
Walter Roberson on 19 Mar 2018
I suggest using the unix 'split' command with the '-p' (pattern) option.
Addo on 20 Mar 2018
Unfortunately, I am working on a Windows PC.

Answers (3)

Jan on 20 Mar 2018
Edited: Jan on 20 Feb 2019
S = fileread('YourFile.txt');
C = strsplit(S, char(10));                 % one cell per line
ini = find(strncmp(C, 'STARTSEC ', 9));    % indices of block start lines
fin = find(strncmp(C, 'ENDSECTION', 10));  % indices of block end lines
for k = 1:numel(ini)
  Head = C{ini(k)};
  FileName = [Head(10:end), '.txt'];       % name after 'STARTSEC '
  fid = fopen(FileName, 'w');
  if fid == -1
    error('Cannot open file for writing: %s', FileName);
  end
  fprintf(fid, '%s\n', C{ini(k)+1:fin(k)-1});  % lines between the keywords
  fclose(fid);
end
  2 Comments
Addo on 8 Jun 2018
Hi Jan, thanks for your code. Unfortunately, this doesn't work. I think you did not correctly set up the string/character
C{ini(k)+1:fin(k)-1}
It shows the error message
Too many outputs requested. Most likely cause is missing [] around left hand side that has a comma separated list expansion.
Error in splitFiles (line 10)
Head = C{ini(k)};
But I don't see how to fix this. I've just found that
C{ini(k)}
returns zero and not any string or character.
Jan on 8 Jun 2018
@Addo: Please post your code after:
S = fileread('YourFile.txt');
C = strsplit(S, char(10));
C{ini(k)} is not expected to return a "zero".
My code searches for lines starting with 'STARTSEC ' and with 'ENDSECTION', and then writes the lines in between to new files. If it does not work for you, please post the input file and the code you are using.
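One thing that may be worth checking when the file names come out wrong or fopen fails: a file written on Windows usually ends each line with CR+LF, so splitting at char(10) alone leaves a carriage return at the end of every line, including the block name. Below is a minimal sketch of the same approach that splits at either line ending and trims the name. The sample input, the temporary paths, and the variable names inFile, outDir and name are made up for the demonstration and are not from the original post.

```matlab
% Build a small sample input with CRLF line endings (made-up demo data).
inFile = [tempname, '.txt'];                  % hypothetical input file
outDir = tempname;  mkdir(outDir);            % hypothetical output folder
fid = fopen(inFile, 'w');
fprintf(fid, 'STARTSEC MYNAME\r\nA 12\r\n1.0 27.5\r\nENDSECTION\r\n');
fprintf(fid, 'STARTSEC MYNEXTNAME\r\nA 15\r\n1.0 28.1\r\nENDSECTION\r\n');
fclose(fid);

% Split into lines at LF or CRLF, so no carriage return is left behind.
S = fileread(inFile);
C = regexp(S, '\r?\n', 'split');
ini = find(strncmp(C, 'STARTSEC ', 9));       % lines that open a block
fin = find(strncmp(C, 'ENDSECTION', 10));     % lines that close a block
assert(numel(ini) == numel(fin), 'Unbalanced STARTSEC/ENDSECTION keywords.');

for k = 1:numel(ini)
    name = strtrim(C{ini(k)}(10:end));        % text after 'STARTSEC '
    fileName = fullfile(outDir, [name, '.txt']);
    fid = fopen(fileName, 'w');
    if fid == -1
        error('Cannot open file for writing: %s', fileName);
    end
    fprintf(fid, '%s\n', C{ini(k)+1:fin(k)-1});  % lines between the keywords
    fclose(fid);
end
```

To use it on real data, replace the sample-building part with your own file name and output folder.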


zhendong zhang on 20 Feb 2019
S = fileread('YourFile.txt');
C = strsplit(S, char(10));
ini = strncmp(C, 'STARTSEC ', 9);     % logical mask of block start lines
fin = strncmp(C, 'ENDSECTION', 10);   % logical mask of block end lines
ini_nonzero_indx = find(ini);         % convert the masks to line indices
fin_nonzero_indx = find(fin);
for k = 1:numel(ini_nonzero_indx)
  Head = C{ini_nonzero_indx(k)};
  % end-1 drops the last character of the header line,
  % e.g. a trailing carriage return
  FileName = [Head(10:end-1), '.txt'];
  fid = fopen(FileName, 'w');
  if fid == -1
    error('Cannot open file for writing: %s', FileName);
  end
  fprintf(fid, '%s\n', C{ini_nonzero_indx(k)+1:fin_nonzero_indx(k)-1});
  fclose(fid);
end

Luiz Morales on 2 Nov 2019
Hi Jan and Zhang,
I tried your code on the attached file. It creates the individual files, even with the names I would like to have, but there is nothing inside them. Essentially, I need the 4 columns of numbers (in the first case, from line 5 up to line 1004). Any thoughts?
Thanks a lot
Luiz
  1 Comment
Walter Roberson on 2 Nov 2019
S = fileread('TEX_PH1.txt');
tpos = regexp(S, '^TEXTURE AT STRAIN =', 'lineanchors');
splitS = mat2cell(S,1,diff([tpos,length(S)+1]));
Now splitS is a cell array of character vectors, with each character vector being one block that begins with 'TEXTURE AT STRAIN =' (including that text). You can now textscan() or otherwise process each block to extract the content you want.
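As a rough illustration of the textscan() step, here is a minimal sketch. It assumes every block has exactly 4 header lines followed by 4 numeric columns, which is only a guess based on the description above (numbers starting in line 5). The sample text in S stands in for fileread('TEX_PH1.txt') and is made up, as are the names nHeader, data and parsed.

```matlab
% Made-up stand-in for fileread('TEX_PH1.txt'): two blocks, each with
% 4 header lines followed by 4 numeric columns (assumed layout).
S = sprintf(['TEXTURE AT STRAIN = 0.1\nh2\nh3\nh4\n1 2 3 4\n5 6 7 8\n', ...
             'TEXTURE AT STRAIN = 0.2\nh2\nh3\nh4\n9 10 11 12\n']);

% Split the text into blocks at every line starting with the keyword.
% Note: this expects the text to begin with the keyword.
tpos = regexp(S, '^TEXTURE AT STRAIN =', 'lineanchors');
splitS = mat2cell(S, 1, diff([tpos, length(S)+1]));

nHeader = 4;                       % assumed number of header lines per block
data = cell(size(splitS));
for k = 1:numel(splitS)
    % Skip the header lines, then read 4 numeric columns into one matrix.
    parsed = textscan(splitS{k}, '%f %f %f %f', ...
                      'HeaderLines', nHeader, 'CollectOutput', true);
    data{k} = parsed{1};           % N-by-4 numeric matrix for block k
end
```

Each data{k} can then be written to its own file, for example with dlmwrite or writematrix, depending on the MATLAB release.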
