Split text file to smaller files

I have a text file built up like this:
$header1
a b c d
e f g h
i j k l
$header2
1 b 2 a
3 c 4 d
..
I would like to split the file into smaller files, where each file contains the information from one '$' to the next '$'. How would I go about doing this? The sections do not have a fixed number of lines, by the way.

Accepted Answer

Nicolai NoName on 3 Jan 2018
Thanks for your answers. I ended up getting help from a friend and the result we ended up with was:
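data = 'C:\somewhere\somefile.txt';
FID = fopen(data,'rt');
numberoflines = 100000;          % upper bound on the number of lines read
for i = 1:numberoflines
    l = fgetl(FID);
    if ~ischar(l), break; end    % fgetl returns -1 at end of file
    if strcmp(l(1),'$')
        target = l(2:end);       % header text without the '$' is the file name
        l = fgetl(FID);          % first data line of the new section
        exist FID2;
        if true(ans)
            fclose(FID2);
        end
        FID2 = fopen(target,'w');
    end
    fprintf(FID2,[l '\n']);
end
fclose('all');                   % close the input and the last output file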
This solution saved the individual files with names corresponding to the header.
  2 Comments
Guillaume on 3 Jan 2018
Edited: Guillaume on 3 Jan 2018
Glad you've got something that works for you. However, that is some badly cobbled together code. In particular, the lines
exist FID2
if true(ans)
should be replaced by
if exist('FID2')
but the whole concept of relying on the existence of a variable to know if a file is open is very iffy. Similarly, using a magic constant for the number of lines is just asking for trouble.
Also, since you're opening the file for reading using 'rt', you should open the destination files using 'wt'. Otherwise, you may end up with different line endings (which may or may not be a problem, depending on which program you then use with the files).
See my edited answer for a cleaner and faster way to do exactly what you want.
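The points raised in this comment (no reliance on exist, no magic line count, text mode on both ends) can be sketched as below. This is only an illustration, not the code from the edited answer: the scratch folder, the sample content, and the variable names fin, fout, and tline are assumptions made for the example.

```matlab
% Sketch only: scratch folder and sample content are placeholders.
workdir = tempname;                 % hypothetical scratch folder
mkdir(workdir);
infile = fullfile(workdir, 'somefile.txt');

% Write a small sample input with the layout shown in the question.
fid = fopen(infile, 'wt');
fprintf(fid, '$header1\na b c d\ne f g h\n$header2\n1 b 2 a\n');
fclose(fid);

fin  = fopen(infile, 'rt');
fout = -1;                          % -1 means "no output file open yet"
while true
    tline = fgetl(fin);
    if ~ischar(tline)               % fgetl returns -1 at end of file
        break;
    end
    if strncmp(tline, '$', 1)       % header line: switch to a new output file
        if fout ~= -1
            fclose(fout);
        end
        fout = fopen(fullfile(workdir, tline(2:end)), 'wt');
    elseif fout ~= -1               % data line: copy it to the current output
        fprintf(fout, '%s\n', tline);
    end
end
if fout ~= -1
    fclose(fout);
end
fclose(fin);
```

Tracking the output handle explicitly (with -1 as the "not open" value) replaces the exist check, and looping until fgetl stops returning text replaces the fixed line count. Lines that appear before the first header are skipped rather than causing an error.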
MANISH R on 29 Sep 2022
@Nicolai NoName Will this code work if there are two $ symbols instead of one? I tried it with two $ symbols and got the error "Unrecognized function or variable 'Fid2'" at the line fprintf(Fid2,[l '\n']);
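The thread does not settle the cause of this error. Two things are worth checking: MATLAB variable names are case-sensitive, so Fid2 and FID2 are different variables, and a file whose first line does not start with '$' reaches the fprintf call before any output file has been opened. For headers with more than one leading '$', a minimal sketch of the header test might look like this; the sample line and the names tline, tok, and target are assumptions for illustration.

```matlab
tline = '$$header1';                 % hypothetical header with two '$' signs

% '^\$+' matches one or more literal '$' at the start of the line;
% the token '(.*)' captures everything after them.
tok = regexp(tline, '^\$+(.*)', 'tokens', 'once');

isheader = ~isempty(tok);            % no match means an ordinary data line
if isheader
    target = strtrim(tok{1});        % file name without any leading '$'
end
```

With this test in place of strcmp(l(1),'$'), a header written as $header1 or $$header1 yields the same file name.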


More Answers (2)

Guillaume on 3 Jan 2018
Edited: Guillaume on 3 Jan 2018
Personally, I wouldn't bother with parsing the file line by line. Just read it in one go, then split it at each $:
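wholefile = fileread('C:\somewhere\somefile.txt');
% each match runs from one '$' up to (not including) the next one;
% the '$' must be escaped, since a bare $ is the end-of-string anchor
splitfiles = regexp(wholefile, '\$[^$]+', 'match');
% the text after '$' on the header line becomes the file name
destnames = regexp(splitfiles, '(?<=\$)[^\r\n]+', 'match', 'once');
for fileidx = 1:numel(splitfiles)
    fid = fopen(fullfile('C:\somewhere', destnames{fileidx}), 'w');
    fwrite(fid, splitfiles{fileidx});
    fclose(fid);
end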
edited to extract destination name from $ expression
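To see what the two regular expressions produce, here is a small sketch that runs them on an in-memory string instead of a file. The sample text is an assumption mirroring the layout in the question, and the dollar sign is escaped as \$ so that regexp treats it as a literal character rather than as the end-of-string anchor.

```matlab
% Sample text with the layout from the question (sprintf expands the \n).
wholefile = sprintf('$header1\na b c d\ne f g h\n$header2\n1 b 2 a\n');

% One match per section: a literal '$' followed by everything up to the next '$'.
splitfiles = regexp(wholefile, '\$[^$]+', 'match');

% For each section, the look-behind skips the '$' and the match stops at the
% first line break, leaving just the header text.
destnames = regexp(splitfiles, '(?<=\$)[^\r\n]+', 'match', 'once');

for k = 1:numel(splitfiles)
    fprintf('section %d -> file name "%s", %d characters\n', ...
        k, destnames{k}, numel(splitfiles{k}));
end
```

Note that each element of splitfiles still begins with its header line, so the written files include the $ line itself.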

KSSV on 3 Jan 2018
fid = fopen('data.txt','r') ;
S = textscan(fid,'%s','delimiter','\n') ;
fclose(fid) ;
S = S{1} ;
% find $ location
idx = find(contains(S,'$')) ;
N = length(idx) ;
iwant = cell(N,1) ;
for i = 1:N-1
    % lines between this header and the next one
    iwant{i} = S(idx(i)+1:idx(i+1)-1) ;
end
% the last section runs to the end of the file
iwant{N} = S(idx(N)+1:end) ;
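This answer collects the sections in a cell array but stops short of writing them to disk. A minimal sketch of that last step, assuming the same layout for S (one cell per line), might be the following. The sample lines, the scratch folder outdir, and the names bounds, block, and name are placeholders for illustration, and startsWith is used here instead of contains so that only lines beginning with '$' count as headers.

```matlab
% Stand-in for the lines read with textscan (one cell per line).
S = {'$header1'; 'a b c d'; 'e f g h'; '$header2'; '1 b 2 a'};

idx = find(startsWith(S, '$'));      % rows holding a header
bounds = [idx; numel(S) + 1];        % sentinel so the last section needs no special case

outdir = tempname;                   % hypothetical output folder
mkdir(outdir);

for k = 1:numel(idx)
    block = S(bounds(k)+1 : bounds(k+1)-1);   % lines of section k
    name  = S{idx(k)}(2:end);                 % header text without the '$'
    fid = fopen(fullfile(outdir, name), 'wt');
    fprintf(fid, '%s\n', block{:});           % format is reused for every line
    fclose(fid);
end
```

Appending one extra boundary past the last line lets the loop treat every section the same way, which avoids the separate assignment for the final section.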
