need help splitting a long series of data with date headers

I have a large text file with burst data for a wave gauge. The data is a single column with thousands of data points separated by a date stamp like so:
4/9/2013,10:00
14.3
14.5
13.2
4/9/2013,11:00
12.6
13.4
15.1
etc.
I need to find a way to separate each burst to its own variable and save it to file with the date and time as the name. Is there any simple way to do this?

Answers (1)

Cedric on 9 Apr 2013
Edited: Cedric on 10 Apr 2013
This question is trickier than it appears. The reason is that months, days, and (I guess) hours can be written with one or two digits, and I assume that the number of data points per burst can vary.
The standard approach is probably to use FGETL in a loop and test each line to determine whether it is a timestamp or a data point. The test could be based on LENGTH if numbers are really stored using no more than 12 characters, because the shortest possible timestamp (e.g. 1/1/2013,1:00) has 13. Here is an example:
fin = fopen('dataLuis.txt', 'r') ;          % Open input file (read mode).
fout = 0 ;                                  % 0 means no output file open yet.
while ~feof(fin)
    tline = fgetl(fin) ;                    % Read line (not named "line", which is a built-in).
    if ~ischar(tline), break ; end          % FGETL returns -1 at end of file.
    if isempty(tline), continue ; end       % Discard if empty line.
    if length(tline) > 12                   % Long line: timestamp.
        if fout > 0, fclose(fout) ; end     % Close previous file if open.
        fname = sprintf('%s.txt', tline) ;  % Build output file name.
        fname = strrep(fname, '/', '-') ;   % .. replace / and : that are
        fname = strrep(fname, ':', '') ;    %    invalid in filenames.
        fout = fopen(fname, 'w') ;          % Open output file (write mode).
    elseif fout > 0                         % Short line: data point.
        fprintf(fout, '%s\r\n', tline) ;    % Output data line.
    end
end
if fout > 0, fclose(fout) ; end             % Close last output file if open.
fclose(fin) ;
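If data lines could ever be longer than 12 characters, the length test in the loop above would take them for timestamps. A possible alternative, sketched here under the assumption that every timestamp looks like the sample in the question (one or two digits for month, day, and hour, a four-digit year, two digits for minutes), is to test each line against an anchored REGEXP pattern. The variable names and the sample lines are illustrative only.

```matlab
% Anchored pattern for d/d/yyyy,h:mm style timestamps (format assumed
% from the sample in the question, not confirmed for the whole file).
tsPattern = '^\d{1,2}/\d{1,2}/\d{4},\d{1,2}:\d{2}$' ;

% Illustrative lines: two timestamps, a short number, and a long number
% that a pure length test (> 12 characters) would misclassify.
sampleLines = {'4/9/2013,10:00', '14.3', '12/25/2013,9:30', '1234567890.12345'} ;

% With a cell array input and 'once', REGEXP returns one cell per line,
% holding the match start index or [] when there is no match.
isStamp = ~cellfun(@isempty, regexp(sampleLines, tsPattern, 'once')) ;
```

Inside the FGETL loop, the same test on a single line would be ~isempty(regexp(currentLine, tsPattern, 'once')), where currentLine stands for the line just read.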
A less standard approach is based on REGEXP, using a pattern that identifies timestamps and splits the content around matches. Here is an example:
buffer = fileread('dataLuis.txt') ;                        % Read whole file as text.
[m, s] = regexp(buffer, '[\w/,]*:..', 'match', 'split') ;  % m: timestamps, s: text around them.
for k = 1 : numel(m)
    fname = sprintf('%s.txt', m{k}) ;                      % Build output file name.
    fname = strrep(fname, '/', '-') ;
    fname = strrep(fname, ':', '') ;
    fout = fopen(fname, 'w') ;
    fwrite(fout, strtrim(s{k+1})) ;                        % s{k+1} is the data following m{k}.
    fclose(fout) ;
end
A more elaborate approach still based on REGEXP splits the file content into timestamps and data. Here is an example:
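buffer = fileread('dataLuis.txt') ;                        % Read whole file as text.
% Named tokens: a timestamp, then its data up to the next date or the end.
pattern = '(?<timestamp>[\w/,]*:..)\s*(?<data>.*?)(?=($|\s\d{1,2}/))' ;
content = regexp(buffer, pattern, 'names') ;               % Struct array, one element per burst.
for k = 1 : numel(content)
    fname = sprintf('%s.txt', content(k).timestamp) ;      % Build output file name.
    fname = strrep(fname, '/', '-') ;
    fname = strrep(fname, ':', '') ;
    fout = fopen(fname, 'w') ;
    fwrite(fout, content(k).data) ;                        % Write the burst as text.
    fclose(fout) ;
end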
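The examples above write each burst to its own file as text. If the bursts are also needed as numeric variables in the workspace, one option is to keep them in a struct array rather than in one variable per timestamp. The following is only a sketch: it reuses the pattern from the last example, replaces the file with a small in-memory sample copied from the question, and the names bursts, time, and values are made up for illustration.

```matlab
% In-memory sample standing in for fileread('dataLuis.txt').
buffer = sprintf('4/9/2013,10:00\n14.3\n14.5\n13.2\n4/9/2013,11:00\n12.6\n13.4\n15.1\n') ;

% Same pattern as in the last example: timestamp, then data up to the
% next date or the end of the text.
pattern = '(?<timestamp>[\w/,]*:..)\s*(?<data>.*?)(?=($|\s\d{1,2}/))' ;
content = regexp(buffer, pattern, 'names') ;

% One struct element per burst: timestamp kept as text, data as numbers.
bursts = struct('time', {}, 'values', {}) ;
for k = 1 : numel(content)
    bursts(k).time   = content(k).timestamp ;
    bursts(k).values = sscanf(content(k).data, '%f') ;  % Column vector of doubles.
end
```

After this, bursts(2).values holds the data points of the second burst, and something like mean(bursts(2).values) works directly on them.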
