Read text file line by line, and then store the information into a struct

Hi everyone,
I am trying to read a file that looks something like this:
Data Sampling Rate: 256 Hz
***********************
Channels in EDF Files:
********************
Channel 1: FP1-F7
Channel 2: F7-T7
Channel 3: T7-P7
Channel 4: P7-O1
File Name: chb01_02.edf
File Start Time: 12:42:57
File End Time: 13:42:57
Number of Seizures in File: 0
File Name: chb01_03.edf
File Start Time: 13:43:04
File End Time: 14:43:04
Number of Seizures in File: 1
Seizure Start Time: 2996 seconds
Seizure End Time: 3036 seconds
So far I have:
fid1= fopen('chb01-summary.txt')
data=struct('id',{},'stime',{},'etime',{},'seizenum',{},'sseize',{},'eseize',{});
if fid1 ==-1
error('File cannot be opened ')
end
tline= fgetl(fid1);
while ischar(tline)
i=1;
disp(tline);
I want to use regexp to find the expressions, and so far I have:
line1= '(.*\d{2} (\.edf)'
data{1} = regexp(tline, line1);
tline=fgetl(fid1);
time = '^Time: .*\d{2]}: \d{2} :\d{2}' ;
data{2}= regexp(tline,time);
tline=getl(fid1);
seizure = '^File: .*\d';
data{4}= regexp(tline,seizure);
if data{4}>0
stime = '^Time: .*\d{5}';
tline=getl(fid1);
data{5}= regexp(tline,seizure);
tline= getl(fid1);
data{6}= regexp(tline,seizure);
end
I also tried using a loop to find the line where the file name starts:
if true
    for (firstline<1) (firstline>1 )
        firstline= strfind(tline, 'File Name')
        tline=fgetl(fid1);
    end
end
and then I am stumped. Say I am at the line where the information is: how do I store the information with regexp? I got data = [] [] after running the code once...
Thanks in advance..

Accepted Answer

Cedric on 23 Jul 2013
Edited: Cedric on 23 Jul 2013
I would go for a solution close to what you implemented, but splitting the file content first into blocks of data. For example:
buffer = fileread('chb01-summary.txt') ;
% Split at each 'File Name' marker: one block per EDF file.
blocks = regexp(buffer, 'File Name', 'split') ;
if length(blocks) < 2, error('No content found in file.') ; end
blocks = blocks(2:end) ; % 1st block is header.
nBlocks = length(blocks) ;
results = cell(nBlocks, 1) ;
for bId = 1 : nBlocks
    % Name without extension: non-space characters followed by '.edf'.
    results{bId}.fileName = regexp(blocks{bId}, '\S+(?=\.edf)', 'match') ;
    % Look-behinds keep only the value that follows each label.
    results{bId}.fileStartTime = regexp(blocks{bId}, ...
        '(?<=File Start Time:\s*)\S+', 'match') ;
    results{bId}.fileEndTime = regexp(blocks{bId}, ...
        '(?<=File End Time:\s*)\S+', 'match') ;
    results{bId}.nSeizures = str2double( regexp(blocks{bId}, ...
        '(?<=in File:\s*)\d+', 'match') ) ;
    if results{bId}.nSeizures > 0
        % Named tokens give a struct array with one element per seizure.
        results{bId}.seizures = regexp(blocks{bId}, ...
            'Seizure Start Time: (?<startTime>\d+).+?Seizure End Time: (?<endTime>\d+)', 'names') ;
    end
end
With that, you get:
>> results
results =
[1x1 struct]
[1x1 struct]
>> results{1}
ans =
fileName: {'chb01_02'}
fileStartTime: {'12:42:57'}
fileEndTime: {'13:42:57'}
nSeizures: 0
>> results{2}
ans =
fileName: {'chb01_03'}
fileStartTime: {'13:43:04'}
fileEndTime: {'14:43:04'}
nSeizures: 1
seizures: [1x1 struct]
>> results{2}.seizures
ans =
startTime: '2996'
endTime: '3036'
and what is left is probably a few conversions to numeric for relevant times.
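As a rough sketch of those conversions (not part of the original code): the clock times are strings of the form HH:MM:SS, so one option is to split them on ':' and weight the three parts, while the seizure times are plain digit strings that str2double handles directly. The variable names below and the toSeconds helper are made up for illustration, and the values are copied from the output shown above. The sketch assumes the recording does not cross midnight.

```matlab
% Hypothetical stand-ins for the fields of results{2} shown above.
fileStartTime = {'13:43:04'} ;
fileEndTime   = {'14:43:04'} ;
seizure.startTime = '2996' ;
seizure.endTime   = '3036' ;

% Illustrative helper: split 'HH:MM:SS' on ':' and weight the parts.
toSeconds = @(t) sum( str2double(regexp(t, ':', 'split')) .* [3600, 60, 1] ) ;

startSec = toSeconds(fileStartTime{1}) ;   % seconds since midnight
endSec   = toSeconds(fileEndTime{1}) ;
duration = endSec - startSec ;             % recording length in seconds

% Seizure times are plain digit strings, so str2double is enough.
seizureStart = str2double(seizure.startTime) ;
seizureEnd   = str2double(seizure.endTime) ;
```

If a recording can run past midnight, the subtraction would need an extra correction (for instance adding 24*3600 when the result is negative).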
Note that results{k}.seizures is a struct array, so if the 2nd entry in your file had been
File Name: chb01_03.edf
File Start Time: 13:43:04
File End Time: 14:43:04
Number of Seizures in File: 2
Seizure Start Time: 2996 seconds
Seizure End Time: 3036 seconds
Seizure Start Time: 2997 seconds
Seizure End Time: 3037 seconds
seizure times would be accessible through:
>> results{2}.seizures
ans =
1x2 struct array with fields:
startTime
endTime
>> results{2}.seizures(1)
ans =
startTime: '2996'
endTime: '3036'
>> results{2}.seizures(2)
ans =
startTime: '2997'
endTime: '3037'
>> results{2}.seizures(2).startTime
ans =
2997
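To work with all seizures of one file at once rather than indexing them one by one, something along these lines should do. It is only a sketch: the seizures variable below is a hand-built stand-in for results{2}.seizures from the two-seizure example, and the other names are made up for illustration.

```matlab
% Stand-in for results{2}.seizures from the two-seizure example above.
seizures = struct('startTime', {'2996', '2997'}, ...
                  'endTime',   {'3036', '3037'}) ;

% {seizures.startTime} collects that field from every element into a cell
% array, which str2double converts to a numeric row vector in one call.
startTimes = str2double({seizures.startTime}) ;
endTimes   = str2double({seizures.endTime}) ;
durations  = endTimes - startTimes ;

for sId = 1 : numel(seizures)
    fprintf('Seizure %d: %d s to %d s (%d s long)\n', ...
        sId, startTimes(sId), endTimes(sId), durations(sId)) ;
end
```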
EDIT: note that the regexp inside the IF statement in the FOR loop is extracting data using named tokens. I did that for the example, but you could also go for a more common approach, e.g.
if results{bId}.nSeizures > 0
    startTimes = regexp(blocks{bId}, '(?<=Seizure Start Time:\s*)\d+', 'match') ;
    endTimes = regexp(blocks{bId}, '(?<=Seizure End Time:\s*)\d+', 'match') ;
    results{bId}.seizures = str2double([startTimes; endTimes].') ;
end
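A self-contained way to try this second approach without the summary file is sketched below. The block string is a made-up stand-in for blocks{bId}, and I swapped the look-behinds for plain capture groups with the 'tokens' option, which is a different technique from the one above but gives the same numbers. It also shows why a bare regexp call does not give you text back: with no option, it returns the start indices of the matches.

```matlab
% Hypothetical in-memory block standing in for blocks{bId}.
block = sprintf(['Number of Seizures in File: 2\n', ...
                 'Seizure Start Time: 2996 seconds\n', ...
                 'Seizure End Time: 3036 seconds\n', ...
                 'Seizure Start Time: 2997 seconds\n', ...
                 'Seizure End Time: 3037 seconds\n']) ;

% With no option, regexp returns start indices, not the matched text.
idx = regexp(block, 'Seizure Start Time') ;

% 'tokens' returns the text captured by the parentheses, one cell per match.
startTok = regexp(block, 'Seizure Start Time:\s*(\d+)', 'tokens') ;
endTok   = regexp(block, 'Seizure End Time:\s*(\d+)', 'tokens') ;

% Flatten the nested cells, then build an N-by-2 numeric array [start, end].
seizures  = str2double([ [startTok{:}] ; [endTok{:}] ].') ;
durations = seizures(:,2) - seizures(:,1) ;
```

With a numeric N-by-2 array, each row is one seizure, so durations or any other arithmetic need no further string conversion.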
