Extracting multiple datasets from a text file

BlueSky on 30 Dec 2015
Commented: Stephen23 on 11 Mar 2020
I am trying to extract data from a text file in the format shown below. The first line (whatever characters it contains) should become the name of a structure. Each data set in the file is preceded by a description line, and part of that line should become the name of the corresponding sub-matrix within the structure (see the comments below). The order of the data sets can vary from one text file to another. The data sets are large (always m-by-3, though); I truncated them here for demonstration purposes and added comments as labels for clarification.
The output should be one structure containing five matrices, each holding a set of points.
SAW_KNEE_Name % all of this needs to be the name of the structure
::PrepStep_TTTgeneric_fe_ Ant-Chmf_nar_IA_CR_fem %if this dataset containts Ant-Chmf, then create a sub matric within structure with name 'Ant-Chmf'
-51.402984619140625 -46.14201354980469 -157.0766143798828
-3.12049627304077148 -0.78153657913208008 1.52892947196960449
::PrepStep_TTTgeneric_fe_ Ant_nar_IA_CR_fem %if this dataset containts Ant, then create a sub matric within structure with name 'Ant-Chmf'
-51.402984619140625 -46.14201354980469 -157.0766143798828
-3.12049627304077148 -0.78153657913208008 1.52892947196960449
::PrepStep_TTTgeneric_fe_ Pst_nar_IA_CR_fem %if this dataset containts Pst, then create a sub matric within structure with name 'Pst'
-51.402984619140625 -46.14201354980469 -157.0766143798828
-3.12049627304077148 -0.78153657913208008 1.52892947196960449
::PrepStep_TTTgeneric_fe_ Pst-Chmf_nar_IA_CR_fem %if this dataset containts Pst-Chmf, then create a sub matric within structure with name 'Pst-Chmf'
-51.402984619140625 -46.14201354980469 -157.0766143798828
-3.12049627304077148 -0.78153657913208008 1.52892947196960449
::PrepStep_TTTgeneric_fe_ Dst_nar_IA_CR_fem %if this dataset containts Dst, then create a sub matric within structure with name 'Dst'
-51.402984619140625 -46.14201354980469 -157.0766143798828
-3.12049627304077148 -0.78153657913208008 1.52892947196960449
  3 Comments
per isakson on 30 Dec 2015
Edited: per isakson on 30 Dec 2015
"containts Ant, then create a sub matric within structure with name 'Ant-Chmf'" Does this contain more than one typing mistake?


Answers (1)

per isakson on 31 Dec 2015
Edited: per isakson on 3 Jan 2016
"the following format" Your description is incomplete, so I need to make a number of assumptions:
  1. The entire file fits in memory together with the resulting structure.
  2. "::" appears only as the first and second characters of the "line description".
  3. The "line description" consists of two text strings separated by one space. The characters to the left of the first underscore in the second text string control the name of the structure field. ("-" is not allowed in Matlab names.)
  4. The comments displayed in the example don't exist in the data file.
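Assumption 3 can be illustrated on a single description line. This is only a sketch: the variable names `descr`, `key` and `field` are made up for the illustration, and stripping the hyphen with `strrep` is one possible way to get a legal field name, not the lookup table used in the function further down.

```matlab
% One description line, copied from the example in the question
descr = '::PrepStep_TTTgeneric_fe_ Ant-Chmf_nar_IA_CR_fem';

% First string, one space, then capture everything up to the first underscore
tok = regexp( descr, '^::\S+ ([^_]+)', 'tokens', 'once' );
key = tok{1};

% "-" is not allowed in a field name, so drop it (illustrative choice)
field = strrep( key, '-', '' );

fprintf( 'key = %s, field = %s, valid = %d\n', key, field, isvarname( field ) );
```

Removing the hyphen silently would merge keys that differ only by a hyphen, which is one reason an explicit table of allowed keys may be preferable.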
Here is one way to read the file. Try
>> sas = cssm( 'SAW_KNEE_Name.txt' )
sas =
       Name: 'SAW_KNEE_Name'
    AntChmf: [-51.4030 -46.1420 -157.0766]
        Ant: [3x3 double]
        Pst: [2x3 double]
    PstChmf: [2x3 double]
        Dst: [4x3 double]
where
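function sas = cssm( filespec )
%CSSM  Read a file of "::"-delimited m-by-3 data sets into a structure.
    % Map the key found in each description line to a valid field name
    cac = {
        'Ant-Chmf', 'AntChmf'
        'Ant'     , 'Ant'
        'Pst'     , 'Pst'
        'Pst-Chmf', 'PstChmf'
        'Dst'     , 'Dst'
        };
    field_names = containers.Map( cac(:,1), cac(:,2) );
    str = fileread( filespec );    % assumption 1: the file fits in memory
    % The first run of non-space characters is the structure name
    sas.Name = regexp( str, '^\S+', 'match', 'once' );
    % Split into blocks, each running from after "::" to the next "::" or the end
    buffer = regexp( str, '(?<=::).+?(?=(::)|$)', 'match' );
    for ca = buffer
        % Key: characters of the second string up to its first underscore
        key = regexp( ca{:}, '(?<=^\S+ )[^_]+', 'match', 'once' );
        val = field_names( key );
        % Skip the description line and read the remaining rows as m-by-3
        num = textscan( ca{:}, '%f%f%f', 'Headerlines',1, 'CollectOutput',true );
        sas.(val) = num{:};
    end
end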
and SAW_KNEE_Name.txt contains
SAW_KNEE_Name
::PrepStep_TTTgeneric_fe_ Ant-Chmf_nar_IA_CR_fem
-51.402984619140625 -46.14201354980469 -157.0766143798828
::PrepStep_TTTgeneric_fe_ Ant_nar_IA_CR_fem
-51.402984619140625 -46.14201354980469 -157.0766143798828
-3.12049627304077148 -0.78153657913208008 1.52892947196960449
-3.12049627304077148 -0.78153657913208008 1.52892947196960449
::PrepStep_TTTgeneric_fe_ Pst_nar_IA_CR_fem
-51.402984619140625 -46.14201354980469 -157.0766143798828
-3.12049627304077148 -0.78153657913208008 1.52892947196960449
::PrepStep_TTTgeneric_fe_ Pst-Chmf_nar_IA_CR_fem
-51.402984619140625 -46.14201354980469 -157.0766143798828
-3.12049627304077148 -0.78153657913208008 1.52892947196960449
::PrepStep_TTTgeneric_fe_ Dst_nar_IA_CR_fem
-51.402984619140625 -46.14201354980469 -157.0766143798828
-3.12049627304077148 -0.78153657913208008 1.52892947196960449
-3.12049627304077148 -0.78153657913208008 1.52892947196960449
-3.12049627304077148 -0.78153657913208008 1.52892947196960449
and finally
>> eval( [sas.Name, ' = sas;'] )
>> whos(sas.Name)
  Name               Size            Bytes  Class     Attributes
  SAW_KNEE_Name      1x1              1370  struct
which dpb wisely advised against. It really is asking for trouble to create variables this way.
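If the aim is only to keep the results of several files addressable by name, a dynamic field name in one container structure might do instead of eval. The following is a sketch under that assumption: `results` and `name` are hypothetical names, the `sas` here is a small stand-in for the output of cssm, and `matlab.lang.makeValidName` requires R2014a or later.

```matlab
% Stand-in for the structure returned by cssm; only the Name field matters here
sas = struct( 'Name', 'SAW_KNEE_Name', 'Dst', [1 2 3; 4 5 6] );

results = struct();                               % one container for all files
name    = matlab.lang.makeValidName( sas.Name );  % guard against illegal characters
results.(name) = sas;                             % dynamic field name, no eval

disp( fieldnames( results ) )
```

With this layout, looping over all files is a loop over `fieldnames( results )`, and no variable names are generated at run time.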
