Thread Subject:
how to read text file row/column info

Subject: how to read text file row/column info

From: Bruce Eddy

Date: 15 Mar, 2008 04:30:04

Message: 1 of 7

Hi,
I am trying to read a large .txt file with 5 columns and
over 12000 rows of data. I'm having trouble getting the
(row, column) numbers right. The code looks like this:

loop = 0;
for i = 1:2
    line1 = fgets(fid);
end
while feof(fid) == 0
    loop = loop+1;
    line1 = fgets(fid);
    mass_prop(loop).name = line1(1, 1: 66);
    mass_prop(loop).matl = line1(1,67: 102);
    mass_prop(loop).volm = line1(1,103: 119);
    mass_prop(loop).dens = line1(1,120:133);
    mass_prop(loop).wght = line1(1,134:149);
end

Can anyone help?
Thanks.

Subject: how to read text file row/column info

From: Pekka

Date: 15 Mar, 2008 09:51:04

Message: 2 of 7

"Bruce Eddy" <sailboats@cfl.rr.com> wrote in message
<frfjcc$it5$1@fred.mathworks.com>...
> Hi,
> I am trying to read a large .txt file with 5 columns and
> over 12000 rows of data. I'm having trouble getting the
> (row, column) numbers right.
> [x]

Really hard to help much without knowing the structure of
your file. Hard coding the indexing is generally not a good
idea. That will fail if any of the lines has different
length in any of the fields.

Take a look at
doc textscan
that should read the entire file without loops and hard
coded indexing.
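
A minimal sketch of what a single textscan call could look like. Everything about the file here is an assumption, since the real layout had not been posted at this point: two header lines (suggested by the two fgets calls before the loop in the original code), five whitespace-separated columns, and no spaces inside any field. The file name demo_mass.txt and its contents are made up for illustration.

```matlab
% Write a small stand-in file (hypothetical name and contents).
fname = fullfile(tempdir, 'demo_mass.txt');
fid = fopen(fname, 'w');
fprintf(fid, 'header line 1\nheader line 2\n');
fprintf(fid, 'bolt.prt steel 0.5 0.283 0.1415\n');
fprintf(fid, 'plate.prt alum 2.0 0.101 0.2020\n');
fclose(fid);

% Read the whole file in one call: two string columns, three numeric ones.
fid = fopen(fname, 'r');
C = textscan(fid, '%s %s %f %f %f', 'HeaderLines', 2);
fclose(fid);
delete(fname);

name = C{1};   % cell array of strings, one entry per row
matl = C{2};
volm = C{3};   % numeric column vectors
dens = C{4};
wght = C{5};
```

With whitespace as the delimiter this only works when no field contains a space; a field such as a two-word material name would need a different delimiter or a different parsing strategy.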

Subject: how to read text file row/column info

From: Andres Toennesmann

Date: 15 Mar, 2008 12:32:08

Message: 3 of 7

"Bruce Eddy" <sailboats@cfl.rr.com> wrote in message
<frfjcc$it5$1@fred.mathworks.com>...
> Hi,
> I am trying to read a large .txt file with 5 columns and
> over 12000 rows of data. I'm having trouble getting the
> (row, column) numbers right.
> [x]

Hi Bruce,

if you want to import numeric data only, (my) txt2mat from
the File Exchange should do it easily. Maybe
A = txt2mat('c:\data\myfile.txt');
without any further options will work, if your file has a
fairly ordinary structure. (You'll assign the columns of A to
your variables afterwards.)

So much for advertising. If it is mixed data (numbers and
strings), I recommend becoming acquainted with textscan, too.

Subject: how to read text file row/column info

From: Bruce Eddy

Date: 15 Mar, 2008 16:08:03

Message: 4 of 7

"Pekka " <pekka.nospam.kumpulainen@tut.please.fi> wrote in
message <frg668$im6$1@fred.mathworks.com>...
> "Bruce Eddy" <sailboats@cfl.rr.com> wrote in message
> <frfjcc$it5$1@fred.mathworks.com>...
> > [x]
>
> Really hard to help much without knowing the structure of
> your file. Hard coding the indexing is generally not a good
> idea. That will fail if any of the lines has different
> length in any of the fields.
>
> Take a look at
> doc textscan
> that should read the entire file without loops and hard
> coded indexing.
>
I thought about that after the post. The file is too big
to show but it is a mix of text and numbers in a columnated
form. The first column is text with various indenting, the
second column is text, and the last 3 columns are numbers.
I've seen the limitation of the hard coding but didn't know
of another way. My goal is to have a code that will read
different text files of the same general format. Thanks
for the replies.

Subject: how to read text file row/column info

From: Scott Burnside

Date: 15 Mar, 2008 19:49:02

Message: 5 of 7

"Bruce Eddy" <sailboats@cfl.rr.com> wrote in message <frgs93
$1ov$1@fred.mathworks.com>...
> [x]
> I thought about that after the post. The file is too big
> to show but it is a mix of text and numbers in a columnated
> form. The first column is text with various indenting, the
> second column is text, and the last 3 columns are numbers.
> I've seen the limitation of the hard coding but didn't know
> of another way. My goal is to have a code that will read
> different text files of the same general format. Thanks
> for the replies.
>

Bruce,

Does your data resemble this?:

Symbol, Exchange, Date, Time, Bid
CMGI, ARCA, 20040901, 9:31:33, 1.99
GE, ISLAND, 20040901, 9:31:34, 34.63
QQQQ, AUTO, 20040901, 9:31:35, 35.13

that is, do you have a mix of numeric and text columns of
the same length with a header line and delimiters as shown
above?

If so you can use autotdataread.m from the File Exchange.
It will determine the column type and build the format
string for you. It also is very fast because it calls a
native .dll directly. The data is placed in a structure of
mixed type so you end up with:

data.Symbol
data.Exchange
data.Date...

..etc.

It is not very elegant code since it was literally the
first m-file I ever wrote but it has worked for years
without a revision and is used in the financial,
engineering and medical communities. It's been tested up to
1.2 GB ASCII file size (with the 3GB switch enabled on
32-bit stand-alone systems).

hth,
Scott
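
For comparison, a similar struct of named columns can be sketched with built-in functions only. This is not how the File Exchange submission works internally; it is just an illustration using textscan on the comma-delimited sample above. The column types in the format string are assumed from that sample rather than detected automatically.

```matlab
% The sample from above, held in a string so the sketch is self-contained.
txt = sprintf('%s\n', ...
    'Symbol, Exchange, Date, Time, Bid', ...
    'CMGI, ARCA, 20040901, 9:31:33, 1.99', ...
    'GE, ISLAND, 20040901, 9:31:34, 34.63', ...
    'QQQQ, AUTO, 20040901, 9:31:35, 35.13');

% Split off the header line and use its entries as field names.
nl = find(txt == sprintf('\n'), 1);
h = textscan(txt(1:nl-1), '%s', 'Delimiter', ',');
names = strtrim(h{1});

% Parse the data lines; the column types are an assumption.
C = textscan(txt(nl+1:end), '%s %s %f %s %f', 'Delimiter', ',');

% Build the struct with dynamic field names.
data = struct();
for k = 1:numel(names)
    col = C{k};
    if iscell(col)
        col = strtrim(col);   % drop any blanks left around text fields
    end
    data.(names{k}) = col;
end
```

After this, data.Symbol and data.Exchange are cell arrays of strings, while data.Date and data.Bid are numeric vectors.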

Subject: how to read text file row/column info

From: Arthur G

Date: 15 Mar, 2008 19:48:37

Message: 6 of 7

On 2008-03-15 12:08:03 -0400, "Bruce Eddy" <sailboats@cfl.rr.com> said:

> [x]
> I thought about that after the post. The file is too big
> to show but it is a mix of text and numbers in a columnated
> form. The first column is text with various indenting, the
> second column is text, and the last 3 columns are numbers.
> I've seen the limitation of the hard coding but didn't know
> of another way. My goal is to have a code that will read
> different text files of the same general format. Thanks
> for the replies.

I'm not sure what you mean by "various indenting", and that detail
could potentially make things more complicated. But one approach I've
used is to read a line at a time using fgetl, use textscan to parse
each column into strings (specifying the "Delimiter" as whatever
separates your columns), and then store everything in a cell.
Afterward, I then try to convert every element of the cell into a
number (using sscanf seems fastest), and replace the string with a
number if the conversion is successful. That seems to be my best
compromise between speed and flexibility.

--Arthur
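
The approach described above might look roughly like the following. This is only a sketch under assumptions: the delimiter is taken to be a comma, every line is assumed to have the same number of fields, and the file name demo_lines.txt and its contents are invented for the example.

```matlab
% Write a small stand-in file (hypothetical name and contents).
fname = fullfile(tempdir, 'demo_lines.txt');
fid = fopen(fname, 'w');
fprintf(fid, 'bolt.prt,NOT ASSIGNED,0.5,0.283\n');
fprintf(fid, 'plate.prt,alum 6061,2.0,0.101\n');
fclose(fid);

rows = {};
fid = fopen(fname, 'r');
tline = fgetl(fid);               % fgetl returns -1 (not char) at end of file
while ischar(tline)
    % Split the line into strings at the delimiter.
    parts = textscan(tline, '%s', 'Delimiter', ',');
    fields = strtrim(parts{1})';  % 1-by-N cell of strings
    for k = 1:numel(fields)
        [val, count, errmsg] = sscanf(fields{k}, '%f');
        % Replace the string only when the whole field is a single number.
        if count == 1 && isempty(errmsg)
            fields{k} = val;
        end
    end
    rows(end+1, :) = fields;      % assumes equal field counts on every line
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);
```

Checking errmsg as well as count keeps a field such as '6061 alum' as text instead of silently turning it into the number 6061.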

Subject: how to read text file row/column info

From: Bruce

Date: 16 Mar, 2008 00:45:05

Message: 7 of 7

Arthur G <gorramfreak+news@gmail.com> wrote in message
<47dc2815$0$289$b45e6eb0@senator-bedfellow.mit.edu>...
> [x]
> I'm not sure what you mean by "various indenting", and that detail
> could potentially make things more complicated. But one approach I've
> used is to read a line at a time using fgetl, use textscan to parse
> each column into strings (specifying the "Delimiter" as whatever
> separates your columns), and then store everything in a cell.
> Afterward, I then try to convert every element of the cell into a
> number (using sscanf seems fastest), and replace the string with a
> number if the conversion is successful. That seems to be my best
> compromise between speed and flexibility.
>
> --Arthur
>


Here is a sample of the file I'm trying to read; maybe it
will make more sense. The info is separated into columns
with spaces between, no tabs or commas. The first column
is indented in groups (af's) and there are no spaces in the
text. The second column is text that has spaces in the
phrase, which complicates it some. The other columns are
all numbers. Ultimately I need to read several files like
this that get pretty large, so I'm hoping to come up with
code that will fit all sizes. After reading the file, the
goal is to process it to eliminate duplicates, replacing
them with a single line and a quantity instead. Next I
need to sum one of the columns for all rows containing .prt.
Hopefully this is doable and not too ambitious.

Thanks for the inputs.

af_e_top_TIP_CURVE_WASHER.PRT NOT ASSIGNED 0.003586 0.286000 0.001091
   af_e_RC_TIP_DBLR_STRIP.PRT NOT ASSIGNED 0.040392 0.101000 0.004080
   af_e_top_TIP_CARD_CLVS.PRT NOT ASSIGNED 0.014307 0.101000 0.001443
   af_e_RFRST_RC_LWR_FTG_BALL_AS.ASM NOT ASSIGNED 4.231699 1.000000 0.438248
     af_e_RFRST_RC_LWR_FTG_AS.ASM NOT ASSIGNED 4.203855 1.000000 0.430540
       MU_R_RFRST_RC_LWR_FTG.PRT NOT ASSIGNED 4.171692 0.101000 0.437095
       HDW_MS51830_203.PRT NOT ASSIGNED 0.032163 0.286000 0.004314
     dt_s_RFRST_TETHER_BAL_af.ASM NOT ASSIGNED 0.026765 1.000000 0.007515
       dt_s_RFRST_TETHER_BALL_SK.PRT NOT ASSIGNED 0.000000 1.000000 0.000000
       dt_s_RFRST_TETHER_BALL.PRT NOT ASSIGNED 0.026288 0.280000 0.007361
       dt_SCD_RFRST_WIRE_ROP_SHT.PRT NOT ASSIGNED 0.000477 0.286000 0.000154
     AF_S_RFRST_T_BALL_SPR_FER.PRT NOT ASSIGNED 0.000516 0.098000 0.000051
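
One possible way to handle a layout like this sample is to parse each record from both ends: the name is the first run of non-blank characters, the last three tokens are the numbers, and whatever sits in between is the second column, spaces included. The sketch below uses a few records from the sample with the wrapped lines rejoined onto one line each, plus one repeated record added purely to demonstrate the duplicate count. It assumes that duplicates are identified by name alone and that the column to be summed is the last one; neither is stated in the thread.

```matlab
rawLines = { ...
    'af_e_top_TIP_CURVE_WASHER.PRT NOT ASSIGNED 0.003586 0.286000 0.001091', ...
    '   af_e_RC_TIP_DBLR_STRIP.PRT NOT ASSIGNED 0.040392 0.101000 0.004080', ...
    '     af_e_RFRST_RC_LWR_FTG_AS.ASM NOT ASSIGNED 4.203855 1.000000 0.430540', ...
    '       HDW_MS51830_203.PRT NOT ASSIGNED 0.032163 0.286000 0.004314', ...
    '       HDW_MS51830_203.PRT NOT ASSIGNED 0.032163 0.286000 0.004314'}; % added duplicate

% name, free text (may contain spaces), then three trailing tokens
pat = '^(\S+)\s+(.*?)\s+(\S+)\s+(\S+)\s+(\S+)$';
n = numel(rawLines);
indent = zeros(n, 1);
name = cell(n, 1);
matl = cell(n, 1);
nums = zeros(n, 3);
for k = 1:n
    indent(k) = find(~isspace(rawLines{k}), 1) - 1;   % leading blanks
    tok = regexp(strtrim(rawLines{k}), pat, 'tokens', 'once');
    name{k} = tok{1};
    matl{k} = tok{2};
    nums(k, :) = str2double(tok(3:5));
end

% Collapse duplicates (by name only, an assumption) and count them.
[uname, idx, grp] = unique(name);
qty = accumarray(grp(:), 1);

% Sum the last numeric column over rows whose name ends in .prt (any case).
isPrt = ~cellfun('isempty', regexpi(name, '\.prt$', 'once'));
totalPrt = sum(nums(isPrt, 3));
```

Here uname holds each distinct name once, qty(i) is how many times uname{i} occurred, and nums(idx, :) gives one set of numeric values per distinct name. The indent vector keeps the nesting level in case the assembly hierarchy matters later.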
