Path: news.mathworks.com!not-for-mail
From: <HIDDEN>
Newsgroups: comp.soft-sys.matlab
Subject: Re: how to read text file row/column info
Date: Sat, 15 Mar 2008 19:49:02 +0000 (UTC)
Organization: The MathWorks, Inc.
Lines: 92
Message-ID: <frh97e$22f$1@fred.mathworks.com>
References: <frfjcc$it5$1@fred.mathworks.com> <frg668$im6$1@fred.mathworks.com> <frgs93$1ov$1@fred.mathworks.com>
Reply-To: <HIDDEN>



"Bruce Eddy" <sailboats@cfl.rr.com> wrote in message <frgs93$1ov$1@fred.mathworks.com>...
> "Pekka " <pekka.nospam.kumpulainen@tut.please.fi> wrote in message
> <frg668$im6$1@fred.mathworks.com>...
> > "Bruce Eddy" <sailboats@cfl.rr.com> wrote in message 
> > <frfjcc$it5$1@fred.mathworks.com>...
> > > Hi,
> > > I am trying to read a large .txt file with 5 columns and
> > > over 12000 rows of data.  I'm having trouble getting the
> > > (row, column) numbers right.  The code looks like this:
> > > 
> > > loop = 0;
> > > for i = 1:2
> > >     line1 = fgets(fid);
> > > end
> > > while feof(fid) == 0
> > >     loop = loop+1;
> > >     line1 = fgets(fid);
> > >     mass_prop(loop).name = line1(1, 1: 66);
> > >     mass_prop(loop).matl = line1(1,67: 102);
> > >     mass_prop(loop).volm = line1(1,103: 119);
> > >     mass_prop(loop).dens = line1(1,120:133);
> > >     mass_prop(loop).wght = line1(1,134:149);
> > > end
> > > 
> > > Can anyone help?
> > > Thanks.
> > 
> > It is really hard to help much without knowing the structure of
> > your file. Hard-coding the indexing is generally not a good
> > idea. It will fail if any field has a different length on any
> > of the lines.
> > 
> > Take a look at 
> > doc textscan 
> > that should read the entire file without loops and
> > hard-coded indexing.
> > 
> I thought about that after the post.  The file is too big
> to show, but it is a mix of text and numbers in a columnar
> form.  The first column is text with various indenting, the
> second column is text, and the last 3 columns are numbers.
> I've seen the limitation of the hard coding but didn't know
> of another way.  My goal is to have code that will read
> different text files of the same general format.  Thanks
> for the replies.
> 

Bruce,

Does your data resemble this?

Symbol, Exchange, Date, Time, Bid
CMGI, ARCA, 20040901, 9:31:33, 1.99
GE, ISLAND, 20040901, 9:31:34, 34.63
QQQQ, AUTO, 20040901, 9:31:35, 35.13

That is, do you have a mix of numeric and text columns of
the same length, with a header line and delimiters as shown
above?

If so, you can use autotdataread.m from the File Exchange.
It determines the type of each column and builds the format
string for you. It is also very fast because it calls a
native .dll directly. The data is placed in a structure of
mixed type, so you end up with:

data.Symbol
data.Exchange
data.Date...

..etc.
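
To give an idea of what that structure looks like, here is a minimal
sketch that builds the same kind of result with plain textscan. This is
not the autotdataread.m code: the file name quotes.txt is made up, and
the format string is written by hand for the sample above rather than
detected automatically.

```matlab
% Sketch only: read a comma-delimited file with one header line into a
% struct whose field names come from the header line.
% Assumptions: file name 'quotes.txt' (hypothetical) and a hand-written
% format string that matches the sample data shown above.
fname = 'quotes.txt';

% Write the sample data first so the example is self-contained.
fid = fopen(fname, 'w');
fprintf(fid, 'Symbol, Exchange, Date, Time, Bid\n');
fprintf(fid, 'CMGI, ARCA, 20040901, 9:31:33, 1.99\n');
fprintf(fid, 'GE, ISLAND, 20040901, 9:31:34, 34.63\n');
fprintf(fid, 'QQQQ, AUTO, 20040901, 9:31:35, 35.13\n');
fclose(fid);

fid = fopen(fname, 'r');
header = fgetl(fid);                             % first line holds the column names
names  = strtrim(regexp(header, ',', 'split'));  % split on commas, trim blanks

% %s = text column, %f = numeric column. textscan continues reading
% from the current file position, i.e. right after the header line.
cols = textscan(fid, '%s %s %f %s %f', 'Delimiter', ',');
fclose(fid);

% One struct field per column, named after the header entry.
data = struct();
for k = 1:numel(names)
    data.(names{k}) = cols{k};
end
```

Text columns come back as cell arrays and numeric columns as double
vectors, so you would index them as data.Symbol{1} and data.Bid(1).
Note that the header names must be valid MATLAB field names for the
dynamic field assignment to work.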

It is not very elegant code, since it was literally the
first m-file I ever wrote, but it has worked for years
without a revision and is used in the financial,
engineering and medical communities. It's been tested up to
1.2 GB ASCII file size (with the 3GB switch enabled on
32-bit stand-alone systems).

hth,
Scott