Path: news.mathworks.com!not-for-mail
From: <HIDDEN>
Newsgroups: comp.soft-sys.matlab
Subject: Re: really big data files
Date: Mon, 9 Nov 2009 00:00:19 +0000 (UTC)
Organization: Macaulay Brown, Inc
Lines: 22
Message-ID: <hd7m2j$i00$1@fred.mathworks.com>
References: <hd75si$m75$1@fred.mathworks.com> <dfca570a-9e21-4622-bdea-69768c9d26b4@p8g2000yqb.googlegroups.com>
Reply-To: <HIDDEN>
NNTP-Posting-Host: webapp-03-blr.mathworks.com
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 8bit
X-Trace: fred.mathworks.com 1257724819 18432 172.30.248.38 (9 Nov 2009 00:00:19 GMT)
X-Complaints-To: news@mathworks.com
NNTP-Posting-Date: Mon, 9 Nov 2009 00:00:19 +0000 (UTC)
X-Newsreader: MATLAB Central Newsreader 1953538
Xref: news.mathworks.com comp.soft-sys.matlab:583441


Rune Allnor <allnor@tele.ntnu.no> wrote in message <dfca570a-9e21-4622-bdea-69768c9d26b4@p8g2000yqb.googlegroups.com>...
> On 8 Nov, 20:24, "Jon Shultz" <jjddshu...@yahoo.com> wrote:
> > I'm trying to read in a datafile that's really big (>2GB) in sections that are a couple hundred thousand lines long each.  I need to know how many lines are in the parent file first.
> >
> > I have a routine now that does it like this:
> > totlines=0;
> > while ~feof(fid)
> >     line=fgetl(fid);
> > ? ? totlines=totlines+1;
> > end
> >
> > This does well with the memory part, but takes forever.  There has got to be a more efficient way to do this, but I'm stuck.
> 
> Read the file in larger batches than a single line.
> 
> Rune

Thank you.  I am already using textscan to read the data blocks, in the code that follows the loop I posted above.  Let me restate my question: is there a way to determine the number of lines in a large file without reading in all of the data (which will crash MATLAB)?

I want to use the total number of lines to determine the best way to segment the files.  
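
For reference, the "larger batches" idea could look something like the sketch below: read raw bytes in fixed-size blocks with fread and count the line-feed characters, so that only one block is in memory at a time.  This is only a rough sketch, not tested on a real >2GB file.  It assumes lines end in LF or CRLF, and the function name countLines and the 16 MB block size are made up for illustration.

```matlab
function n = countLines(filename)
% countLines  Count lines by scanning the file in fixed-size byte blocks.
% Hypothetical helper for illustration; not part of MATLAB.
% Assumes lines are terminated by LF (10) or CRLF (13 10).
blockSize = 16 * 2^20;            % bytes per read (assumed; tune as needed)
fid = fopen(filename, 'r');
if fid == -1
    error('countLines:open', 'Could not open %s', filename);
end
closer = onCleanup(@() fclose(fid));   % close the file even on error

n = 0;
lastByte = 10;                    % so an empty file counts as zero lines
while true
    buf = fread(fid, blockSize, '*uint8');   % raw bytes, kept as uint8
    if isempty(buf)
        break;                    % nothing left to read
    end
    n = n + sum(buf == 10);       % count line feeds in this block
    lastByte = buf(end);
end

if lastByte ~= 10
    n = n + 1;                    % last line had no trailing newline
end
end
```

It would be called as totlines = countLines('bigfile.dat'), where the file name is just a placeholder.  Memory use stays near blockSize regardless of file size, and nothing is parsed, so it should avoid most of the per-line overhead of fgetl.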

Jon