Path: news.mathworks.com!not-for-mail
From: <HIDDEN>
Newsgroups: comp.soft-sys.matlab
Subject: Re: how to save time loading data from file
Date: Tue, 5 Aug 2008 07:43:06 +0000 (UTC)
Organization: University of Memphis
Lines: 93
Message-ID: <g790aa$e6n$1@fred.mathworks.com>
References: <g75abr$40d$1@fred.mathworks.com> <4b983eef-8f6c-4eb8-87fd-961d5f24413d@w7g2000hsa.googlegroups.com>  <eab1c3ca-3e73-4e43-8325-b49adb4dcec0@p31g2000prf.googlegroups.com>
Reply-To: <HIDDEN>
NNTP-Posting-Host: webapp-03-blr.mathworks.com
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 8bit
X-Trace: fred.mathworks.com 1217922186 14551 172.30.248.38 (5 Aug 2008 07:43:06 GMT)
X-Complaints-To: news@mathworks.com
NNTP-Posting-Date: Tue, 5 Aug 2008 07:43:06 +0000 (UTC)
X-Newsreader: MATLAB Central Newsreader 233092
Xref: news.mathworks.com comp.soft-sys.matlab:483659



NZTideMan <mulgor@gmail.com> wrote in message
<eab1c3ca-3e73-4e43-8325-b49adb4dcec0@p31g2000prf.googlegroups.com>...
> On Aug 5, 6:59 pm, "Paul " <p...@ceri.memphis.edu> wrote:
> > "per isakson" <poi.nos...@bimDOTkthDOT.se> wrote in message
> >
> > <g784qq$o9...@fred.mathworks.com>...
> >
> > > "ggk " <ggkm...@comcast.net> wrote in message
> > > <g77uoa$a48$...@fred.mathworks.com>...
> > > > > > The only thing that will speed things up *significantly*
> > > > > > is to store the data in binary format. Not too long ago
> > > > > > I sped up the loading of 90 MB of ASCII files from ~45 s
> > > > > > to ~0.2 s by changing the storage format to binary. Not
> > > > > > only was the loading some 250x faster, I also saved some
> > > > > > 20% disk space by storing the data in binary format.
> >
> > > > > Rune
> >
> > > > > Fortunately the one thing I know exactly for my
> > > > > application is the type of data, formatting and total #
> > > > > rows per ascii file. My file size is 1.3GB containing one
> > > > > double-precision number per row and 100M rows. Because I
> > > > > don't have enough memory to handle the 1.3GB of data, I've
> > > > > split the ascii file up into 10 files of 10M rows each.
> >
> > > > Wow, loading binary files goes 250x faster!? How can I
> > > > convert my ascii file to binary? Any routines to do this
> > > > that can be called from Matlab?
> >
> > > I have some vague memories from the time when it was
> > > regarded wasteful to read formatted textfiles as "free
> > > > text". It was much faster to read using an exact format
> > > > string, e.g. %10.4f rather than %f.
> >
> > > Some years ago I was surprised by the results of some
> > > experiments I did with Matlab: "%10.4f" was not faster
> > > than "%f". Conclusion: Matlab doesn't take full advantage
> > > of the information given in the format.
> >
> > > > I guess that a carefully written piece of (e.g.) Fortran in
> > > > a MEX-file would solve your problem.
> >
> > > /per
> >
> > > I may suggest a simple fortran program that would read the
> > > ascii formatted data and create a new version of the data in
> > > binary format, i.e. one simple loop over all the ascii data
> > > (note: following code is not valid fortran but only an idea)
> >
> > > 10  read(7,'ascii',end=20) x
> > >     write(8,'binary') x
> > >     goto 10
> > > 20  end
> >
> > I would call this executable from Matlab and then just open
> > and read the new binary data with fread that can handle just
> > > about any binary format.
> 
> Well, that's the sort of spaghetti code that gives Fortran a bad
> name. I didn't know anyone wrote such code any more.
> Here's a modern version of the same thing:
> do
>   read(7,*,iostat=ierr) x
>   if (ierr .ne. 0) exit
>   write(8) x
> end do
> 
> But even better would be to read in several lines of ASCII data and
> write out blocks of binary.

Hey ... there are some spaghetti hounds around that can
still recall punch card machines, tape setups for plots and
getting your output from a wide slot in a wall!  

And I am a card-carrying member of the GOTO society of
America, since the version of Fortran I used did not have an
ENDDO.
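
For what it's worth, the same conversion can probably be done without
leaving Matlab. Here is a rough sketch, assuming one number per line as
ggk described. The function name ascii2bin, the file names and the block
size are made up for illustration and are not from anyone's actual code.
It reads the text in blocks (as NZTideMan suggested) and writes raw
8-byte doubles with fwrite, so there are none of the record markers that
a Fortran sequential unformatted write(8) usually adds and that fread
would then have to skip.

```matlab
function nWritten = ascii2bin(asciiFile, binaryFile, blockRows)
% ASCII2BIN  Convert a one-number-per-line text file to raw binary doubles.
%   Hypothetical helper for illustration; returns the number of values written.
if nargin < 3
    blockRows = 1e6;   % rows converted per pass; tune to available memory
end
fin = fopen(asciiFile, 'r');
if fin < 0
    error('ascii2bin:open', 'Cannot open %s', asciiFile);
end
fout = fopen(binaryFile, 'w');
if fout < 0
    fclose(fin);
    error('ascii2bin:open', 'Cannot open %s', binaryFile);
end
nWritten = 0;
while ~feof(fin)
    c = textscan(fin, '%f', blockRows);   % parse up to blockRows numbers
    if isempty(c{1})
        break;                            % nothing parsed: stop instead of looping forever
    end
    nWritten = nWritten + fwrite(fout, c{1}, 'double');   % raw 8-byte doubles
end
fclose(fin);
fclose(fout);
end
```

Usage would be something like ascii2bin('part01.txt', 'part01.bin') once
per file (file names assumed), and afterwards

  fid = fopen('part01.bin', 'r');
  x = fread(fid, Inf, 'double');
  fclose(fid);

The conversion itself still pays the slow ASCII parse once; the gain is
on every later load.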