From: NZTideMan <mulgor@gmail.com>
Newsgroups: comp.soft-sys.matlab
Subject: Re: how to save time loading data from file
Date: Tue, 5 Aug 2008 00:24:08 -0700 (PDT)
Message-ID: <eab1c3ca-3e73-4e43-8325-b49adb4dcec0@p31g2000prf.googlegroups.com>
References: <g75abr$40d$1@fred.mathworks.com> <4b983eef-8f6c-4eb8-87fd-961d5f24413d@w7g2000hsa.googlegroups.com> 



On Aug 5, 6:59 pm, "Paul " <p...@ceri.memphis.edu> wrote:
> "per isakson" <poi.nos...@bimDOTkthDOT.se> wrote in message
> <g784qq$o9...@fred.mathworks.com>...
>
> > "ggk " <ggkm...@comcast.net> wrote in message
> > <g77uoa$a48$...@fred.mathworks.com>...
> > > > The only thing that will speed things up *significantly*
> > > > is to store the data in binary format. Not too long ago
> > > > I sped up the loading of 90 MB of ASCII files from ~45 s
> > > > to ~0.2 s by changing the storage format to binary. Not
> > > > only was the loading some 250x faster, I also saved some
> > > > 20% disk space by storing the data in binary format.
>
> > > > Rune
>
> > > Fortunately the one thing I know exactly for my
> > > application is the type of data, formatting and total #
> > > rows per ascii file. My file size is 1.3GB containing one
> > > double-precision number per row and 100M rows. Because I
> > > don't have enough memory to handle the 1.3GB of data, I've
> > > split the ascii file up into 10 files of 10M rows each.
>
> > > Wow, loading binary files goes 250x faster!? How can I
> > > convert my ascii file to binary? Any routines to do this
> > > that can be called from Matlab?
>
> > I have some vague memories from the time when it was
> > regarded as wasteful to read formatted text files as "free
> > text". It was much faster to read using an exact format
> > string, e.g. %10.4f rather than %f.
>
> > Some years ago I was surprised by the results of some
> > experiments I did with Matlab: "%10.4f" was not faster
> > than "%f". Conclusion: Matlab doesn't take full advantage
> > of the information given in the format.
>
> > I guess that a carefully written piece of (e.g.) Fortran in
> > a MEX-file would solve your problem.
>
> > /per
>
> I may suggest a simple Fortran program that would read the
> ASCII-formatted data and create a new version of the data in
> binary format, i.e. one simple loop over all the ASCII data
> (note: the following code is not valid Fortran, only an idea)
>
> 10  read(7,'ascii',end=20) x
>     write(8,'binary') x
>     goto 10
> 20  end
>
> I would call this executable from Matlab and then just open
> and read the new binary data with fread that can handle just
> about any binary format.

Well, that's the sort of spaghetti code that gives Fortran a bad name.
I didn't know anyone wrote such code any more.
Here's a modern version of the same thing:
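do
   read(7,*,iostat=ierr) x   ! list-directed read of one value
   if (ierr /= 0) exit       ! stop at end of file or on a read error
   write(8) x                ! unformatted (binary) write
end do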

But even better would be to read in several lines of ASCII data and
write out blocks of binary.
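
A minimal sketch of that block approach, assuming one double-precision number per line as described earlier in the thread. The file names, the unit numbers and the block size nblock are my own placeholders, not anything from the original poster. I have also assumed stream access (Fortran 2003), so that the output is raw bytes with no record markers; a default sequential unformatted file normally carries compiler-dependent record markers that fread would have to skip.

```fortran
program ascii2bin
   ! Sketch only: convert one-number-per-line ASCII to raw binary,
   ! buffering the values and writing them out a block at a time.
   implicit none
   integer, parameter :: dp = kind(1.0d0)
   integer, parameter :: nblock = 100000      ! assumed block size, tune to taste
   real(dp) :: buf(nblock)
   integer :: n, ierr

   ! Hypothetical file names; replace with your own.
   open(7, file='data.txt', status='old', action='read')
   open(8, file='data.bin', access='stream', form='unformatted', &
        status='replace', action='write')

   do
      ! Fill the buffer one value at a time until it is full or input ends.
      n = 0
      do while (n < nblock)
         read(7, *, iostat=ierr) buf(n+1)
         if (ierr /= 0) exit
         n = n + 1
      end do
      ! Write whatever was collected as a single binary block.
      if (n > 0) write(8) buf(1:n)
      if (ierr /= 0) exit        ! end of file or read error: stop
   end do

   close(7)
   close(8)
end program ascii2bin
```

If the file is written this way, something like fread(fid, Inf, 'double') in Matlab should read it back directly, provided it is read on a machine with the same byte order as the one that wrote it.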