From: NZTideMan <mulgor@gmail.com>
Newsgroups: comp.soft-sys.matlab
Subject: Re: how to save time loading data from file
Date: Tue, 5 Aug 2008 04:19:39 -0700 (PDT)
Message-ID: <b46cf8fc-616b-4b87-9795-4defd90262cb@a8g2000prf.googlegroups.com>



On Aug 5, 7:43 pm, "Paul " <p...@ceri.memphis.edu> wrote:
> NZTideMan <mul...@gmail.com> wrote in message
>
> <eab1c3ca-3e73-4e43-8325-b49adb4dc...@p31g2000prf.googlegroups.com>...
>
>
>
>
>
> > On Aug 5, 6:59 pm, "Paul " <p...@ceri.memphis.edu> wrote:
> > > "per isakson" <poi.nos...@bimDOTkthDOT.se> wrote in message
>
> > > <g784qq$o9...@fred.mathworks.com>...
>
> > > > "ggk " <ggkm...@comcast.net> wrote in message <g77uoa$a48
> > > > $...@fred.mathworks.com>...
> > > > > > > The only thing that will speed things *significantly*
> > > > > > > is to store the data on binary format. Not too long ago
> > > > > > > I sped up the loading of 90 MB of ASCII files from ~45 s
> > > > > > > to ~0.2 s by changing the storage format to binary. Not
> > > > > > > only were the loading some 250x faster, I also saved some
> > > > > > > 20% disk space by storing the data on binary format.
>
> > > > > > Rune
>
> > > > > > Fortunately the one thing I know exactly for my
> > > > > > application is the type of data, formatting and total #
> > > > > > rows per ascii file. My file size is 1.3GB containing one
> > > > > > double-precision number per row and 100M rows. Because I
> > > > > > don't have enough memory to handle the 1.3GB of data, I've
> > > > > > split the ascii file up into 10 files of 10M rows each.
>
> > > > > Wow, loading binary files goes 250x faster!? How can I
> > > > > convert my ascii file to binary? Any routines to do this
> > > > > that can be called from Matlab?
>
> > > > I have some vague memories from the time when it was
> > > > regarded wasteful to read formatted textfiles as "free
> > > > > text". It was much faster to read using an exact
> > > > > format-string, eg %10.4f rather than with %f.
>
> > > > Some years ago I was surprised by the results of some
> > > > experiments I did with Matlab: "%10.4f" was not faster
> > > > than "%f". Conclusion: Matlab doesn't take full advantage
> > > > of the information given in the format.
>
> > > > I guess that a carefully written piece of (e.g) fortran in
> > > > > a MEX-file would solve your problem.
>
> > > > /per
>
> > > I may suggest a simple fortran program that would read the
> > > ascii formatted data and create a new version of the data in
> > > > binary format, i.e. one simple loop over all the ascii data
> > > (note: following code is not valid fortran but only a idea)
>
> > > > 10  read(7,'ascii',end=20) x
> > > >     write(8,'binary')x
> > > >     goto 10
> > > > 20  end
>
> > > I would call this executable from Matlab and then just open
> > > and read the new binary data with fread that can handle just
> > > > about any binary format.
>
> > Well, that's the sort of spaghetti code that gives Fortran a
> > bad name.
> > I didn't know anyone wrote such code any more.
> > Here's a modern version of the same thing:
> > do
> >    read(7,*,iostat=ierr)x
> >    if(ierr .ne. 0)exit
> >    write(8)x
> > end do
>
> > But even better would be to read in several lines of ASCII data
> > and write out blocks of binary.
>
> Hey ... there are some spaghetti hounds around that can
> still recall punch card machines, tape setups for plots and
> getting your output from a wide slot in a wall!
>
> And I am a card carrying member of the GOTO society of
> America since the version of fortran I used did not have an
> ENDDO.

Yes, I was a spaghetti hound myself until I had to debug someone else's
spaghetti code one day, and it was like my personal road to Damascus.
I vowed never to write code like that again.
And I also remember Hollerith cards and JCL and getting print-plots on
fanfold paper 132 columns wide...
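
For anyone who wants to try the block idea quoted above (read several
lines of ASCII, write out blocks of binary), here is a minimal sketch,
not a tested production routine. The file names data.txt and data.bin
and the block size nblock are made up for illustration, it assumes one
number per line as described earlier in the thread, and access='stream'
assumes a compiler with Fortran 2003 support:

```fortran
! Sketch only: convert one-number-per-line ASCII to raw binary, block by block.
! File names and block size are hypothetical; adjust to suit.
program ascii2bin
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: nblock = 100000   ! values buffered per write (assumed)
  real(dp) :: buf(nblock)
  integer :: n, ierr

  open(7, file='data.txt', status='old', action='read')
  ! Stream access (Fortran 2003) writes raw bytes with no record markers
  open(8, file='data.bin', access='stream', form='unformatted', &
       status='replace', action='write')

  do
     n = 0
     ierr = 0
     ! Fill the buffer until it is full or the input runs out
     do while (n < nblock)
        read(7, *, iostat=ierr) buf(n+1)
        if (ierr /= 0) exit
        n = n + 1
     end do
     ! Write whatever was collected, including a short final block
     if (n > 0) write(8) buf(1:n)
     if (ierr /= 0) exit
  end do

  close(7)
  close(8)
end program ascii2bin
```

Because stream access writes the doubles back to back with no record
markers, something like fread(fid, inf, 'double') after an fopen on the
Matlab side should read the file directly, provided both ends use the
same byte order.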