Path: news.mathworks.com!not-for-mail
From: <HIDDEN>
Newsgroups: comp.soft-sys.matlab
Subject: Re: how to save time loading data from file
Date: Mon, 4 Aug 2008 23:54:02 +0000 (UTC)
Organization: KTH
Lines: 39
Message-ID: <g784qq$o9e$1@fred.mathworks.com>
References: <g75abr$40d$1@fred.mathworks.com> <4b983eef-8f6c-4eb8-87fd-961d5f24413d@w7g2000hsa.googlegroups.com> <g77uoa$a48$1@fred.mathworks.com>
Reply-To: <HIDDEN>
NNTP-Posting-Host: webapp-03-blr.mathworks.com
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 8bit
X-Trace: fred.mathworks.com 1217894042 24878 172.30.248.38 (4 Aug 2008 23:54:02 GMT)
X-Complaints-To: news@mathworks.com
NNTP-Posting-Date: Mon, 4 Aug 2008 23:54:02 +0000 (UTC)
X-Newsreader: MATLAB Central Newsreader 1670
Xref: news.mathworks.com comp.soft-sys.matlab:483614



"ggk " <ggkmath@comcast.net> wrote in message <g77uoa$a48$1@fred.mathworks.com>...
> > The only thing that will speed things up *significantly*
> > is to store the data in binary format. Not too long ago
> > I sped up the loading of 90 MB of ASCII files from ~45 s
> > to ~0.2 s by changing the storage format to binary. Not
> > only was the loading some 250x faster, I also saved some
> > 20% disk space by storing the data in binary format.
> > 
> > Rune
> 
> Fortunately the one thing I know exactly for my 
> application is the type of data, formatting and total 
> number of rows per ASCII file. My file size is 1.3 GB, 
> containing one double-precision number per row and 100M 
> rows. Because I don't have enough memory to handle the 
> 1.3 GB of data, I've split the ASCII file up into 10 
> files of 10M rows each. 
> 
> Wow, loading binary files goes 250x faster!? How can I 
> convert my ASCII file to binary? Any routines to do this 
> that can be called from Matlab?

I have some vague memories from the time when it was 
regarded as wasteful to read formatted text files as "free 
text". It was much faster to read using an exact format 
string, e.g. %10.4f rather than %f.

Some years ago I was surprised by the results of some 
experiments I did with Matlab: "%10.4f" was not faster 
than "%f". Conclusion: Matlab doesn't take full advantage 
of the information given in the format.
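
For anyone who wants to repeat that kind of comparison, here is 
a minimal sketch. It assumes textscan is the reader being timed 
(I don't claim it is the only candidate), and the file name and 
row count are made up for illustration. I give no expected 
timings, since they depend on the Matlab release and the machine.

```matlab
% Hypothetical test file: one number per row, written in %10.4f layout
fname = 'format_test.txt';
n = 1e5;                                  % illustrative row count
fid = fopen(fname, 'w');
fprintf(fid, '%10.4f\n', rand(n, 1));
fclose(fid);

% Time the same read with a free format and with an exact format
formats = {'%f', '%10.4f'};
counts  = zeros(size(formats));
for k = 1:numel(formats)
    fid = fopen(fname, 'r');
    tic;
    c = textscan(fid, formats{k});        % read the whole file
    t = toc;
    fclose(fid);
    counts(k) = numel(c{1});
    fprintf('%-8s read %d values in %.3f s\n', formats{k}, counts(k), t);
end
```

Run it a few times before drawing conclusions; the first pass 
also pays for disk caching.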

I guess that a carefully written piece of Fortran (for 
example) in a MEX-file would solve your problem.
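
Short of a MEX-file, the one-time conversion to binary that 
Rune suggests can be done in plain Matlab with textscan and 
fwrite. The following is only a sketch under assumptions: the 
file names and chunk size are hypothetical, the input holds one 
number per row, and a small sample file is written first so the 
code runs on its own. Reading in chunks keeps memory use 
modest. At 8 bytes per double, 100M rows come to about 800 MB 
in binary form.

```matlab
% Hypothetical file names and chunk size; adjust to your own data
asciiFile  = 'data_part01.txt';
binaryFile = 'data_part01.bin';
chunkRows  = 4;                 % use something like 1e6 for real files

% Small sample ASCII file so that the sketch is self-contained
sample = (1:10)' * 0.5;
fid = fopen(asciiFile, 'w');
fprintf(fid, '%.15g\n', sample);
fclose(fid);

% Convert: read the text in chunks, append each chunk as raw doubles
fin  = fopen(asciiFile, 'r');
fout = fopen(binaryFile, 'w');
if fin < 0 || fout < 0
    error('Could not open input or output file.');
end
while ~feof(fin)
    c = textscan(fin, '%f', chunkRows);   % at most chunkRows numbers
    if isempty(c{1}), break; end          % nothing parsed: stop
    fwrite(fout, c{1}, 'double');         % 8 bytes each, native byte order
end
fclose(fin);
fclose(fout);

% Later runs: load the binary file with a single fread call
fid = fopen(binaryFile, 'r');
x = fread(fid, Inf, 'double');
fclose(fid);
```

If the whole file does not fit in memory, pass a row count to 
fread instead of Inf and process the data piece by piece.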

/per