<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/238956</link>
    <title>MATLAB Central Newsreader - read ascii file</title>
    <description>Feed for thread: read ascii file</description>
    <language>en-us</language>
    <copyright>&#169;1994-2012 by MathWorks, Inc.</copyright>
    <webmaster>webmaster@mathworks.com</webmaster>
    <generator>MATLAB Central Newsreader</generator>
    <docs>http://blogs.law.harvard.edu/tech/rss</docs>
    <ttl>60</ttl>
    <image>
      <title>MathWorks</title>
      <url>http://www.mathworks.com/images/membrane_icon.gif</url>
    </image>
    <item>
      <pubDate>Mon, 10 Nov 2008 09:53:02 -0500</pubDate>
      <title>read ascii file</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/238956#609960</link>
      <author>DS </author>
      <description>Hello all.  I need to read in an ascii file, with mixed char and numeric data, and I'm reading fairly big files so I would like it to be fast.  The files look something like this:&lt;br&gt;
&lt;br&gt;
{{0,0,0, ... lots of numbers ... ,0,0,0},{0,0,0, ... lots of numbers ... ,0,0,0}, etc }&lt;br&gt;
&lt;br&gt;
It's a square matrix all on a single line, comma delimited, with each row and the entire matrix enclosed in curly braces.&lt;br&gt;
&lt;br&gt;
I can manage a for-loop &quot;hack all the braces out&quot; approach, but is there a better way for something with this simple format?&lt;br&gt;
&lt;br&gt;
-DS</description>
    </item>
    <item>
      <pubDate>Mon, 10 Nov 2008 12:08:02 -0500</pubDate>
      <title>read ascii file</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/238956#609986</link>
      <author>Andres </author>
      <description>&quot;DS&quot; &amp;lt;null@null.com&amp;gt; wrote in message &amp;lt;gf909u$4n9$1@fred.mathworks.com&amp;gt;...&lt;br&gt;
&amp;gt; [..] The files look something like this:&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; {{0,0,0, ... lots of numbers ... ,0,0,0},{0,0,0, ... lots of numbers ... ,0,0,0}, etc }&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; It's a square matrix all on a single line, comma delimited, with each row and the entire matrix enclosed in curly braces.&lt;br&gt;
[..]&lt;br&gt;
.&lt;br&gt;
.&lt;br&gt;
Hi,&lt;br&gt;
if the curly braces are your only char data, i.e. inside the braces there are just numbers, you could do the trick with txt2mat (file exchange):&lt;br&gt;
.&lt;br&gt;
A = txt2mat('file.txt',...&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;'ReplaceExpr',{{'},{',char([13 10])}},...&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;'ReplaceChar',{'{}, '});&lt;br&gt;
.&lt;br&gt;
I assumed that &lt;br&gt;
- you don't know the size of the matrix beforehand (knowing it would help to speed things up)&lt;br&gt;
- the rows are reliably separated by '},{'&lt;br&gt;
.&lt;br&gt;
I checked this on a sample file containing only the line&lt;br&gt;
{{1,2,3,4},{5,6,7,8},{9,10,11,12},{13,14,15,16}}&lt;br&gt;
Of course,&lt;br&gt;
.&lt;br&gt;
B = txt2mat('file.txt','ReplaceChar',{'{}, '});&lt;br&gt;
n = sqrt(numel(B));&lt;br&gt;
B = reshape(B,n,n).';&lt;br&gt;
.&lt;br&gt;
would work as well.&lt;br&gt;
Ok, this is kind of hacking the braces out, but it should be quite fast.&lt;br&gt;
Hth&lt;br&gt;
Andres</description>
    </item>
    <item>
      <pubDate>Mon, 10 Nov 2008 12:12:50 -0500</pubDate>
      <title>Re: read ascii file</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/238956#609989</link>
      <author>Rune Allnor</author>
      <description>On 10 Nov, 10:53, &quot;DS&quot; &amp;lt;n...@null.com&amp;gt; wrote:&lt;br&gt;
&amp;gt; Hello all.  I need to read in an ascii file, with mixed char and numeric data, and I'm reading fairly big files so I would like it to be fast.&lt;br&gt;
&lt;br&gt;
&quot;Text data&quot; and &quot;fast access&quot; are a contradiction in terms.&lt;br&gt;
Expect a delay of 2-5 s per 10 MByte of text data in the file.&lt;br&gt;
&lt;br&gt;
&lt;br&gt;
&amp;gt; The files look something like this:&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; {{0,0,0, ... lots of numbers ... ,0,0,0},{0,0,0, ... lots of numbers ... ,0,0,0}, etc }&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; It's a square matrix all on a single line, comma delimited, with each row and the entire matrix enclosed in curly braces.&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; I can manage a for-loop &quot;hack all the braces out&quot; approach, but is there a better way for something with this simple format?&lt;br&gt;
&lt;br&gt;
Regular expressions are the obvious first try.&lt;br&gt;
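&lt;br&gt;
A minimal sketch of what a regular-expression approach might look like, using only the built-in functions fileread, regexprep and sscanf. The file name 'file.txt' is a placeholder, and the sketch assumes the file holds nothing but a square matrix in the brace format quoted above:&lt;br&gt;

```matlab
% Sketch only: 'file.txt' is a placeholder file name.
txt = fileread('file.txt');             % read the whole file into one char vector
txt = regexprep(txt, '[\{\},]', ' ');   % replace every brace and comma with a space
v = sscanf(txt, '%f');                  % parse the remaining numbers into a column vector
n = sqrt(numel(v));                     % side length, assuming a square matrix
A = reshape(v, n, n).';                 % transpose: the file lists the matrix row by row
```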
&lt;br&gt;
Rune</description>
    </item>
    <item>
      <pubDate>Mon, 10 Nov 2008 12:57:02 -0500</pubDate>
      <title>Re: read ascii file</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/238956#609998</link>
      <author>Andres </author>
      <description>Rune Allnor &amp;lt;allnor@tele.ntnu.no&amp;gt; wrote in message &amp;lt;e2d1e726-82c5-4d32-865d-ab0700e0f092@r36g2000prf.googlegroups.com&amp;gt;...&lt;br&gt;
&lt;br&gt;
&amp;gt; Regular expressions are the obvious first try.&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; Rune&lt;br&gt;
.&lt;br&gt;
Hi Rune,&lt;br&gt;
imho regular expressions are quite slow, and this could be noticeable for large files. If I had the choice, I'd just replace the braces with spaces.&lt;br&gt;
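.&lt;br&gt;
Replacing the braces with spaces could be sketched without txt2mat and without regular expressions as follows. This is only an illustration under assumptions: 'file.txt' is a placeholder name, the file is assumed to fit in memory, and the matrix is assumed to be square:&lt;br&gt;

```matlab
% Sketch only: 'file.txt' is a placeholder file name.
txt = fileread('file.txt');                        % whole file as one char vector
txt(txt == '{' | txt == '}' | txt == ',') = ' ';   % blank out braces and commas in place
v = sscanf(txt, '%f');                             % parse all numbers into a column vector
n = sqrt(numel(v));                                % side length, assuming a square matrix
A = reshape(v, n, n).';                            % file is row-major, MATLAB is column-major
```

The logical-index assignment touches each character once and leaves the length of the text unchanged, so no pattern matching is involved.&lt;br&gt;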
Regards&lt;br&gt;
Andres</description>
    </item>
    <item>
      <pubDate>Mon, 10 Nov 2008 13:44:03 -0500</pubDate>
      <title>Re: read ascii file</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/238956#610008</link>
      <author>Rune Allnor</author>
      <description>On 10 Nov, 13:57, &quot;Andres&quot; &amp;lt;rant...@werb.deNoRs&amp;gt; wrote:&lt;br&gt;
&amp;gt; Rune Allnor &amp;lt;all...@tele.ntnu.no&amp;gt; wrote in message &amp;lt;e2d1e726-82c5-4d32-865d-ab0700e0f...@r36g2000prf.googlegroups.com&amp;gt;...&lt;br&gt;
&amp;gt; &amp;gt; Regular expressions are the obvious first try.&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; &amp;gt; Rune&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; .&lt;br&gt;
&amp;gt; Hi Rune,&lt;br&gt;
&amp;gt; imho regular expressions are quite slow, and this could be noticeable for large files. If I had the choice, I'd just replace the braces with spaces.&lt;br&gt;
&lt;br&gt;
The TXT2MAT function you suggested earlier uses&lt;br&gt;
a syntax which is deceptively similar to a regular&lt;br&gt;
expression. I can't find any documentation for the&lt;br&gt;
function, though, so I don't know how it is implemented.&lt;br&gt;
&lt;br&gt;
As for text files, it's very time-consuming to mess&lt;br&gt;
with them. The only *real* time saving is to use a&lt;br&gt;
binary format. This was discussed here not too long&lt;br&gt;
ago:&lt;br&gt;
&lt;br&gt;
&lt;a href=&quot;http://groups.google.no/group/comp.soft-sys.matlab/msg/d49639538f61a0dc?hl=no&quot;&gt;http://groups.google.no/group/comp.soft-sys.matlab/msg/d49639538f61a0dc?hl=no&lt;/a&gt;&lt;br&gt;
&lt;br&gt;
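A rough sketch of what storing the matrix in a binary format could look like with fopen, fwrite and fread. The file name 'matrix.bin', the example matrix and the choice to store the side length as a leading uint32 are all assumptions made for illustration:&lt;br&gt;

```matlab
% Sketch only: 'matrix.bin' is a placeholder file name and A a stand-in matrix.
A = magic(4);                          % example data
fid = fopen('matrix.bin', 'w');
fwrite(fid, size(A, 1), 'uint32');     % store the side length first
fwrite(fid, A, 'double');              % raw doubles, written column by column
fclose(fid);

fid = fopen('matrix.bin', 'r');
n = fread(fid, 1, 'uint32');           % side length written above
B = fread(fid, [n n], 'double');       % read the values back, no text parsing needed
fclose(fid);
isequal(A, B)                          % true if the round trip preserved the data
```
&lt;br&gt;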
Rune</description>
    </item>
    <item>
      <pubDate>Mon, 10 Nov 2008 13:53:01 -0500</pubDate>
      <title>Re: read ascii file</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/238956#610013</link>
      <author>Andres </author>
      <description>on the speed...&lt;br&gt;
.&lt;br&gt;
for a 1000x1000 matrix file counting from 1 to 1e6 ('{{1,2,3,...', ~6.7 MB),&lt;br&gt;
.&lt;br&gt;
tic&lt;br&gt;
B = txt2mat('ds_1000.txt',0,-1,'ReplaceChar',{'{}, '});&lt;br&gt;
n = sqrt(numel(B));&lt;br&gt;
B = reshape(B,n,n).';&lt;br&gt;
toc&lt;br&gt;
.&lt;br&gt;
takes about one second. (The &quot;0,-1&quot; arguments switch off the file layout detection; switching it off is necessary for lines longer than 64 kB.)</description>
    </item>
    <item>
      <pubDate>Mon, 10 Nov 2008 14:03:02 -0500</pubDate>
      <title>Re: read ascii file</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/238956#610016</link>
      <author>Andres </author>
      <description>Rune Allnor &amp;lt;allnor@tele.ntnu.no&amp;gt; wrote in message &amp;lt;008a086e-9fd0-45d6-b4ef-b3aef5c7755d@a17g2000prm.googlegroups.com&amp;gt;...&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; The TXT2MAT function you suggested earlier uses&lt;br&gt;
&amp;gt; a syntax which is deceptively similar to a regular&lt;br&gt;
&amp;gt; expression. I can't find any documentation for the&lt;br&gt;
&amp;gt; function, though, so I don't know how it is implemented.&lt;br&gt;
.&lt;br&gt;
there's quite a lengthy doc&lt;br&gt;
.&lt;br&gt;
&amp;gt; As for text files, it's very time-consuming to mess&lt;br&gt;
&amp;gt; with them. The only *real* time saving is to use a&lt;br&gt;
&amp;gt; binary format. [..]&lt;br&gt;
.&lt;br&gt;
I fully agree. But often enough, you don't have any choice of the format of the data that is given to you.&lt;br&gt;
.&lt;br&gt;
(sorry for the &quot;.&quot;-lines - empty lines are not displayed here)</description>
    </item>
    <item>
      <pubDate>Mon, 10 Nov 2008 14:19:13 -0500</pubDate>
      <title>Re: read ascii file</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/238956#610024</link>
      <author>Rune Allnor</author>
      <description>On 10 Nov, 15:03, &quot;Andres&quot; &amp;lt;rant...@werb.deNoRs&amp;gt; wrote:&lt;br&gt;
&amp;gt; Rune Allnor &amp;lt;all...@tele.ntnu.no&amp;gt; wrote in message &amp;lt;008a086e-9fd0-45d6-b4ef-b3aef5c77...@a17g2000prm.googlegroups.com&amp;gt;...&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; &amp;gt; The TXT2MAT function you suggested earlier uses&lt;br&gt;
&amp;gt; &amp;gt; a syntax which is deceptively similar to a regular&lt;br&gt;
&amp;gt; &amp;gt; expression. I can't find any documentation for the&lt;br&gt;
&amp;gt; &amp;gt; function, though, so I don't know how it is implemented.&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; .&lt;br&gt;
&amp;gt; there's quite a lengthy doc&lt;br&gt;
&lt;br&gt;
Where? I can't find it in my R2006a release, and&lt;br&gt;
I can't find it among the mathworks list of functions.&lt;br&gt;
&lt;br&gt;
&amp;gt; I fully agree. But often enough, you don't have any choice of the format of the data that is given to you.&lt;br&gt;
&lt;br&gt;
Fair enough. My point is: Don't complain about speed&lt;br&gt;
when you deal with text files. If speed really is a&lt;br&gt;
concern, use a binary format. If text files are what you&lt;br&gt;
have, don't discard regular expressions on account&lt;br&gt;
of speed.&lt;br&gt;
&lt;br&gt;
Rune</description>
    </item>
    <item>
      <pubDate>Mon, 10 Nov 2008 15:08:03 -0500</pubDate>
      <title>Re: read ascii file</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/238956#610040</link>
      <author>Andres </author>
      <description>Rune Allnor &amp;lt;allnor@tele.ntnu.no&amp;gt; wrote in message &amp;lt;4077b49b-c131-4bdf-8401-2ba9e5698b39@a17g2000prm.googlegroups.com&amp;gt;...&lt;br&gt;
[..]&lt;br&gt;
&amp;gt; Where? I can't find it in my R2006a release, and&lt;br&gt;
&amp;gt; I can't find it among the mathworks list of functions.&lt;br&gt;
&amp;gt; &lt;br&gt;
As I noted, it can be found on the file exchange. I'm the author.&lt;br&gt;
.&lt;br&gt;
&amp;gt; [..] If text files are what you&lt;br&gt;
&amp;gt; have, don't discard regular expressions on account&lt;br&gt;
&amp;gt; of speed.&lt;br&gt;
.&lt;br&gt;
I don't want to discard them in general; I just thought they were not necessary here. In my experience, regular expressions slow the replacement step down by a factor of ~5, which might matter to the OP, who &quot;would like it to be fast&quot;, e.g. if he has many files to import, regardless of how much faster a binary import would be.&lt;br&gt;
Regards&lt;br&gt;
Andres</description>
    </item>
    <item>
      <pubDate>Mon, 10 Nov 2008 16:23:02 -0500</pubDate>
      <title>read ascii file</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/238956#610068</link>
      <author>DS </author>
      <description>Rune and Andres -- Thank you both for the helpful input.&lt;br&gt;
&lt;br&gt;
I got Andres' file exchange code TXT2MAT working as per the following:&lt;br&gt;
&lt;br&gt;
B = txt2mat(file,'ReadMode','block','NumColumns',1248,'ReplaceChar',{'{}, '});&lt;br&gt;
n = sqrt(numel(B));&lt;br&gt;
B = reshape(B,n,n).';&lt;br&gt;
&lt;br&gt;
I'd rather not have to throw in the magic number there (1248), but apparently the long single line gets read in as ~25 lines and the data gets all twisted when I let TXT2MAT try to figure it out.  At any rate, it's faster and cleaner than my hack and slash approach:&lt;br&gt;
&lt;br&gt;
%-----------------------------------&lt;br&gt;
%read entire file as cell string&lt;br&gt;
a = textread('file.txt','%s','delimiter',',');&lt;br&gt;
%search for first '}' character (indicates the end of the first row)&lt;br&gt;
for count=1:length(a)&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if ~isempty(strfind(a{count},'}'))&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;break&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;end&lt;br&gt;
end&lt;br&gt;
ncols = count;&lt;br&gt;
nrows = length(a)/ncols;&lt;br&gt;
%clean braces '{' and '}' from data&lt;br&gt;
a = strrep(a,'{','');&lt;br&gt;
a = strrep(a,'}','');&lt;br&gt;
%convert cell array of strings to a numeric column vector&lt;br&gt;
a = str2double(a);&lt;br&gt;
%reshape matrix: the file is row-major, so fill column-wise and transpose&lt;br&gt;
a = reshape(a,ncols,nrows).';&lt;br&gt;
%-----------------------------------</description>
    </item>
    <item>
      <pubDate>Mon, 10 Nov 2008 20:05:06 -0500</pubDate>
      <title>read ascii file</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/238956#610139</link>
      <author>Andres </author>
      <description>If it works - it's fine.&lt;br&gt;
.&lt;br&gt;
But I'm a bit puzzled by the need for the magic number, too, which is not even square. Did you try my latter code which I tested on the one million numbers file?&lt;br&gt;
If you like, contact me via the file exchange author page; I'd be curious to look into the details.&lt;br&gt;
Regards&lt;br&gt;
Andres</description>
    </item>
    <item>
      <pubDate>Mon, 10 Nov 2008 21:22:02 -0500</pubDate>
      <title>read ascii file</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/238956#610161</link>
      <author>DS </author>
      <description>&quot;Andres&quot; &amp;lt;rantore@werb.deNoRs&amp;gt; wrote in message &amp;lt;gfa45i$kk7$1@fred.mathworks.com&amp;gt;...&lt;br&gt;
&amp;nbsp;Did you try my latter code which I tested on the one million numbers file?&lt;br&gt;
---&lt;br&gt;
I tried your latter code, and I have the same trouble.  I'm sure it would work fine if the data were well formatted, but the data is a continuous block of characters with no line-feeds to delimit the rows.  I think this is giving TXT2MAT the wrong idea about how the data should be parsed.&lt;br&gt;
.&lt;br&gt;
I'll try to send you a sample file to play with if you're curious.&lt;br&gt;
-DS</description>
    </item>
    <item>
      <pubDate>Wed, 12 Nov 2008 12:08:01 -0500</pubDate>
      <title>read ascii file</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/238956#610448</link>
      <author>Andres </author>
      <description>&quot;DS&quot; &amp;lt;null@null.com&amp;gt; wrote in message &amp;lt;gfa8lq$olu$1@fred.mathworks.com&amp;gt;...&lt;br&gt;
&amp;gt; &quot;Andres&quot; &amp;lt;rantore@werb.deNoRs&amp;gt; wrote in message &amp;lt;gfa45i$kk7$1@fred.mathworks.com&amp;gt;...&lt;br&gt;
&amp;gt; I'll try to send you a sample file to play with if you're curious.&lt;br&gt;
&amp;gt; -DS&lt;br&gt;
&lt;br&gt;
That would be nice, thanks. I hope you can decipher my e-mail address (leave out any 'r', end with .de). Btw. I wonder where the 'Contact Author' button in the file exchange has gone...</description>
    </item>
  </channel>
</rss>

