<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/264277</link>
    <title>MATLAB Central Newsreader - read large text files</title>
    <description>Feed for thread: read large text files</description>
    <language>en-us</language>
    <copyright>&#169;1994-2012 by MathWorks, Inc.</copyright>
    <webmaster>webmaster@mathworks.com</webmaster>
    <generator>MATLAB Central Newsreader</generator>
    <docs>http://blogs.law.harvard.edu/tech/rss</docs>
    <ttl>60</ttl>
    <image>
      <title>MathWorks</title>
      <url>http://www.mathworks.com/images/membrane_icon.gif</url>
    </image>
    <item>
      <pubDate>Wed, 28 Oct 2009 01:14:04 -0400</pubDate>
      <title>read large text files</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/264277#690170</link>
      <author>Anandhi </author>
      <description>Hi,&lt;br&gt;
&lt;br&gt;
I have text files with 6 columns of data, but the number of rows is greater than 100000, and I do not know the exact row count.&lt;br&gt;
&lt;br&gt;
With the program below I can read up to 100000 rows. How do I read the remaining rows through the end of the file?&lt;br&gt;
&lt;br&gt;
block_size = 100000;&lt;br&gt;
format = '%f %f %f %f %f %f';&lt;br&gt;
file_id = fopen(fno{i});&lt;br&gt;
cnt=0;&lt;br&gt;
segarray = textscan(file_id, format, block_size); &lt;br&gt;
&lt;br&gt;
thanks in advance for the support&lt;br&gt;
&lt;br&gt;
anandhi</description>
    </item>
    <item>
      <pubDate>Wed, 28 Oct 2009 01:54:03 -0400</pubDate>
      <title>Re: read large text files</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/264277#690178</link>
      <author>dpb</author>
      <description>Anandhi wrote:&lt;br&gt;
...&lt;br&gt;
&amp;gt; When I use this prog I am able to get upto 100000 rows. How to get&lt;br&gt;
&amp;gt; the rows beyond this till the end of file?&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; block_size = 100000;&lt;br&gt;
&amp;gt; format = '%f %f %f %f %f %f';&lt;br&gt;
&amp;gt; file_id = fopen(fno{i});&lt;br&gt;
&amp;gt; cnt=0;&lt;br&gt;
&amp;gt; segarray = textscan(file_id, format, block_size); &lt;br&gt;
...&lt;br&gt;
&lt;br&gt;
Don't specify N and textscan() should read to EOF&lt;br&gt;
&lt;br&gt;
Alternatively, see&lt;br&gt;
&lt;br&gt;
doc textscan&lt;br&gt;
&lt;br&gt;
and note one can call textscan repeatedly on the same fid and continue &lt;br&gt;
from where left off.&lt;br&gt;
&lt;br&gt;
Doc doesn't indicate it, but N=-1 in textread() is a flag for &quot;read to &lt;br&gt;
end of file&quot;; one would presume that would have been implemented in &lt;br&gt;
textscan() as well.  Also, I'd presume inf would have the same effect. &lt;br&gt;
I can't test these hypotheses as my version predates textscan().&lt;br&gt;
&lt;br&gt;
--</description>
    </item>
    <item>
      <pubDate>Wed, 28 Oct 2009 03:39:18 -0400</pubDate>
      <title>Re: read large text files</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/264277#690194</link>
      <author>anandhi</author>
      <description>On Oct 27, 9:54&#160;pm, dpb &amp;lt;n...@non.net&amp;gt; wrote:&lt;br&gt;
&amp;gt; and note one can call textscan repeatedly on the same fid and continue&lt;br&gt;
&amp;gt; from where left off.&lt;br&gt;
&lt;br&gt;
Thanks for the response, however&lt;br&gt;
&lt;br&gt;
When I call textscan repeatedly on the same fid, it reads only up to&lt;br&gt;
100000 lines, after which it does not continue.&lt;br&gt;
&lt;br&gt;
E.g., the file has 1179919 lines:&lt;br&gt;
&lt;br&gt;
segarray = textscan(file_id, format);&lt;br&gt;
segarray1 = textscan(file_id, format);&lt;br&gt;
&lt;br&gt;
segarray1 still comes back empty.</description>
    </item>
    <item>
      <pubDate>Wed, 28 Oct 2009 04:04:56 -0400</pubDate>
      <title>Re: read large text files</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/264277#690206</link>
      <author>Praetorian</author>
      <description>On Oct 27, 9:39&#160;pm, anandhi &amp;lt;anandhi.san...@gmail.com&amp;gt; wrote:&lt;br&gt;
&amp;gt; when i call textscan repeatedly on the same fid and continue&lt;br&gt;
&amp;gt; &#160;it does continue upto 100000 lines only after which it does not&lt;br&gt;
&amp;gt; continue.&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; eg the file has 1179919 lines&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; segarray = textscan(file_id, format);&lt;br&gt;
&amp;gt; segarray1 = textscan(file_id, format);&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; I still get the size of segarray1 empty&lt;br&gt;
&lt;br&gt;
You could try using my CSVIMPORT submission from FEX (&lt;a href=&quot;http://tinyurl.com/yjctr57&quot;&gt;http://tinyurl.com/yjctr57&lt;/a&gt;).&lt;br&gt;
&lt;br&gt;
HTH,&lt;br&gt;
Ashish.</description>
    </item>
    <item>
      <pubDate>Wed, 28 Oct 2009 06:32:03 -0400</pubDate>
      <title>Re: read large text files</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/264277#690212</link>
      <author>Rune Allnor</author>
      <description>On 28 Oct, 02:14, &quot;Anandhi &quot; &amp;lt;anan...@mathworks.com&amp;gt; wrote:&lt;br&gt;
&amp;gt; Hi ,&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; I have text files having 6 columns of data, but the number of rows is greater than 100000. I do not know the exact row number.&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; When I use this prog I am able to get upto 100000 rows. How to get the rows beyond this till the end of file?&lt;br&gt;
&lt;br&gt;
This is a trivial programming exercise on buffered I/O:&lt;br&gt;
&lt;br&gt;
1) Decide on a buffer size&lt;br&gt;
2) Clear block, set block pointer to start of buffer&lt;br&gt;
3) Read till buffer is full or EOF is found&lt;br&gt;
4) Process data&lt;br&gt;
5) If EOF not yet found, repeat from 2)&lt;br&gt;
&lt;br&gt;
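The steps above might be sketched in MATLAB roughly as follows. This is only an illustration, not a tested recipe for the original files: the demo file, the variable names (fname, fmt, total_rows) and the tiny block size are assumptions made up for the example, and the processing step just counts rows.&lt;br&gt;
&lt;br&gt;
```matlab
% Sketch of a block-wise read with textscan; all names here are illustrative.
fname = [tempname '.txt'];                 % hypothetical demo file
fid = fopen(fname, 'w');
fprintf(fid, '%d %d %d %d %d %d\n', reshape(1:60, 6, 10));   % 10 rows, 6 columns
fclose(fid);

block_size = 4;                            % 1) buffer size in rows (tiny for the demo)
fmt = '%f %f %f %f %f %f';
total_rows = 0;
fid = fopen(fname, 'r');
while ~feof(fid)                           % 5) repeat until EOF
    % 2)-3) read one block; textscan continues from the current file position
    block = textscan(fid, fmt, block_size, 'CollectOutput', true);
    data = block{1};                       % up to block_size rows, 6 columns
    if isempty(data)
        break                              % nothing parsed: stop rather than loop forever
    end
    total_rows = total_rows + size(data, 1);   % 4) process the block (here: count rows)
end
fclose(fid);
delete(fname);                             % remove the demo file
```
&lt;br&gt;
With real data one would use a much larger block_size and replace the row count with the actual processing. The isempty check is there so that a line textscan cannot convert ends the loop instead of repeating forever.&lt;br&gt;
&lt;br&gt;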
Rune</description>
    </item>
    <item>
      <pubDate>Wed, 28 Oct 2009 13:45:32 -0400</pubDate>
      <title>Re: read large text files</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/264277#690300</link>
      <author>dpb</author>
      <description>anandhi wrote:&lt;br&gt;
...&lt;br&gt;
&amp;gt; when i call textscan repeatedly on the same fid and continue&lt;br&gt;
&amp;gt;  it does continue upto 100000 lines only after which it does not&lt;br&gt;
&amp;gt; continue.&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; eg the file has 1179919 lines&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; segarray = textscan(file_id, format);&lt;br&gt;
&amp;gt; segarray1 = textscan(file_id, format);&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; I still get the size of segarray1 empty&lt;br&gt;
&lt;br&gt;
I'd suspect there's a problem in the file at that point then.  From&lt;br&gt;
Remarks section in documentation--&lt;br&gt;
&lt;br&gt;
&quot;When textscan reads a specified file or string, it attempts to match &lt;br&gt;
the data to the format string. If textscan fails to convert a data &lt;br&gt;
field, it stops reading and returns all fields read before the failure.&quot;&lt;br&gt;
&lt;br&gt;
Perhaps during your experimenting you accidentally wrote an EOF or some &lt;br&gt;
other data to the file???&lt;br&gt;
&lt;br&gt;
I'd suggest using a text-listing/viewing tool to verify the file is, &lt;br&gt;
indeed, still pristine (my hunch is you'll find it isn't).&lt;br&gt;
&lt;br&gt;
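One way to check this from inside MATLAB rather than with an external viewer, as a rough sketch: it assumes that after a failed conversion the file position is left at the point where textscan stopped, so that fgetl returns the text it could not convert. The demo file and the names rows_read and bad_line are invented for the example.&lt;br&gt;
&lt;br&gt;
```matlab
% Sketch: find the line where textscan stopped. File contents are made up for the demo.
fname = [tempname '.txt'];                       % hypothetical demo file
fid = fopen(fname, 'w');
fprintf(fid, '%d %d %d %d %d %d\n', reshape(1:18, 6, 3));   % 3 good rows
fprintf(fid, 'bad 2 3 4 5 6\n');                 % a row that does not match %f
fprintf(fid, '%d %d %d %d %d %d\n', 25:30);      % 1 more good row, never reached
fclose(fid);

fid = fopen(fname, 'r');
c = textscan(fid, '%f %f %f %f %f %f');          % stops at the first field it cannot convert
rows_read = numel(c{1});                         % complete rows read before the failure
if ~feof(fid)
    % assumption: the file position is left where the conversion failed
    bad_line = fgetl(fid);                       % remainder of the line where reading stopped
    fprintf('Stopped after %d rows, next text: %s\n', rows_read, bad_line);
end
fclose(fid);
delete(fname);                                   % remove the demo file
```
&lt;br&gt;
If rows_read is well short of the expected line count and feof is still false, the printed text is a good place to start looking for the problem in the file.&lt;br&gt;
&lt;br&gt;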
--</description>
    </item>
  </channel>
</rss>

