<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/261014</link>
    <title>MATLAB Central Newsreader - Reading textfile</title>
    <description>Feed for thread: Reading textfile</description>
    <language>en-us</language>
    <copyright>&#169; 1994-2012 by MathWorks, Inc.</copyright>
    <webmaster>webmaster@mathworks.com</webmaster>
    <generator>MATLAB Central Newsreader</generator>
    <docs>http://blogs.law.harvard.edu/tech/rss</docs>
    <ttl>60</ttl>
    <image>
      <title>MathWorks</title>
      <url>http://www.mathworks.com/images/membrane_icon.gif</url>
    </image>
    <item>
      <pubDate>Thu, 17 Sep 2009 00:01:03 -0400</pubDate>
      <title>Reading textfile</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/261014#680700</link>
      <author>Bryan Heit</author>
      <description>I am having trouble reading in a text file.  What I want is to&lt;br&gt;
generate an array of strings, 1 column wide by as many rows long as&lt;br&gt;
there are lines in the dataset.  The dataset is an HTML page saved as&lt;br&gt;
text, containing bioinformatic information.  I'm working on a script&lt;br&gt;
that'll pull specific species data out of the dataset, but cannot make&lt;br&gt;
much progress.  I've tried several ways of reading the data&lt;br&gt;
(importdata, textscan, etc) to no avail.  At best the first 4-5 lines&lt;br&gt;
get read in, then the read process is terminated (there are thousands&lt;br&gt;
of lines).  The data itself looks as follows:&lt;br&gt;
&lt;br&gt;
--------------------------------------------------------------------------------&lt;br&gt;
NPSA gnl|sp|P0C9I2  (1107L_ASFK5) Protein MGF 110-7L OS=African swine&lt;br&gt;
fever virus (isolate Pig/Kenya/KEN-50/1950) GN=Ken-016 PE=3 SV=1&lt;br&gt;
&lt;br&gt;
*****&#8250; PATTERN 1&lt;br&gt;
&amp;nbsp;Site :    56-   64, Identity&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;tyvescrfcw_DCEDGVCTS_riwgnnstsi&lt;br&gt;
--------------------------------------------------------------------------------&lt;br&gt;
NPSA gnl|sp|P0C9I3  (1107L_ASFM2) Protein MGF 110-7L OS=African swine&lt;br&gt;
fever virus (isolate Tick/Malawi/Lil 20-1/1983) GN=Mal-013 PE=3 SV=1&lt;br&gt;
&lt;br&gt;
*****&#8250; PATTERN 1&lt;br&gt;
&amp;nbsp;Site :    56-   64, Identity&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;tyvescrfcw_DCEDGVCTS_rvwgnnstsi&lt;br&gt;
--------------------------------------------------------------------------------&lt;br&gt;
NPSA gnl|sp|P0C9I4  (1107L_ASFP4) Protein MGF 110-7L OS=African swine&lt;br&gt;
fever virus (isolate Tick/South Africa/Pretoriuskop Pr4/1996)&lt;br&gt;
GN=Pret-017 PE=3 SV=1&lt;br&gt;
&lt;br&gt;
*****&#8250; PATTERN 1&lt;br&gt;
&amp;nbsp;Site :    56-   64, Identity&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;tyvescrfcw_DCEDGICTS_rvwgnnstsi&lt;br&gt;
--------------------------------------------------------------------------------&lt;br&gt;
&lt;br&gt;
This goes on and on - I would like to read every line; even the&lt;br&gt;
'-----' ones and blank ones, into the data array.&lt;br&gt;
&lt;br&gt;
Any help would be greatly appreciated.&lt;br&gt;
&lt;br&gt;
Bryan</description>
    </item>
    <item>
      <pubDate>Thu, 17 Sep 2009 00:12:02 -0400</pubDate>
      <title>Re: Reading textfile</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/261014#680703</link>
      <author>dpb</author>
      <description>Bryan Heit wrote:&lt;br&gt;
&amp;gt; I am having trouble reading in a text file.  What I want is to&lt;br&gt;
&amp;gt; generate an array of strings, 1 column wide by as many rows long as&lt;br&gt;
&amp;gt; there are lines in the dataset.  ...&lt;br&gt;
&amp;gt; This goes on and on - I would like to read every line; even the&lt;br&gt;
&amp;gt; '-----' ones and blank ones, into the data array.&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; Any help would be greatly appreciated.&lt;br&gt;
&lt;br&gt;
doc fgetl&lt;br&gt;
&lt;br&gt;
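A minimal sketch of a line-by-line read with fgetl, assuming the file&lt;br&gt;
fits in memory.  The sample file it writes first and the names fname,&lt;br&gt;
tline and lines are hypothetical, only there so the snippet runs on&lt;br&gt;
its own.  fgetl returns each line as text (an empty string for a blank&lt;br&gt;
line) and -1 at end of file, so ischar serves as the loop test:&lt;br&gt;
&lt;br&gt;
```matlab
% Write a small hypothetical sample file so the sketch is self-contained.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '-----\nNPSA gnl|sp|P0C9I2\n\n Site :    56-   64, Identity\n');
fclose(fid);

% Read every line, including separator and blank lines, into a cell array.
fid = fopen(fname, 'r');
lines = {};
tline = fgetl(fid);
while ischar(tline)          % fgetl returns -1 (not char) at end of file
    lines{end+1, 1} = tline; % blank lines are stored as empty strings
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);
```
&lt;br&gt;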
--</description>
    </item>
    <item>
      <pubDate>Thu, 17 Sep 2009 07:45:25 -0400</pubDate>
      <title>Re: Reading textfile</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/261014#680749</link>
      <author>Rune Allnor</author>
      <description>On 17 Sep, 02:01, Bryan Heit &amp;lt;bryans.spam.t...@gmail.com&amp;gt; wrote:&lt;br&gt;
&amp;gt; I am having trouble reading in a text file.  What I want is to&lt;br&gt;
&amp;gt; generate an array of strings, 1 column wide by as many rows long as&lt;br&gt;
&amp;gt; there are lines in the dataset.  The dataset is an HTML page saved as&lt;br&gt;
&amp;gt; text, containing bioinformatic information.  I'm working on a script&lt;br&gt;
&amp;gt; that'll pull specific species data out of the dataset, but cannot make&lt;br&gt;
&amp;gt; much progress.  I've tried several ways of reading the data&lt;br&gt;
&amp;gt; (importdata, textscan, etc) to no avail.  At best the first 4-5 lines&lt;br&gt;
&amp;gt; get read in, then the read process is terminated (there are thousands&lt;br&gt;
&amp;gt; of lines).  The data itself looks as follows:&lt;br&gt;
...&lt;br&gt;
&amp;gt; Any help would be greatly appreciated.&lt;br&gt;
&lt;br&gt;
You will have to write your own parser from scratch.&lt;br&gt;
&lt;br&gt;
You should take some time to find out exactly what you&lt;br&gt;
want to use these data for, and how, and come up with a&lt;br&gt;
data structure that fits this use.&lt;br&gt;
&lt;br&gt;
Once that's done, scan the file to extract (possibly&lt;br&gt;
multi line) data items. Then scan the lines and extract&lt;br&gt;
whatever data you want. Store the data in structures&lt;br&gt;
or cell arrays.&lt;br&gt;
&lt;br&gt;
My point is that this is a somewhat involved task that&lt;br&gt;
might not be easily solved with canned routines. If you&lt;br&gt;
think the above sounds daunting, find/hire somebody that&lt;br&gt;
can help you - it is a standard programming task that any&lt;br&gt;
computer science student can help with. Expect to spend&lt;br&gt;
a bit of time explaining to a helper how to separate the&lt;br&gt;
data, though.&lt;br&gt;
&lt;br&gt;
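A rough sketch of the first step, splitting the file into (possibly&lt;br&gt;
multi line) items.  It assumes the lines are already in a cell array&lt;br&gt;
and that items are delimited by lines made up only of dashes, as in&lt;br&gt;
the sample.  The names lines, isSep, sepIdx and records are made up&lt;br&gt;
for illustration, and the separators are shortened:&lt;br&gt;
&lt;br&gt;
```matlab
% Hypothetical input: lines already read from the file, one string per cell.
lines = { ...
    '----------', ...
    'NPSA gnl|sp|P0C9I2  (1107L_ASFK5) Protein MGF 110-7L OS=African swine', ...
    'fever virus (isolate Pig/Kenya/KEN-50/1950) GN=Ken-016 PE=3 SV=1', ...
    '', ...
    ' Site :    56-   64, Identity', ...
    '----------', ...
    'NPSA gnl|sp|P0C9I3  (1107L_ASFM2) Protein MGF 110-7L OS=African swine', ...
    'fever virus (isolate Tick/Malawi/Lil 20-1/1983) GN=Mal-013 PE=3 SV=1', ...
    '----------'};

% A separator is a non-empty line consisting only of dashes.
isSep = ~cellfun('isempty', regexp(lines, '^-+$', 'once'));
sepIdx = find(isSep);

% Everything between two consecutive separators is one record.
records = {};
for k = 1:numel(sepIdx) - 1
    first = sepIdx(k) + 1;
    last = sepIdx(k + 1) - 1;
    records{end+1, 1} = lines(first:last);
end
```
&lt;br&gt;
Each cell of records then holds the lines of one item, ready for a&lt;br&gt;
second pass that pulls out whatever fields are needed.&lt;br&gt;
&lt;br&gt;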
Rune</description>
    </item>
    <item>
      <pubDate>Thu, 17 Sep 2009 09:39:02 -0400</pubDate>
      <title>Re: Reading textfile</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/261014#680773</link>
      <author>Lucio Cetto</author>
      <description>Bryan:&lt;br&gt;
textscan does it. If the file is too large, you should increase the buffer size. If every record in the file always has the same number of rows, you could reshape the output cell array, or play a little more with the format string, and you will get the data arranged into columns very easily...&lt;br&gt;
&lt;br&gt;
fid = fopen('mcen.txt','r');&lt;br&gt;
strs = textscan(fid,'%s','delimiter','\n')&lt;br&gt;
strs{1}&lt;br&gt;
fclose(fid)&lt;br&gt;
&lt;br&gt;
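A small sketch of the reshape idea, assuming every record really does&lt;br&gt;
span the same number of lines (the sample in the question does not quite&lt;br&gt;
satisfy this, since some headers wrap onto a third line). The names flat,&lt;br&gt;
nPerRecord and tbl and the toy lines are hypothetical:&lt;br&gt;
&lt;br&gt;
```matlab
% Hypothetical flat list of lines: two records of three lines each.
flat = {'NPSA A'; 'Site 1'; '-----'; 'NPSA B'; 'Site 2'; '-----'};
nPerRecord = 3;

% reshape fills column by column, so transpose to get one record per row.
tbl = reshape(flat, nPerRecord, []).';
```
&lt;br&gt;
After this, tbl(:,1) holds the first line of every record, tbl(:,2) the second, and so on.&lt;br&gt;
&lt;br&gt;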
HTH&lt;br&gt;
Lucio</description>
    </item>
    <item>
      <pubDate>Thu, 17 Sep 2009 15:29:09 -0400</pubDate>
      <title>Re: Reading textfile</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/261014#680864</link>
      <author>Bryan</author>
      <description>On Sep 17, 5:39 am, &quot;Lucio Cetto&quot; &amp;lt;lce...@nospam.mathworks.com&amp;gt; wrote:&lt;br&gt;
&amp;gt; Bryan:&lt;br&gt;
&amp;gt; textscan does it. If the file is too large, you should increase the buffer size. If every record in the file always has the same number of rows, you could reshape the output cell array, or play a little more with the format string, and you will get the data arranged into columns very easily...&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; fid = fopen('mcen.txt','r');&lt;br&gt;
&amp;gt; strs = textscan(fid,'%s','delimiter','\n')&lt;br&gt;
&amp;gt; strs{1}&lt;br&gt;
&amp;gt; fclose(fid)&lt;br&gt;
&amp;gt;&lt;br&gt;
&amp;gt; HTH&lt;br&gt;
&amp;gt; Lucio&lt;br&gt;
&lt;br&gt;
Thanks, everyone, for your input.  Lucio, your method worked perfectly -&lt;br&gt;
the array is huge, but it loads the whole file and I can parse it&lt;br&gt;
easily to extract the data I want.&lt;br&gt;
&lt;br&gt;
Once again, thank you everyone.&lt;br&gt;
&lt;br&gt;
Bryan</description>
    </item>
  </channel>
</rss>

