<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/239066</link>
    <title>MATLAB Central Newsreader - how to read and write a portion of a file</title>
    <description>Feed for thread: how to read and write a portion of a file</description>
    <language>en-us</language>
    <copyright>&#169;1994-2012 by MathWorks, Inc.</copyright>
    <webmaster>webmaster@mathworks.com</webmaster>
    <generator>MATLAB Central Newsreader</generator>
    <docs>http://blogs.law.harvard.edu/tech/rss</docs>
    <ttl>60</ttl>
    <image>
      <title>MathWorks</title>
      <url>http://www.mathworks.com/images/membrane_icon.gif</url>
    </image>
    <item>
      <pubDate>Tue, 11 Nov 2008 18:35:02 -0500</pubDate>
      <title>how to read and write a portion of a file</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/239066#610346</link>
      <author>Zahra </author>
      <description>Hi all,&lt;br&gt;
&lt;br&gt;
I have a large ASCII file that contains data from different runs. Each run begins with the same number of header lines, but the number of data points may vary from run to run. I need to write the data for a specific range of runs to a new file in exactly the same format as the original ASCII file. Here is an example of the original file, containing 3 runs:&lt;br&gt;
-----------------------------------------&lt;br&gt;
Run 1&lt;br&gt;
header line 1&lt;br&gt;
header line 2&lt;br&gt;
&lt;br&gt;
Point zeta eta gamma delta&lt;br&gt;
Point sig1 sig2&lt;br&gt;
1 1 2 3 4&lt;br&gt;
1 5 6&lt;br&gt;
2 1 2 3 7&lt;br&gt;
2 5 7&lt;br&gt;
&lt;br&gt;
Run 2&lt;br&gt;
header line 1&lt;br&gt;
header line 2&lt;br&gt;
&lt;br&gt;
Point zeta eta gamma delta&lt;br&gt;
Point sig1 sig2&lt;br&gt;
1 1 2 3 8&lt;br&gt;
1 5 8&lt;br&gt;
2 1 2 3 9&lt;br&gt;
2 5 9&lt;br&gt;
&lt;br&gt;
Run 3&lt;br&gt;
header line 1&lt;br&gt;
header line 2&lt;br&gt;
&lt;br&gt;
Point zeta eta gamma delta&lt;br&gt;
Point sig1 sig2&lt;br&gt;
1 1 2 3 8&lt;br&gt;
1 5 8&lt;br&gt;
2 1 2 3 9&lt;br&gt;
2 5 9&lt;br&gt;
----------------------------------------------------&lt;br&gt;
For example, how do I write only the information for run 2 (shown below) to a separate ASCII file with exactly the same formatting?&lt;br&gt;
&lt;br&gt;
Run 2&lt;br&gt;
header line 1&lt;br&gt;
header line 2&lt;br&gt;
&lt;br&gt;
Point zeta eta gamma delta&lt;br&gt;
Point sig1 sig2&lt;br&gt;
1 1 2 3 8&lt;br&gt;
1 5 8&lt;br&gt;
2 1 2 3 9&lt;br&gt;
2 5 9&lt;br&gt;
&lt;br&gt;
Can anyone help, please?&lt;br&gt;
&lt;br&gt;
Thanks,&lt;br&gt;
Zahra</description>
    </item>
    <item>
      <pubDate>Tue, 11 Nov 2008 21:50:18 -0500</pubDate>
      <title>Re: how to read and write a portion of a file</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/239066#610379</link>
      <author>Negar </author>
      <description>Hi,&lt;br&gt;
&lt;br&gt;
I am not sure whether this answers your question, but as a suggestion, try the textscan function in MATLAB and read its Help documentation. The 'HeaderLines' parameter, listed under User Configurable Options on the textscan help page, may be helpful to you.&lt;br&gt;
&lt;br&gt;
Regards,&lt;br&gt;
Negar</description>
    </item>
    <item>
      <pubDate>Tue, 11 Nov 2008 23:56:03 -0500</pubDate>
      <title>Re: how to read and write a portion of a file</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/239066#610393</link>
      <author>Zahra </author>
      <description>Hi Negar,&lt;br&gt;
Thanks. I have now learned about readline.m (from the File Exchange). The only remaining issue is that readline.m requires the specific line numbers to read from the file, so my problem has changed to how to identify those line numbers. I know, for example, which string to look for in a given line, but I do not know how to determine that line's number in the file. Any thoughts on how to determine a specific line number in an ASCII file?&lt;br&gt;
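&lt;br&gt;
One possible way to find such a line number, as a minimal sketch only: read the file line by line with fgetl and count the lines until the search string shows up. The file name runs_demo.dat, its contents, and the variable names are made up for illustration.&lt;br&gt;
------------------------------------&lt;br&gt;
% write a small sample file (hypothetical name and contents)&lt;br&gt;
demoFile = fullfile(tempdir, 'runs_demo.dat');&lt;br&gt;
fid = fopen(demoFile, 'w');&lt;br&gt;
fprintf(fid, 'Run 1\nheader line 1\n1 5 6\n\nRun 2\nheader line 1\n1 5 8\n');&lt;br&gt;
fclose(fid);&lt;br&gt;
% scan the file line by line and record every line number that contains the string&lt;br&gt;
searchString = 'Run 2';&lt;br&gt;
lineNumbers = [];&lt;br&gt;
n = 0;&lt;br&gt;
fid = fopen(demoFile, 'r');&lt;br&gt;
tline = fgetl(fid);&lt;br&gt;
while ischar(tline) % fgetl returns -1 (not a string) at the end of the file&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;n = n + 1;&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if ~isempty(strfind(tline, searchString))&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;lineNumbers(end+1) = n;&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;end&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;tline = fgetl(fid);&lt;br&gt;
end&lt;br&gt;
fclose(fid);&lt;br&gt;
------------------------------------&lt;br&gt;
For the sample file, lineNumbers comes out as 5, the line that holds 'Run 2'. Blank lines are counted too, so the numbers match the lines of the file itself.&lt;br&gt;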
&lt;br&gt;
Thanks again,&lt;br&gt;
Zahra</description>
    </item>
    <item>
      <pubDate>Sun, 16 Nov 2008 03:19:01 -0500</pubDate>
      <title>Re: how to read and write a portion of a file</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/239066#611112</link>
      <author>Zahra </author>
      <description>Hi all,&lt;br&gt;
&lt;br&gt;
I have now been able to write the following code to read the file and find the runs that I am looking for. The only problem is that the data file is really large (tens of thousands of lines), and because of the for loop in the code it takes too long. Can anyone suggest another method to speed things up?&lt;br&gt;
------------------------------------&lt;br&gt;
fid=fopen(filename,'r');&lt;br&gt;
AllString=textscan(fid,'%s','delimiter','\n');&lt;br&gt;
fclose(fid);&lt;br&gt;
CharString=cellstr(AllString{1}); % one string per line of the file&lt;br&gt;
foundrun=0;&lt;br&gt;
data=0;&lt;br&gt;
datastr={};&lt;br&gt;
% runi and runf are the first and last run numbers of interest&lt;br&gt;
for run=runi:runf&lt;br&gt;
string1='Run';&lt;br&gt;
RunString=sprintf('%6d',run);&lt;br&gt;
SearchString=strcat(string1,RunString);&lt;br&gt;
for i=1:length(CharString)&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if ~isempty(strfind(CharString{i},SearchString))&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;foundrun=1;&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;end&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if (data==2) &amp;&amp; isempty(CharString{i})&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;% blank line after the data block: this run is finished&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;data=0;&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;foundrun=0;&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;elseif (data==2) &amp;&amp; ~isempty(CharString{i})&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;datastr=[datastr;CharString(i)];&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;elseif foundrun &amp;&amp; strncmp(CharString{i},'Point',5)&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;% assumed missing step: the code as posted never updates data, so count the two 'Point' header lines here&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;data=data+1;&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;end&lt;br&gt;
end&lt;br&gt;
end&lt;br&gt;
------------------------------------------&lt;br&gt;
Please see my original message at the top of this thread for the format of the data file.&lt;br&gt;
&lt;br&gt;
Any advice is appreciated.&lt;br&gt;
Thanks,&lt;br&gt;
Zahra</description>
    </item>
    <item>
      <pubDate>Sun, 16 Nov 2008 18:52:02 -0500</pubDate>
      <title>Re: how to read and write a portion of a file</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/239066#611176</link>
      <author>Andres </author>
      <description>Hi Zahra,&lt;br&gt;
simply speaking, you want to copy a part of the file to a new file, right? &lt;br&gt;
Have you thought of&lt;br&gt;
&lt;br&gt;
1. reading the whole file with fread into a character array (some 10000 lines do not sound _too_ large to me)&lt;br&gt;
2. finding the indices of [char(10), 'Run 1000' char(13)] and [char(10), 'Run 2000' char(13)] (adjust for your exact format, line break characters, and boundary numbers)&lt;br&gt;
3. copying the characters of interest with the help of those indices and writing them to the new file&lt;br&gt;
&lt;br&gt;
Doing so would avoid any for loop and text conversion.&lt;br&gt;
&lt;br&gt;
You could gain speed and make this suitable for really large files by analyzing only smaller portions of the file, guessing where your lines of interest are and iterating towards them (which would use a while loop most probably and fseek to navigate through the file).&lt;br&gt;
Maybe this is a viable solution.&lt;br&gt;
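&lt;br&gt;
As a rough illustration of working on smaller portions of the file, here is a minimal sketch. Note that it is a plain sequential chunk scan rather than the guess-and-iterate search described above, and that the file name runs_demo.dat, its contents, the chunk size, and all variable names are assumptions made up for the example.&lt;br&gt;
------------------------------------&lt;br&gt;
% write a small sample file (hypothetical name and contents)&lt;br&gt;
demoFile = fullfile(tempdir, 'runs_demo.dat');&lt;br&gt;
fid = fopen(demoFile, 'w');&lt;br&gt;
fprintf(fid, 'Run 1\n1 5 6\n\nRun 2\n1 5 8\n\nRun 3\n1 5 9\n');&lt;br&gt;
fclose(fid);&lt;br&gt;
marker = 'Run 2';&lt;br&gt;
chunkSize = 8; % tiny on purpose; something like 1e6 suits real files better&lt;br&gt;
overlap = numel(marker) - 1; % so a marker split across two chunks is still found&lt;br&gt;
offset = -1; % byte offset of the marker in the file, -1 while not found&lt;br&gt;
pos = 0; % file position where the current chunk starts&lt;br&gt;
fid = fopen(demoFile, 'r');&lt;br&gt;
while offset &amp;lt; 0&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;fseek(fid, pos, 'bof');&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;chunk = fread(fid, chunkSize + overlap, '*char').';&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if isempty(chunk)&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;break % end of file reached without finding the marker&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;end&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;k = strfind(chunk, marker);&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if ~isempty(k)&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;offset = pos + k(1) - 1;&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;end&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;pos = pos + chunkSize;&lt;br&gt;
end&lt;br&gt;
% jump straight to the marker and read from there&lt;br&gt;
if offset &amp;gt;= 0&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;fseek(fid, offset, 'bof');&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;firstLine = fgetl(fid);&lt;br&gt;
end&lt;br&gt;
fclose(fid);&lt;br&gt;
------------------------------------&lt;br&gt;
Only one chunk is held in memory at a time, and once the byte offset is known, fseek goes directly to the run of interest. The marker here is matched as a plain substring, so for real data it would need the exact spacing and line break characters of your file.&lt;br&gt;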
Hth&lt;br&gt;
Andres</description>
    </item>
    <item>
      <pubDate>Mon, 17 Nov 2008 23:18:02 -0500</pubDate>
      <title>Re: how to read and write a portion of a file</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/239066#611386</link>
      <author>Zahra </author>
      <description>Hi Andres,&lt;br&gt;
&lt;br&gt;
Thanks for your suggestion. I have now written the following code (with the help of readline.m from the File Exchange), and for the same data file it takes almost 1/10 of the time of my original code with a for loop. The first part of the code determines the line numbers for the runs of interest; the last part uses readline to read that portion of the data file, and dlmwrite then writes what was read in exactly the same format. It would be great if it is still possible to make it faster. Any suggestion is greatly appreciated. Thanks. Zahra&lt;br&gt;
&lt;br&gt;
filename= input('Enter the name of data file:','s'); % 's' returns the typed text as a string&lt;br&gt;
scan1= input('Enter the initial run number:');&lt;br&gt;
scan2= input('Enter the final run number:');&lt;br&gt;
scan2=scan2+1; % the run after the last one of interest marks where to stop&lt;br&gt;
string1='Run';&lt;br&gt;
RunS1=sprintf('%6d',scan1);&lt;br&gt;
searchS1=strcat(string1,RunS1);&lt;br&gt;
RunS2=sprintf('%6d',scan2);&lt;br&gt;
searchS2=strcat(string1,RunS2);&lt;br&gt;
fid=fopen(filename,'r');&lt;br&gt;
totalstring=textscan(fid,'%s','delimiter','\n');&lt;br&gt;
fclose(fid);&lt;br&gt;
stringchar=cellstr(totalstring{1});&lt;br&gt;
totalnLines=length(stringchar);&lt;br&gt;
Lscan2=totalnLines+1; % default if the final run is the last one in the file&lt;br&gt;
for i=1:totalnLines&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;if ~isempty(strfind(stringchar{i},searchS1))&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Lscan1=i;&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;end&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;if ~isempty(strfind(stringchar{i},searchS2))&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Lscan2=i;&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;end&lt;br&gt;
end&lt;br&gt;
Lines=Lscan1:Lscan2-1;&lt;br&gt;
All=readline(filename,Lines,1);&lt;br&gt;
dlmwrite('temp.dat', All, '')</description>
    </item>
    <item>
      <pubDate>Tue, 18 Nov 2008 09:27:01 -0500</pubDate>
      <title>Re: how to read and write a portion of a file</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/239066#611446</link>
      <author>Andres </author>
      <description>&quot;Zahra&quot; &amp;lt;zahra.yamani@nrc.gc.ca&amp;gt; wrote in message &amp;lt;gfsu3a$slj$1@fred.mathworks.com&amp;gt;...&lt;br&gt;
&amp;gt; Hi Andres,&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; Thanks for your suggestion. I now have written the following code (with the help of readline.m from file exchange) and for the same data file the time that it takes is almost 1/10 of my original code with a for loop. [..]&lt;br&gt;
&lt;br&gt;
&lt;br&gt;
Hi Zahra,&lt;br&gt;
based on your latest code, I've just quickly coded what I suggested above.&lt;br&gt;
Some remarks:&lt;br&gt;
- consider using uigetfile and uiputfile for your input and output files&lt;br&gt;
- I chose to always overwrite 'temp.dat' - if you want to append, use 'a' instead of 'w' in fopen&lt;br&gt;
- unfortunately, multiple space characters are not displayed correctly in the MATLAB Central newsreader, but I assume your &quot;RunS1=sprintf('%6d',scan1);&quot; assignment is correct&lt;br&gt;
&lt;br&gt;
Please check for yourself if the execution time is improved again (what are those times btw?) As I noted, this can be further optimized for very large files with only a few lines to be extracted, but I hope this will do.&lt;br&gt;
Regards&lt;br&gt;
Andres&lt;br&gt;
&lt;br&gt;
&lt;br&gt;
% get user input&lt;br&gt;
filename= input('Enter the name of data file:','s'); % 's' returns the typed text as a string, so no quotes are needed&lt;br&gt;
scan1   = input('Enter the initial run number:'); &lt;br&gt;
scanEnd = input('Enter the final run number:');&lt;br&gt;
% compose strings to search for&lt;br&gt;
scan2=scanEnd+1;&lt;br&gt;
string1='Run';&lt;br&gt;
RunS1=sprintf('%6d',scan1);&lt;br&gt;
searchS1=strcat(string1,RunS1);&lt;br&gt;
RunS2=sprintf('%6d',scan2);&lt;br&gt;
searchS2=strcat(string1,RunS2);&lt;br&gt;
% open file for reading&lt;br&gt;
fid=fopen(filename,'r');&lt;br&gt;
totalstring = fread(fid, '*char').';&lt;br&gt;
fclose(fid);&lt;br&gt;
% determine string positions&lt;br&gt;
startIndex = strfind(totalstring, searchS1);&lt;br&gt;
stopIndex  = strfind(totalstring, searchS2)-1;&lt;br&gt;
if isempty(stopIndex) %searched number may exceed final run number of the file&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;stopIndex = numel(totalstring);&lt;br&gt;
end&lt;br&gt;
% overwrite &quot;temp.dat&quot; with the desired part of the file&lt;br&gt;
fid = fopen('temp.dat', 'w');&lt;br&gt;
fwrite(fid, totalstring(startIndex:stopIndex));&lt;br&gt;
fclose(fid);</description>
    </item>
    <item>
      <pubDate>Tue, 18 Nov 2008 13:40:17 -0500</pubDate>
      <title>Re: how to read and write a portion of a file</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/239066#611485</link>
      <author>Zahra </author>
      <description>Hi Andres,&lt;br&gt;
&lt;br&gt;
Thanks very much for your reply. Wow, your code is super fast, for the same data file here are the times:&lt;br&gt;
1. for my original code with the for loop: 48 sec&lt;br&gt;
2. for my second code based on string search and readline.m: 7 sec&lt;br&gt;
3. for your code: 0.1 sec&lt;br&gt;
&lt;br&gt;
Your code will speed up my data analysis by a lot. Thanks again for all your help.&lt;br&gt;
Best regards,&lt;br&gt;
Zahra</description>
    </item>
  </channel>
</rss>

