Thread Subject:
how to read and write a portion of a file

Subject: how to read and write a portion of a file

From: Zahra

Date: 11 Nov, 2008 18:35:02

Message: 1 of 8

Hi all,

I have a large ASCII file that contains data from different runs. Each run begins with the same number of header lines, but the number of data points may vary from run to run. I need to write the data for a specific range of runs to a new file in exactly the same format as the original ASCII file. Here is an example of the original file, containing 3 runs:
-----------------------------------------
Run 1
header line 1
header line 2

Point zeta eta gamma delta
Point sig1 sig2
1 1 2 3 4
1 5 6
2 1 2 3 7
2 5 7

Run 2
header line 1
header line 2

Point zeta eta gamma delta
Point sig1 sig2
1 1 2 3 8
1 5 8
2 1 2 3 9
2 5 9

Run 3
header line 1
header line 2

Point zeta eta gamma delta
Point sig1 sig2
1 1 2 3 8
1 5 8
2 1 2 3 9
2 5 9
----------------------------------------------------
For example, how do I write only the information for run 2 (as shown below) to a separate ASCII file with exactly the same formatting?

Run 2
header line 1
header line 2

Point zeta eta gamma delta
Point sig1 sig2
1 1 2 3 8
1 5 8
2 1 2 3 9
2 5 9

Can anyone help, please?

Thanks,
Zahra

Subject: how to read and write a portion of a file

From: Negar

Date: 11 Nov, 2008 21:50:18

Message: 2 of 8

Hi,

I am not sure whether this answers your question, but as a suggestion, try the textscan function in MATLAB and read its Help documentation. The 'HeaderLines' parameter, listed under User Configurable Options on the textscan help page, may be helpful to you.

Regards,
Negar
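
For illustration, here is a minimal sketch of the 'HeaderLines' option mentioned above. It assumes a simplified, hypothetical file (written to a temporary location) with two header lines followed by three numeric columns. The file in this thread interleaves rows of different widths, so the format string would need adapting.

```matlab
% Write a small sample file: 2 header lines, then 3 numeric columns.
% (The file name and contents are made up for this sketch.)
fname = [tempname, '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'Run 1\nheader line 1\n');
fprintf(fid, '1 5 6\n2 5 7\n');
fclose(fid);

% Skip the 2 header lines and read the numeric columns.
fid = fopen(fname, 'r');
C = textscan(fid, '%f %f %f', 'HeaderLines', 2);
fclose(fid);
delete(fname);

% Combine the three column vectors into one numeric matrix.
data = [C{:}];
```

Note that textscan returns parsed numbers, not the original text, so on its own it does not preserve the exact formatting of the file.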


Subject: how to read and write a portion of a file

From: Zahra

Date: 11 Nov, 2008 23:56:03

Message: 3 of 8

Hi Negar,
Thanks. I have now learned about readline.m (from the File Exchange). The only remaining issue is that readline.m requires the specific line numbers to read from the file, so my problem has changed to how to identify those line numbers. I know, for example, what string to look for in a given line, but I do not know how to determine that line's number in the file. Any thoughts on how to determine a specific line number in an ASCII file?

Thanks again,
Zahra
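
One possible way to find the line number of a known string, sketched under the assumption that the file's lines are already held in a cell array of strings (here a small hard-coded example; the variable names are illustrative and not from the thread):

```matlab
% Lines of a file, one string per cell (hard-coded for this sketch).
fileLines = {'Run     1'; 'header line 1'; ''; 'Run     2'; 'header line 1'};

% The string to look for: 'Run' followed by the run number right-aligned
% in a 6-character field.
target = ['Run', sprintf('%6d', 2)];

% strfind accepts a cell array of strings and returns one result per line;
% a line matches when its result is non-empty.
hits = strfind(fileLines, target);
lineNumbers = find(~cellfun('isempty', hits));
```

Here lineNumbers holds the index of every line that contains the target string, so it can have more than one entry if the string occurs on several lines.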

Subject: how to read and write a portion of a file

From: Zahra

Date: 16 Nov, 2008 03:19:01

Message: 4 of 8

Hi all,

I have now been able to write the following code to read the file and find the runs that I am looking for. The only problem is that the data file is really large (tens of thousands of lines), and because of the for loop in the code, it takes too long. Can anyone suggest another method to speed things up?
------------------------------------
fid=fopen(filename,'r');
% read every line of the file into a cell array of strings
AllString=textscan(fid,'%s','delimiter','\n');
CharString=cellstr(AllString{1});
foundrun=0;
% data==2 marks lines inside a data block; the statement that advances
% data past 0 does not appear in this post
data=0;
datastr=[];
% runi and runf are the initial and final run numbers of the data runs to read
for run=runi:runf
    string1='Run';
    RunString=sprintf('%6d',run);
    SearchString=strcat(string1,RunString);
    for i=1:length(CharString)
        if ~isempty(strfind(CharString{i},SearchString))
            foundrun=1;
        end
        if (data==2) && isempty(CharString{i})
            % a blank line ends the data block
            data=0;
            foundrun=0;
        elseif (data==2) && ~isempty(CharString{i})
            datastr=[datastr;CharString(i)];
        end
    end
end
fclose(fid);
------------------------------------------
Please see my original message at the start of this thread for the format of the data file.

Any advice is appreciated.
Thanks,
Zahra
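
One way to avoid scanning every line once per run is to locate all run headers in a single pass and then slice the cell array. The following is only a sketch: it assumes the lines are already in a cell array, that header lines look like 'Run' followed by a number, and that runs appear once each in ascending order. The variable names are illustrative.

```matlab
% Lines of a file with three runs (hard-coded for this sketch).
fileLines = {'Run     1'; 'header'; '1 5 6'; ''; ...
             'Run     2'; 'header'; '1 5 8'; ''; ...
             'Run     3'; 'header'; '2 5 9'};
runi = 2;   % first run to extract
runf = 2;   % last run to extract

% Find every run header at once and extract its run number.
tok = regexp(fileLines, '^Run\s+(\d+)\s*$', 'tokens', 'once');
isHeader = ~cellfun('isempty', tok);
headerLines = find(isHeader);
runNumbers = cellfun(@(t) str2double(t{1}), tok(isHeader));

% The selection starts at the header of runi and ends just before the
% header that follows runf (or at the last line if runf is the last run).
firstLine = headerLines(runNumbers == runi);
nextHeader = find(runNumbers == runf) + 1;
if nextHeader <= numel(headerLines)
    lastLine = headerLines(nextHeader) - 1;
else
    lastLine = numel(fileLines);
end
selected = fileLines(firstLine:lastLine);
```

With this layout the cost is one pass over the lines regardless of how many runs are requested.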


Subject: how to read and write a portion of a file

From: Andres

Date: 16 Nov, 2008 18:52:02

Message: 5 of 8

Hi Zahra,
simply speaking, you want to copy a part of the file to a new file, right?
Have you thought of

1. reading the whole file with fread into a character array (some 10000 lines do not sound _too_ large to me)
2. finding the indices of [char(10), 'Run 1000' char(13)] and [char(10), 'Run 2000' char(13)] (adjust for your exact format, line break characters, and boundary numbers)
3. copying the characters of interest with the help of those indices and writing them to the new file

Doing so would avoid any for loop and text conversion.
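
The three steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not tested against the real data: the text is a hard-coded string standing in for the result of fread, lines end in char(10) only, and the run number is right-aligned in a 6-character field after 'Run'.

```matlab
% Stand-in for the file contents as one character row vector.
txt = sprintf(['Run     1\nh1\n1 1 2\n\n', ...
               'Run     2\nh1\n1 5 8\n\n', ...
               'Run     3\nh1\n2 5 9\n']);

nl = char(10);
% Anchor the search strings between line breaks so that only complete
% header lines match.
startKey = [nl, 'Run', sprintf('%6d', 2), nl];
stopKey  = [nl, 'Run', sprintf('%6d', 3), nl];

% Prepend a line break so a header on the very first line also matches.
% An index p in padded corresponds to index p-1 in txt.
padded = [nl, txt];
s = strfind(padded, startKey);   % s(1) is also where the header starts in txt
e = strfind(padded, stopKey);
if isempty(e)
    lastChar = numel(txt);       % requested run is the last one in the file
else
    lastChar = e(1) - 1;         % line break just before the next header
end
part = txt(s(1):lastChar);
```

The variable part can then be written to the new file with fwrite.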

You could gain speed and make this suitable for really large files by analyzing only smaller portions of the file, guessing where your lines of interest are, and iterating towards them (most probably with a while loop and fseek to navigate through the file).
Maybe this is a viable solution.
Hth
Andres
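
For very large files, a simpler variant of the idea above is a plain chunked scan: read fixed-size pieces, search each piece, and use fseek only to jump back to the section of interest. This is not the guess-and-iterate search described above, just a sketch of working on small portions of the file. The sample file, chunk size, and variable names are assumptions made for the illustration.

```matlab
% Create a small sample file (contents are made up for this sketch).
fname = [tempname, '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'Run     1\n1 5 6\n\nRun     2\n1 5 8\n\nRun     3\n2 5 9\n');
fclose(fid);

keys = {['Run', sprintf('%6d', 2)], ['Run', sprintf('%6d', 3)]};
offsets = [NaN, NaN];    % zero-based byte offsets of the two headers
chunkSize = 16;          % tiny on purpose; use something much larger in practice
overlap = max(cellfun('length', keys)) - 1;

fid = fopen(fname, 'r');
pos = 0;                 % file offset of the first character in buf
carry = '';              % tail of the previous chunk, so that a key split
                         % across two chunks is still found
while any(isnan(offsets))
    chunk = fread(fid, chunkSize, '*char').';
    if isempty(chunk)
        break;           % end of file
    end
    buf = [carry, chunk];
    for k = 1:numel(keys)
        if isnan(offsets(k))
            idx = strfind(buf, keys{k});
            if ~isempty(idx)
                offsets(k) = pos + idx(1) - 1;
            end
        end
    end
    keep = min(overlap, numel(buf));
    carry = buf(end-keep+1:end);
    pos = pos + numel(buf) - keep;
end

% If the stop header was not found, copy up to the end of the file.
if isnan(offsets(2))
    fseek(fid, 0, 'eof');
    offsets(2) = ftell(fid);
end

% Jump to the start header and read only the bytes of interest.
fseek(fid, offsets(1), 'bof');
part = fread(fid, offsets(2) - offsets(1), '*char').';
fclose(fid);
delete(fname);
```

The sketch assumes the start header exists in the file; a real script would need to check offsets(1) before calling fseek.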

Subject: how to read and write a portion of a file

From: Zahra

Date: 17 Nov, 2008 23:18:02

Message: 6 of 8

Hi Andres,

Thanks for your suggestion. I have now written the following code (with the help of readline.m from the File Exchange), and for the same data file it takes almost 1/10 of the time of my original code with a for loop. The first part of the code determines the line numbers of the runs of interest, and the last part uses readline to read this portion of the data file; I then use dlmwrite to write what was read in exactly the same format. It would be great if it is still possible to make it faster. Any suggestion is greatly appreciated. Thanks. Zahra

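% get user input
filename= input('Enter the name of data file:');
scan1= input('Enter the initial run number:');
scan2= input('Enter the final run number:');
% the header of the run after the last requested one marks where to stop
scan2=scan2+1;
% compose the header strings to search for
string1='Run';
RunS1=sprintf('%6d',scan1);
searchS1=strcat(string1,RunS1);
RunS2=sprintf('%6d',scan2);
searchS2=strcat(string1,RunS2);
% read all lines of the file into a cell array
fid=fopen(filename,'r');
totalstring=textscan(fid,'%s','delimiter','\n');
fclose(fid);
stringchar=cellstr(totalstring{1});
totalnLines=length(stringchar);
% find the line numbers of the two run headers
Lscan1=[];
Lscan2=totalnLines+1; % used if the final run is the last one in the file
for i=1:totalnLines
    if ~isempty(strfind(stringchar{i},searchS1))
        Lscan1=i;
    end
    if ~isempty(strfind(stringchar{i},searchS2))
        Lscan2=i;
    end
end
% read the selected lines with readline.m (File Exchange) and write them out
Lines=Lscan1:Lscan2-1;
All=readline(filename,Lines,1);
dlmwrite('temp.dat', All, '')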
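
As an alternative to dlmwrite for the final step, the selected lines could be written back one at a time. This sketch assumes the lines are held in a cell array of strings (here hard-coded) and that each line should end in a single char(10); the variable and file names are illustrative only.

```matlab
% Selected lines, including a blank line that must be preserved.
selected = {'Run     2'; 'header line 1'; ''; '1 5 8'};

outName = [tempname, '.dat'];
fid = fopen(outName, 'w');
for k = 1:numel(selected)
    % Write the line text followed by a line break; a blank line becomes
    % a lone line break.
    fwrite(fid, [selected{k}, char(10)]);
end
fclose(fid);

% Read the file back to check what was written.
fid = fopen(outName, 'r');
written = fread(fid, '*char').';
fclose(fid);
delete(outName);
```

Writing each line separately avoids one pitfall of passing all lines to a single fprintf call, where empty strings contribute no data and blank lines can be lost.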

Subject: how to read and write a portion of a file

From: Andres

Date: 18 Nov, 2008 09:27:01

Message: 7 of 8

"Zahra" <zahra.yamani@nrc.gc.ca> wrote in message <gfsu3a$slj$1@fred.mathworks.com>...
> Hi Andres,
>
> Thanks for your suggestion. I now have written the follwoing code (with the help of readline.m from file exchange) and for the same data file the time that it takes is almost 1/10 of my original code with a for loop. [..]


Hi Zahra,
based on your latest code, I've just quickly coded what I suggested above.
Some remarks:
- consider using uigetfile and uiputfile for your input and output files
- I chose to always overwrite 'temp.dat' - if you want to append, use 'a' instead of 'w' in fopen
- unfortunately, multiple space characters are not displayed correctly in the matlab central newsreader, but I assume your "RunS1=sprintf('%6d',scan1);" assignment is correct

Please check for yourself whether the execution time has improved again (what are those times, btw?). As I noted, this can be further optimized for very large files with only a few lines to be extracted, but I hope this will do.
Regards
Andres


% get user input
filename= input('Enter the name of data file:');
scan1 = input('Enter the initial run number:');
scanEnd = input('Enter the final run number:');
% compose strings to search for
scan2=scanEnd+1;
string1='Run';
RunS1=sprintf('%6d',scan1);
searchS1=strcat(string1,RunS1);
RunS2=sprintf('%6d',scan2);
searchS2=strcat(string1,RunS2);
% open file for reading
fid=fopen(filename,'r');
totalstring = fread(fid, '*char').';
fclose(fid);
% determine string positions
startIndex = strfind(totalstring, searchS1);
stopIndex = strfind(totalstring, searchS2)-1;
if isempty(stopIndex) %searched number may exceed final run number of the file
    stopIndex = numel(totalstring);
end
% overwrite "temp.dat" with the desired part of the file
fid = fopen('temp.dat', 'w');
fwrite(fid, totalstring(startIndex:stopIndex));
fclose(fid);
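
To illustrate the remark about 'w' versus 'a' in fopen, here is a small sketch using a temporary file (the file name and contents are made up):

```matlab
fname = [tempname, '.dat'];

% 'w' discards any existing contents, so the second write replaces the first.
fid = fopen(fname, 'w'); fwrite(fid, sprintf('first\n'));  fclose(fid);
fid = fopen(fname, 'w'); fwrite(fid, sprintf('second\n')); fclose(fid);
fid = fopen(fname, 'r'); afterOverwrite = fread(fid, '*char').'; fclose(fid);

% 'a' keeps the existing contents and adds new data at the end.
fid = fopen(fname, 'a'); fwrite(fid, sprintf('third\n')); fclose(fid);
fid = fopen(fname, 'r'); afterAppend = fread(fid, '*char').'; fclose(fid);

delete(fname);
```

So with 'a', running the extraction several times would accumulate the selected runs in the same output file.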

Subject: how to read and write a portion of a file

From: Zahra

Date: 18 Nov, 2008 13:40:17

Message: 8 of 8

Hi Andres,

Thanks very much for your reply. Wow, your code is super fast. For the same data file, here are the times:
1. for my original code with the for loop: 48 sec
2. for my second code based on string search and readline.m: 7 sec
3. for your code: 0.1 sec

Your code will speed up my data analysis by a lot. Thanks again for all your help.
Best regards,
Zahra
