Thread Subject: ascii import

Subject: ascii import

From: Zahra

Date: 10 Nov, 2008 19:32:02

Message: 1 of 6

Hi all,

Can anyone help with the following:

I need to import data from an ASCII file. The file contains several data runs separated by headers (the header text changes from one run to the next, but the number of header lines remains the same for every run). Each run has the same number of data points. The only problem is that the data for each point is split across two lines (the line is folded), each starting with the point number in the run. Here is an example of the data file; let's assume it contains two runs with two points each:
--------------------------------
Run 1
header line 1
header line 2
Point1 v1 v2 v3 v4
Point1 v5 v6
Point2 v1' v2' v3' v4'
Point2 v5' v6'

Run 2
header line 1
header line 2
Point1 vv1 vv2 vv3 vv4
Point1 vv5 vv6
Point2 vv1' vv2' vv3' vv4'
Point2 vv5' vv6'
-------------------------------------------
and so on...

What I need is to remove all the header lines and build a matrix in which each row represents one data point and the columns hold the data values, i.e. the rows of the final matrix need to be in the following format:

row 1=[Point1 v1 v2 v3 v4 v5 v6]
row 2=[Point2 v1' v2' v3' v4' v5' v6']
row 3=[Point1 vv1 vv2 vv3 vv4 vv5 vv6]
row 4=[Point2 vv1' vv2' vv3' vv4' vv5' vv6']

I have tried different options in txt2mat, but with each option there is one problem or another.

Can anyone help?

Thanks,
Zahra



Subject: ascii import

From: NZTideMan

Date: 10 Nov, 2008 20:24:14

Message: 2 of 6


I'm not familiar with txt2mat, but this is what I'd do:

fid=fopen(txtfile,'rt');
A=zeros(16,nruns); % Allocate storage for the data
for irun=1:nruns % Loop thru the runs
  Run{irun}=fgetl(fid);
  header1{irun}=fgetl(fid);
  header2{irun}=fgetl(fid);
  A(:,irun)=fscanf(fid,'%f',16); % read in the 16 data from each run
  dum=fgetl(fid); % read in blank line and discard
end
fclose(fid);

Now you have to manipulate A into the form you want by eliminating
unnecessary rows, then transposing and reshaping.
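
The reshaping step could look roughly like the sketch below. It assumes A is 16-by-nruns with each column holding one run in file order (point number, four values, point number again, two values, for each of the two points); the literal matrix is only a stand-in for data read from a file.

```matlab
% Stand-in for the 16-by-nruns matrix A (assumed layout, two runs)
A = [1 1 2 3 4 1 5 6 2 1 2 3 7 2 5 7; ...
     1 1 2 3 8 1 5 8 2 1 2 3 9 2 5 9].';

% Each point occupies 8 consecutive numbers, so split every run into
% chunks of 8 and turn each chunk into one row
M = reshape(A, 8, []).';

% Column 6 is the point number repeated on the folded line; drop it
M(:,6) = [];

disp(M)
```

Because reshape works column by column, the rows come out in file order: all points of run 1 first, then run 2.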

Subject: ascii import

From: Andres

Date: 10 Nov, 2008 20:44:01

Message: 3 of 6

.
Let's make a test file:
.
Run 1
header line 1
header line 2
Point1 1 2 3 4
Point1 5 6
Point2 1 2 3 7
Point2 5 7
Run 2
header line 1
header line 2
Point1 1 2 3 8
Point1 5 8
Point2 1 2 3 9
Point2 5 9
.
Two txt2mat alternatives out of the box:
Case 1)
each interim header line can be identified by a set of key phrases, here:
'Run' and 'header'
.
t2mOptions = {
    'NumHeaderLines',3, ...
    'NumColumns',8, ...
    'ReplaceExpr',{{'Point',' '}}, ...
    'BadLineString',{'Run','header'}, ...
    'ReadMode','block', ...
    };
A = txt2mat('zahra.txt',t2mOptions{:});
.
gives:
     1 1 2 3 4 1 5 6
     2 1 2 3 7 2 5 7
     1 1 2 3 8 1 5 8
     2 1 2 3 9 2 5 9
Note that the point number is repeated in the 6th column; you'll probably want to delete this column.
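
Dropping the repeated point number is a one-line indexing operation. A minimal sketch, with the matrix typed in by hand to match the output shown above instead of being read by txt2mat:

```matlab
% Matrix as shown in the Case 1 output above (typed in as a stand-in)
A = [1 1 2 3 4 1 5 6; ...
     2 1 2 3 7 2 5 7; ...
     1 1 2 3 8 1 5 8; ...
     2 1 2 3 9 2 5 9];

% Sanity check: column 6 really is a copy of the point number in column 1
sameAsPointNo = isequal(A(:,1), A(:,6));

% Delete the redundant column, leaving 7 columns per data point
if sameAsPointNo
    A(:,6) = [];
end
```

The check is optional, but it guards against deleting real data if a file happens to be folded differently.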
.
Case 2)
the interim header lines cannot be identified by any special phrase or character
.
t2mOptions = {
    'NumHeaderLines',3, ...
    'NumColumns',-1, ...
    'ReplaceExpr',{{'Point',' '}}, ...
    'ReadMode','line', ...
    };
B = txt2mat('zahra.txt',t2mOptions{:});
.
gives
     1 1 2 3 4
     1 5 6 NaN NaN
     2 1 2 3 7
     2 5 7 NaN NaN
   NaN NaN NaN NaN NaN
   NaN NaN NaN NaN NaN
   NaN NaN NaN NaN NaN
     1 1 2 3 8
     1 5 8 NaN NaN
     2 1 2 3 9
     2 5 9 NaN NaN
.
this obviously needs further post-processing with elementary matrix indexing operations, but all your numbers are there.
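
One possible form of that post-processing is sketched here. It assumes the folding is always 5 numbers on the first line and 3 on the second, as in the test file; B is typed in to match the output above.

```matlab
% Matrix as shown in the Case 2 output above (typed in as a stand-in)
B = [1 1 2 3 4; 1 5 6 NaN NaN; ...
     2 1 2 3 7; 2 5 7 NaN NaN; ...
     NaN(3,5); ...
     1 1 2 3 8; 1 5 8 NaN NaN; ...
     2 1 2 3 9; 2 5 9 NaN NaN];

% Remove the rows that came from interim header lines (all NaN)
B = B(~all(isnan(B), 2), :);

% Odd rows hold the first line of each point, even rows the folded line.
% From the folded line keep only columns 2:3 (column 1 repeats the point
% number, columns 4:5 are NaN padding).
C = [B(1:2:end, :), B(2:2:end, 2:3)];
```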
.
I hope you can adapt this to your problem. There are of course many possibilities without txt2mat, as I just read in the reply above.
Regards
Andres

Subject: ascii import

From: Zahra

Date: 10 Nov, 2008 21:30:03

Message: 4 of 6

Hi Andres,

Thanks for your reply.

I should have been more precise in defining my data files. They actually look like:
--------------------------

Run 1
header line 1
header line 2

Point zeta eta gamma delta
Point sig1 sig2
1 1 2 3 4
1 5 6
2 1 2 3 7
2 5 7

Run 2
header line 1
header line 2

Point zeta eta gamma delta
Point sig1 sig2
1 1 2 3 8
1 5 8
2 1 2 3 9
2 5 9
---------------------------------

i.e. the word "Point" is not repeated at the beginning of each data line, only in the column header lines; the data lines start with the actual point number, as shown above. Will it still be possible to build the data matrix in the same way as for the file you had assumed?

Thanks again.
Zahra


Subject: ascii import

From: Andres

Date: 11 Nov, 2008 08:31:02

Message: 5 of 6

This should just make things easier.
- 1st attempt
you can identify the header lines (e.g. by the character 'n' as in 'Run','Line','Point', which must not appear inside the number lines) and you know how many numbers you want to put into a single row of the matrix (8):
.
t2mOptions = {
    'NumHeaderLines',5, ... % 0 would be ok, too
    'NumColumns',8, ...
    'BadLineString',{'n'}, ... % or e.g. {'Run','Line','Point' }
    'ReadMode','block', ...
    };
A = txt2mat('zahra2.txt',t2mOptions{:});
.
gives
     1 1 2 3 4 1 5 6
     2 1 2 3 7 2 5 7
     1 1 2 3 8 1 5 8
     2 1 2 3 9 2 5 9
.
You may have to modify this according to your specific file.
.
- 2nd attempt (more general, but a bit slower) - you just know you want to combine two consecutive rows:
t2mOptions = {
    'NumHeaderLines',5, ... % 0 would be ok, too
    'NumColumns',-1, ...
    'ReadMode','line', ...
    };
B = txt2mat('zahra2.txt',t2mOptions{:});
% now rearrange numbers:
dataRow = find(isfinite(B(:,1)));
B = B(dataRow,:);
dataCol{1} = find(any(isfinite(B(1:2:end,:)),1));
dataCol{2} = find(any(isfinite(B(2:2:end,:)),1));
B = [B(1:2:end,dataCol{1}),B(2:2:end,dataCol{2})];
.
gives the same result for B. But note that no data position in your file may consist entirely of NaNs, otherwise its column would be omitted.
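
For comparison, the same pairing of folded lines can be sketched in plain MATLAB without txt2mat. This is only an illustration under a few assumptions: a line counts as data if sscanf can read it completely as numbers, data lines always come in pairs, and each pair holds 5 + 3 numbers. The temporary file name and the variable names are made up for the example.

```matlab
% Write a small test file in the layout of message 4 (temporary file)
fname = [tempname '.txt'];
txt = sprintf(['\nRun 1\nheader line 1\nheader line 2\n\n', ...
    'Point zeta eta gamma delta\nPoint sig1 sig2\n', ...
    '1 1 2 3 4\n1 5 6\n2 1 2 3 7\n2 5 7\n\n', ...
    'Run 2\nheader line 1\nheader line 2\n\n', ...
    'Point zeta eta gamma delta\nPoint sig1 sig2\n', ...
    '1 1 2 3 8\n1 5 8\n2 1 2 3 9\n2 5 9\n']);
fid = fopen(fname, 'wt');
fprintf(fid, '%s', txt);
fclose(fid);

% Read line by line and keep only the lines that parse fully as numbers
fid = fopen(fname, 'rt');
numLines = {};
tline = fgetl(fid);
while ischar(tline)
    [vals, count, errmsg] = sscanf(tline, '%f');
    if count > 0 && isempty(errmsg)
        numLines{end+1} = vals.'; %#ok<AGROW>
    end
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);

% Join each pair of folded lines, dropping the repeated point number
nPts = numel(numLines) / 2;
C = zeros(nPts, 7);
for k = 1:nPts
    first  = numLines{2*k-1};
    second = numLines{2*k};
    C(k,:) = [first, second(2:end)];
end
```

Header lines that happen to consist only of numbers would be mistaken for data here, so this approach fits files like the example better than files with arbitrary headers.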
Hth
Andres

Subject: ascii import

From: Zahra

Date: 11 Nov, 2008 13:38:02

Message: 6 of 6

Hi Andres,

The second method that you described in your last message works perfectly with my data files which have complicated header lines.

Indeed txt2mat.m is a powerful code.

Thanks very much again for all your help.

Zahra
