Thread Subject:
ascii import

Subject: ascii import

From: Zahra

Date: 10 Nov, 2008 19:32:02

Message: 1 of 6

Hi all,

Can anyone help with the following?

I need to import data from an ASCII file. The file contains several data runs, each preceded by a header (the header text changes from one run to the next, but the number of header lines stays the same). Each run has the same number of data points. The only problem is that each data point is spread over two lines (the line is folded), and both lines start with the point number in the run. Here is an example of the data file; let's assume it contains two runs with two points each:
--------------------------------
Run 1
header line 1
header line 2
Point1 v1 v2 v3 v4
Point1 v5 v6
Point2 v1' v2' v3' v4'
Point2 v5' v6'

Run 2
header line 1
header line 2
Point1 vv1 vv2 vv3 vv4
Point1 vv5 vv6
Point2 vv1' vv2' vv3' vv4'
Point2 vv5' vv6'
-------------------------------------------
and so on...

What I need is to remove all the header lines and get a matrix in which each row represents a data point and the columns hold the data values, i.e. the rows of the final matrix need to be in the following format:

row 1=[Point1 v1 v2 v3 v4 v5 v6]
row 2=[Point2 v1' v2' v3' v4' v5' v6']
row 3=[Point1 vv1 vv2 vv3 vv4 vv5 vv6]
row 4=[Point2 vv1' vv2' vv3' vv4' vv5' vv6']

I have tried different options in txt2mat, but with each option there is one problem or another.

Can anyone help?

Thanks,
Zahra

Subject: ascii import

From: NZTideMan

Date: 10 Nov, 2008 20:24:14

Message: 2 of 6

On Nov 11, 8:32 am, "Zahra" <zahra.yam...@nrc.gc.ca> wrote:
> Hi all,
>
> Can any one help for the following:
>
> I need to import data from an ascii file. The file includes different data runs that are separated by a header (the text of the header changes from one run to the next but the number of header lines remain the same for different runs). Then each run has the same number of data points. The only problem is that for each data point, the data is given in two lines (there is a line folding) starting with the point number in the run. Here is an example of the data file, lets assume there are two runs in it and each run has two points:
> --------------------------------
> Run 1
> header line 1
> header line 2
> Point1 v1 v2 v3 v4
> Point1 v5 v6
> Point2 v1' v2' v3' v4'
> Point2 v5' v6'
>
> Run 2
> header line 1
> header line 2
> Point1 vv1 vv2 vv3 vv4
> Point1 vv5 vv6
> Point2 vv1' vv2' vv3' vv4'
> Point2 vv5' vv6'
> -------------------------------------------
> and so on...
>
> What I need is to remove all the header lines and have a matrix that each of its rows represents a data point and its column are the data values, i.e. the rows of the final matrix need to be in the following format:
>
> row 1=[Point1 v1 v2 v3 v4 v5 v6]
> row 2=[Point2 v1' v2' v3' v4' v5' v6']
> row 3=[Point1 vv1 vv2 vv3 vv4 vv5 vv6]
> row 4=[Point2 vv1' vv2' vv3' vv4' vv5' vv6']
>
> I have tried different options in txt2mat, but with each option there is one problem or another.
>
> Can any one help?
>
> Thanks,
> Zahra

I'm not familiar with txt2mat, but this is what I'd do:

fid=fopen(txtfile,'rt');
A=zeros(16,nruns); % Allocate storage for the data
for irun=1:nruns % Loop thru the runs
  Run{irun}=fgetl(fid);
  header1{irun}=fgetl(fid);
  header2{irun}=fgetl(fid);
  A(:,irun)=fscanf(fid,'%f',16); % read in the 16 data from each run
  dum=fgetl(fid); % read in blank line and discard
end
fclose(fid);

Now you have to manipulate A into the form you want by eliminating
unnecessary rows, then transposing and reshaping.
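
The reshaping step could look roughly like the sketch below. It assumes the point labels are plain numbers (so that fscanf with '%f' can read them) and that each run holds two points of 5 + 3 values, which gives the 16 values per run used above. The matrix A is filled in by hand here as a hypothetical stand-in for what the loop would return; nPerPoint and M are names made up for this illustration.

```matlab
% Hypothetical A as the loop above might return it for two runs
% (one column per run, 16 values each); values are made up for illustration
A = [1 1 2 3 4 1 5 6 2 1 2 3 7 2 5 7; ...
     1 1 2 3 8 1 5 8 2 1 2 3 9 2 5 9].';

nPerPoint = 8;                      % 5 values on the first line + 3 on the second
M = reshape(A, nPerPoint, []).';    % one row per data point, runs stacked in order
M(:, 6) = [];                       % drop the point number repeated on the second line
```

Because reshape works column by column, the points of run 1 come first, followed by those of run 2, so the row order matches the order in the file.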

Subject: ascii import

From: Andres

Date: 10 Nov, 2008 20:44:01

Message: 3 of 6

"Zahra" <zahra.yamani@nrc.gc.ca> wrote in message <gfa27i$kr6$1@fred.mathworks.com>...
> Hi all,
>
> Can any one help for the following:
>
> I need to import data from an ascii file. The file includes different data runs that are separated by a header (the text of the header changes from one run to the next but the number of header lines remain the same for different runs). Then each run has the same number of data points. The only problem is that for each data point, the data is given in two lines (there is a line folding) starting with the point number in the run. Here is an example of the data file, lets assume there are two runs in it and each run has two points:
> --------------------------------
> Run 1
> header line 1
> header line 2
> Point1 v1 v2 v3 v4
> Point1 v5 v6
> Point2 v1' v2' v3' v4'
> Point2 v5' v6'
>
> Run 2
> header line 1
> header line 2
> Point1 vv1 vv2 vv3 vv4
> Point1 vv5 vv6
> Point2 vv1' vv2' vv3' vv4'
> Point2 vv5' vv6'
> -------------------------------------------
> and so on...
>
> What I need is to remove all the header lines and have a matrix that each of its rows represents a data point and its column are the data values, i.e. the rows of the final matrix need to be in the following format:
>
> row 1=[Point1 v1 v2 v3 v4 v5 v6]
> row 2=[Point2 v1' v2' v3' v4' v5' v6']
> row 3=[Point1 vv1 vv2 vv3 vv4 vv5 vv6]
> row 4=[Point2 vv1' vv2' vv3' vv4' vv5' vv6']
>
.
Let's make a test file:
.
Run 1
header line 1
header line 2
Point1 1 2 3 4
Point1 5 6
Point2 1 2 3 7
Point2 5 7
Run 2
header line 1
header line 2
Point1 1 2 3 8
Point1 5 8
Point2 1 2 3 9
Point2 5 9
.
Two txt2mat alternatives out of the box:
Case 1)
each interim header line can be identified by a set of key phrases, here:
'Run' and 'header'
.
t2mOptions = {
    'NumHeaderLines',3, ...
    'NumColumns',8, ...
    'ReplaceExpr',{{'Point',' '}}, ...
    'BadLineString',{'Run','header'}, ...
    'ReadMode','block', ...
    };
A = txt2mat('zahra.txt',t2mOptions{:});
.
gives:
     1 1 2 3 4 1 5 6
     2 1 2 3 7 2 5 7
     1 1 2 3 8 1 5 8
     2 1 2 3 9 2 5 9
Note that the point number is repeated in the 6th column, you'll probably want to delete this column.
.
Case 2)
the interim header lines cannot be identified by any special phrase or character
.
t2mOptions = {
    'NumHeaderLines',3, ...
    'NumColumns',-1, ...
    'ReplaceExpr',{{'Point',' '}}, ...
    'ReadMode','line', ...
    };
B = txt2mat('zahra.txt',t2mOptions{:});
.
gives
     1 1 2 3 4
     1 5 6 NaN NaN
     2 1 2 3 7
     2 5 7 NaN NaN
   NaN NaN NaN NaN NaN
   NaN NaN NaN NaN NaN
   NaN NaN NaN NaN NaN
     1 1 2 3 8
     1 5 8 NaN NaN
     2 1 2 3 9
     2 5 9 NaN NaN
.
this obviously needs further post-processing with elementary matrix indexing operations, but all your numbers are there.
.
I hope you can adapt this to your problem. There are many possibilities without txt2mat of course - as I just read one above.
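.
As one such possibility, here is a minimal sketch that uses only base MATLAB functions (fgetl, strrep, sscanf). It is not how txt2mat works internally; it simply keeps every line that parses completely as numbers once 'Point' has been blanked out, and assumes each data point occupies exactly two consecutive numeric lines. The temporary file and the names fname, rows and M are made up for this illustration.
.
```matlab
% Write the test file from this post to a temporary location
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '%s\n', ...
    'Run 1', 'header line 1', 'header line 2', ...
    'Point1 1 2 3 4', 'Point1 5 6', 'Point2 1 2 3 7', 'Point2 5 7', ...
    'Run 2', 'header line 1', 'header line 2', ...
    'Point1 1 2 3 8', 'Point1 5 8', 'Point2 1 2 3 9', 'Point2 5 9');
fclose(fid);

% Read line by line and keep only lines that parse completely as numbers
fid = fopen(fname, 'rt');
rows = {};
while true
    txt = fgetl(fid);
    if ~ischar(txt), break; end              % fgetl returns -1 at end of file
    txt = strrep(txt, 'Point', ' ');         % same idea as the 'ReplaceExpr' option
    [vals, ~, errmsg] = sscanf(txt, '%f');
    if ~isempty(vals) && isempty(errmsg)     % header and blank lines are skipped
        rows{end+1} = vals.'; %#ok<AGROW>
    end
end
fclose(fid);
delete(fname);

% Each data point spans two consecutive numeric lines: join them side by side
M = [vertcat(rows{1:2:end}), vertcat(rows{2:2:end})];
```
.
As in Case 1, the point number shows up a second time in the 6th column of M. A header line that happens to consist of numbers only would be mistaken for data, so this only suits files whose headers contain some text.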
Regards
Andres

Subject: ascii import

From: Zahra

Date: 10 Nov, 2008 21:30:03

Message: 4 of 6

Hi Andres,

Thanks for your reply.

I should have been more precise in defining my data files. They actually look like:
--------------------------

Run 1
header line 1
header line 2

Point zeta eta gamma delta
Point sig1 sig2
1 1 2 3 4
1 5 6
2 1 2 3 7
2 5 7

Run 2
header line 1
header line 2

Point zeta eta gamma delta
Point sig1 sig2
1 1 2 3 8
1 5 8
2 1 2 3 9
2 5 9
---------------------------------

i.e. the word "Point" is not repeated at the beginning of each data line; it appears only in the header lines, and the actual point numbers in the run are recorded as shown above. Will it still be possible to build the data matrix as you did for the file you had assumed?

Thanks again.
Zahra

Subject: ascii import

From: Andres

Date: 11 Nov, 2008 08:31:02

Message: 5 of 6

This should just make things easier.
- 1st attempt
You can identify the header lines (e.g. by the character 'n', as in 'Run', 'Line', 'Point', which must not appear in the number lines), and you know how many numbers you want to put into a single row of the matrix (8):
.
t2mOptions = {
    'NumHeaderLines',5, ... % 0 would be ok, too
    'NumColumns',8, ...
    'BadLineString',{'n'}, ... % or e.g. {'Run','Line','Point' }
    'ReadMode','block', ...
    };
A = txt2mat('zahra2.txt',t2mOptions{:});
.
gives
     1 1 2 3 4 1 5 6
     2 1 2 3 7 2 5 7
     1 1 2 3 8 1 5 8
     2 1 2 3 9 2 5 9
.
You may have to modify this according to your specific file.
.
- 2nd attempt (more general, but a bit slower) - you just know you want to combine two consecutive rows:
t2mOptions = {
    'NumHeaderLines',5, ... % 0 would be ok, too
    'NumColumns',-1, ...
    'ReadMode','line', ...
    };
B = txt2mat('zahra2.txt',t2mOptions{:});
% now rearrange numbers:
dataRow = find(isfinite(B(:,1)));
B = B(dataRow,:);
dataCol{1} = find(any(isfinite(B(1:2:end,:)),1));
dataCol{2} = find(any(isfinite(B(2:2:end,:)),1));
B = [B(1:2:end,dataCol{1}),B(2:2:end,dataCol{2})];
.
gives the same result for B. But note that no data position in your file may consist entirely of NaNs; otherwise its column would be omitted.
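.
To illustrate that caveat, here is a small sketch with a hand-made B (hypothetical values, header rows already removed) in which the last value of every second line is NaN. The automatic column detection then drops that position. Hard-coding the column counts, here assumed to be 5 values on the first line and 3 on the second, keeps it.
.
```matlab
% Hypothetical line-mode result: the third value of every second line is NaN
B = [1 1 2   3   4; ...
     1 5 NaN NaN NaN; ...
     2 1 2   3   7; ...
     2 5 NaN NaN NaN];

% Automatic detection as in the 2nd attempt: the all-NaN position is lost
c1 = find(any(isfinite(B(1:2:end,:)),1));
c2 = find(any(isfinite(B(2:2:end,:)),1));
Bauto = [B(1:2:end,c1), B(2:2:end,c2)];       % 2-by-7

% Fixed column counts (assumed known in advance): the NaN column is kept
Bfixed = [B(1:2:end,1:5), B(2:2:end,1:3)];    % 2-by-8
```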
Hth
Andres

Subject: ascii import

From: Zahra

Date: 11 Nov, 2008 13:38:02

Message: 6 of 6

Hi Andres,

The second method that you described in your last message works perfectly with my data files which have complicated header lines.

Indeed txt2mat.m is a powerful code.

Thanks very much again for all your help.

Zahra
