Path: news.mathworks.com!not-for-mail
From: <HIDDEN>
Newsgroups: comp.soft-sys.matlab
Subject: Re: ascii import
Date: Mon, 10 Nov 2008 20:44:01 +0000 (UTC)
Organization: Pierburg GmbH
Lines: 100
Message-ID: <gfa6eh$nhu$1@fred.mathworks.com>
References: <gfa27i$kr6$1@fred.mathworks.com>
Reply-To: <HIDDEN>
NNTP-Posting-Host: webapp-02-blr.mathworks.com
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 8bit
X-Trace: fred.mathworks.com 1226349841 24126 172.30.248.37 (10 Nov 2008 20:44:01 GMT)
X-Complaints-To: news@mathworks.com
NNTP-Posting-Date: Mon, 10 Nov 2008 20:44:01 +0000 (UTC)
X-Newsreader: MATLAB Central Newsreader 872224
Xref: news.mathworks.com comp.soft-sys.matlab:500076


"Zahra" <zahra.yamani@nrc.gc.ca> wrote in message <gfa27i$kr6$1@fred.mathworks.com>...
> Hi all,
> 
> Can anyone help with the following:
> 
> I need to import data from an ASCII file. The file contains several data runs, each preceded by a header (the header text changes from one run to the next, but the number of header lines is the same for every run). Each run has the same number of data points. The only problem is that each data point is spread over two lines (the line is folded), and both lines start with the point number within the run. Here is an example of the data file; let's assume it contains two runs with two points each:
> --------------------------------
> Run   1   
> header line 1
> header line 2
> Point1   v1   v2   v3   v4 
> Point1   v5   v6 
> Point2   v1'   v2'   v3'   v4' 
> Point2   v5'   v6' 
> 
> Run  2   
> header line 1
> header line 2
> Point1   vv1   vv2   vv3   vv4 
> Point1   vv5   vv6 
> Point2   vv1'   vv2'   vv3'   vv4' 
> Point2   vv5'   vv6'
> -------------------------------------------
> and so on...
> 
> What I need is to remove all the header lines and get a matrix in which each row represents a data point and the columns hold the data values, i.e. the rows of the final matrix need to be in the following format:
> 
> row 1=[Point1   v1   v2   v3   v4   v5   v6]
> row 2=[Point2   v1'   v2'   v3'   v4'   v5'   v6']
> row 3=[Point1   vv1   vv2   vv3   vv4   vv5   vv6]
> row 4=[Point2   vv1'   vv2'   vv3'   vv4'   vv5'   vv6']
> 
.
Let's make a test file:
.
Run 1
header line 1
header line 2
Point1 1 2 3 4
Point1 5 6
Point2 1 2 3 7
Point2 5 7
Run 2
header line 1
header line 2
Point1 1 2 3 8
Point1 5 8
Point2 1 2 3 9
Point2 5 9
.
Two txt2mat alternatives out of the box:
Case 1)
each interim header line can be identified by a set of key phrases, here:
'Run' and 'header'
.
t2mOptions = {
    'NumHeaderLines',3, ...
    'NumColumns',8, ...
    'ReplaceExpr',{{'Point','    '}}, ...
    'BadLineString',{'Run','header'}, ...
    'ReadMode','block', ...
    };
A = txt2mat('zahra.txt',t2mOptions{:});
.
gives:
     1     1     2     3     4     1     5     6
     2     1     2     3     7     2     5     7
     1     1     2     3     8     1     5     8
     2     1     2     3     9     2     5     9
Note that the point number is repeated in the 6th column (it comes from the second line of each point), so you'll probably want to delete this column.
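.
Deleting the repeated point number is a single indexing operation. A minimal sketch, assuming A is the 4-by-8 matrix shown above (typed in by hand here so the snippet runs without txt2mat or the test file):

```matlab
% A as shown in the Case 1 output above (entered manually for illustration)
A = [1 1 2 3 4 1 5 6
     2 1 2 3 7 2 5 7
     1 1 2 3 8 1 5 8
     2 1 2 3 9 2 5 9];

% Delete column 6, the point number repeated from the second line of each point
A(:,6) = [];
```

Each row of A then has the layout [point number, v1 ... v6] asked for in the question.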
.
Case 2) 
the interim header lines cannot be identified by any special phrase or character
.
t2mOptions = {
    'NumHeaderLines',3, ...
    'NumColumns',-1, ...
    'ReplaceExpr',{{'Point','    '}}, ...
    'ReadMode','line', ...
    };
B = txt2mat('zahra.txt',t2mOptions{:});
.
gives
     1     1     2     3     4
     1     5     6   NaN   NaN
     2     1     2     3     7
     2     5     7   NaN   NaN
   NaN   NaN   NaN   NaN   NaN
   NaN   NaN   NaN   NaN   NaN
   NaN   NaN   NaN   NaN   NaN
     1     1     2     3     8
     1     5     8   NaN   NaN
     2     1     2     3     9
     2     5     9   NaN   NaN
.
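This obviously needs further post-processing with elementary matrix indexing operations (the three all-NaN rows are the interim header lines of run 2, and each point still occupies two rows), but all your numbers are there.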
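.
One way to do that post-processing is sketched below. It assumes that B is the 11-by-5 matrix shown above (entered by hand here so the snippet runs without txt2mat), that every point occupies exactly two consecutive rows, and that no genuine data row consists only of NaNs. The variable names firstLines, secondLines and C are just illustrative.

```matlab
% B as shown in the Case 2 output above (entered manually for illustration)
B = [  1   1   2   3   4
       1   5   6 NaN NaN
       2   1   2   3   7
       2   5   7 NaN NaN
     NaN NaN NaN NaN NaN
     NaN NaN NaN NaN NaN
     NaN NaN NaN NaN NaN
       1   1   2   3   8
       1   5   8 NaN NaN
       2   1   2   3   9
       2   5   9 NaN NaN];

% Drop the rows that are entirely NaN (the interim header lines)
B(all(isnan(B),2),:) = [];

% Odd rows hold the first line of each point: point number, v1 ... v4
firstLines  = B(1:2:end,:);
% Even rows hold the second line: take v5 and v6, skip point number and NaN padding
secondLines = B(2:2:end,2:3);

% One row per data point: [point number, v1 ... v6]
C = [firstLines, secondLines];
```

If your real file has a different number of values on the folded line, adjust the column range 2:3 accordingly.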
.
I hope you can adapt this to your problem. There are of course many other ways to do this without txt2mat, as I just read in a post above.
Regards
Andres