Thread Subject:
textscan with fixed width columns

Subject: textscan with fixed width columns

From: Benjamin

Date: 11 May, 2009 17:45:05

Message: 1 of 8

Hi folks,

I have a problem importing data from a text file (HITRAN database). The format is fixed width, non-delimited, with leading zeros. Here is an example of the first 5 rows and the first 7 columns of such a file:

 23 0.736369 6.284E-33 3.699E-14.09490.128 ...
 24 0.757206 1.307E-34 1.240E-14.09490.128 ...
 24 0.757232 1.743E-34 1.240E-14.09490.128 ...
 24 0.757245 8.715E-35 1.240E-14.09490.128 ...
 23 1.472734 4.000E-32 2.840E-13.08930.123 ...

The first field is an unsigned int with column width 2, the second is an unsigned int with column width 1, then a floating-point value with width 12 follows, and so on. To make it clear, here is the first line with '|' as delimiter:

 2|3| 0.736369| 6.284E-33| 3.699E-14|.0949|0.128 ...

I tried to import this data with

textscan(fid,'%2u%1u%12f%10f%10f%5f%5f%*[^\n]',5,'delimiter','\n','whitespace','')

The problem is that the leading space at the beginning of the line is skipped instead of being counted as part of the field, even when I set 'whitespace' to '', so the first two imported columns are

{23,24,24,24,23} {0,0,0,0,1}

instead of

{2,2,2,2,2} {3,4,4,4,3}.

So I get a shift in the columns. I can't program a workaround by simply declaring the first column to have width one ('%1u'), because the value in the first column sometimes has two digits.
Also interesting is that everything works as intended if I use

textscan(fid,'%2s%1s%12s%10s%10s%5s%5s%*[^\n]',5,'delimiter','\n','whitespace','').

But then I have to convert every column from string to its value, which is not in the spirit of textscan, I guess.
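
For reference, the string-to-number conversion mentioned above can be fairly compact. This is only a minimal sketch: it assumes the all-'%s' call returned a cell array with one cell of strings per column, and the variable C below is a hand-typed stand-in for that output (first two rows, first four columns), not something read from a file.

```matlab
% Hypothetical stand-in for the output of the all-%s textscan call:
% one cell per column, each holding a cell array of strings.
C = { {' 2'; ' 2'}, {'3'; '4'}, {' 0.736369'; ' 0.757206'}, ...
      {' 6.284E-33'; ' 1.307E-34'} };

% str2double accepts a cell array of strings and tolerates leading blanks.
cols = cellfun(@str2double, C, 'UniformOutput', false);

% Stack the converted columns side by side into one numeric matrix.
A = [cols{:}];
```

This only helps while every column is numeric; any text column would have to be left out of the cellfun call.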

Any ideas to solve this?

Subject: textscan with fixed width columns

From: Vadim Teverovsky

Date: 11 May, 2009 18:43:26

Message: 2 of 8

Could you use just %2s for the first column, and the integer formats
elsewhere? Still not ideal, but better than all of them being strings.

Subject: textscan with fixed width columns

From: Chris Kearney

Date: 11 May, 2009 20:33:46

Message: 3 of 8

Since textscan returns {23,24,24,24,23} {0,0,0,0,1} ....

take that first set of numbers and operate on them to break them down
to what you wanted.

% %u yields an integer class, and integer division rounds; convert to double first
num = double(num_from_textscan);
first_col = floor(num / 10);
second_col = num - first_col * 10;
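
One detail to watch when splitting digits this way: '%u' makes textscan return uint32 values, and the / operator on integer classes rounds to the nearest integer instead of truncating. A small sketch that stays in the integer class, using hand-typed sample values (29 is added only to show the rounding case):

```matlab
% Sample values typed in by hand, not read from a file.
num_from_textscan = uint32([23; 24; 29]);

% idivide truncates toward zero by default, unlike / on integer classes.
first_col  = idivide(num_from_textscan, uint32(10));   % tens digit
second_col = mod(num_from_textscan, uint32(10));       % ones digit

% For comparison: plain integer division rounds 2.9 up to 3.
rounded = num_from_textscan(3) / 10;
```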

Subject: textscan with fixed width columns

From: Benjamin Loehden

Date: 12 May, 2009 09:15:06

Message: 4 of 8

"Vadim Teverovsky" <vteverov@mathworks.com> wrote in message <gu9rke$1kr$1@fred.mathworks.com>...
> Could you use just %2s for the first column, and the integer formats
> elsewhere? Still not ideal, but better than all of them being strings.

I tried '%2s %1u %12f %10f %10f %5f %5f %*[^\n]' and it solves the problem for lines that look like the 5 examples I showed. But if a value in one of the following columns uses its whole width, like column 4 in this example

 2|3| 0.736369|26.284E-33| 3.699E-14|.0949|0.128

I still get the same problem for the value directly in front of it (column 3). MATLAB seems to ignore the leading whitespace and separates the fields like this:

 2|3| 0.73636926|.284E-33| 3.699E-14|.0949|0.128

So if I don't want the whitespace to be ignored, I still have to use '%s' for all columns, which is really annoying.
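
One way to sidestep the whitespace skipping altogether is to cut each line at fixed character positions and convert the pieces afterwards. The following is only a sketch under assumptions: the two sample lines are typed in by hand, and column 3 is given width 9 so that the rows match the text as it is displayed in this thread (the real file may use width 12).

```matlab
% Hand-typed sample lines; the second one has a full-width value in column 4.
rows = { ' 23 0.736369 6.284E-33 3.699E-14.09490.128'
         ' 24 0.75720626.284E-33 1.240E-14.09490.128' };

cw    = [2 1 9 10 10 5 5];     % assumed column widths
last  = cumsum(cw);            % last character index of each field
first = last - cw + 1;         % first character index of each field

A = zeros(numel(rows), numel(cw));
for r = 1:numel(rows)
    for c = 1:numel(cw)
        % Slice by position, so adjacent fields can never bleed together.
        A(r, c) = str2double(rows{r}(first(c):last(c)));
    end
end
```

When reading from a file, each line could be fetched with fgetl and sliced the same way.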

Subject: textscan with fixed width columns

From: Benjamin Loehden

Date: 12 May, 2009 09:18:01

Message: 5 of 8

Chris Kearney <Stickman84@gmail.com> wrote in message <845877a2-80c5-4659-98b1-1619c606ccae@u39g2000pru.googlegroups.com>...
> Since textscan returns {23,24,24,24,23} {0,0,0,0,1} ....
>
> take that first set of numbers and operate on them to break them down
> to what you wanted.
>
> first_col = floor(num_from_textscan / 10);
> second_col = num_from_textscan - first_col * 10;

The problem with this is that the second cell array has totally wrong values. The whitespace seems to be ignored, and all following columns get shifted depending on how the value of each column fits into its fixed width.

Subject: textscan with fixed width columns

From: Andres

Date: 12 May, 2009 17:09:01

Message: 6 of 8

Hi Benjamin,

I suggest you replace the spaces with '0's before you convert the file.

You can still use textscan for this; I am just writing it down here with my function txt2mat, which has a replacement option built in
( http://www.mathworks.de/matlabcentral/fileexchange/18430 ):

fn = 'C:\temp\benjamin.txt';
cw = [2 1 9 10 10 5 5]; % column widths

t2mOpts = ...
    {'NumHeaderLines', 0, ...
     'NumColumns' , numel(cw), ...
     'ReplaceChar' , {' 0'}, ...
     'ConvString' , sprintf('%%%gf ', cw)};
       
A = txt2mat(fn, t2mOpts{:});

 
I checked the code on a slightly modified example:

 23 0.73636996.284E-33 3.699E-14.09490.128
 24 0.757206 1.307E-34 1.240E-14.09490.128
 24 0.757232 1.743E-34 1.240E-14.09490.128
924 0.757245 8.715E-35 1.240E-14.09490.128
 23 1.472734 4.000E-3292.840E-13.08930.123

and it yields

A(:,1) = [2 2 2 92 2].'
A(:,2) = [3 4 4 4 3].'
etc.
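
The same replace-then-convert idea can be sketched without txt2mat, using only strrep and sscanf on a single line. This is an illustration, not how txt2mat works internally; the sample line is the first row of the modified example above, typed in by hand.

```matlab
% First row of the modified example, typed in by hand.
txt = ' 23 0.73636996.284E-33 3.699E-14.09490.128';
cw  = [2 1 9 10 10 5 5];            % column widths

% Replace every blank with '0' so each field fills its full width.
padded = strrep(txt, ' ', '0');

% Build '%2f%1f%9f%10f%10f%5f%5f' and let sscanf honor the field widths.
fmt  = sprintf('%%%df', cw);
vals = sscanf(padded, fmt).';
```

As with the txt2mat version, this assumes purely numeric fields and no ' -' (space followed by minus) in the data.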

Regards
Andres

Subject: textscan with fixed width columns

From: Andres

Date: 12 May, 2009 17:35:02

Message: 7 of 8

"Andres" <rantore@werb.deNoRs> wrote in message <gucafd$ieg$1@fred.mathworks.com>...
> Hi Benjamin,
>
> I suggest you replace the spaces with '0's before you convert the file.

assumptions made:
- you solely need to convert to numbers, not strings
- no ' -' (space + minus) occurs (perhaps care for that with further replacements)

Subject: textscan with fixed width columns

From: Benjamin Loehden

Date: 13 May, 2009 15:38:02

Message: 8 of 8

"Andres" <rantore@werb.deNoRs> wrote in message <gucc06$du$1@fred.mathworks.com>...
> "Andres" <rantore@werb.deNoRs> wrote in message <gucafd$ieg$1@fred.mathworks.com>...
> > Hi Benjamin,
> >
> > I suggest you replace the spaces with '0's before you convert the file.
>
> assumptions made:
> - you solely need to convert to numbers, not strings
> - no ' -' (space + minus) occurs (perhaps care for that with further replacements)

Thanks a lot for your reply, but assumption no. 1 does not hold, as columns 11-14 and 18 are string columns which do (and should) contain whitespace characters. Anyway, I will have a look at your interesting code and may find a solution for me.
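
For a file that mixes numeric and string columns, slicing by character position keeps the blanks inside the text fields intact. The sketch below uses a made-up four-column layout (three numeric fields plus one 6-character text field); the widths, the isText mask and the sample rows are all hypothetical and are not the real HITRAN layout.

```matlab
% Hypothetical rows: three numeric fields, then a text field with blanks.
rows   = { ' 23 0.736369 ab  c'
           ' 24 0.757206x y z ' };
cw     = [2 1 9 6];                   % assumed column widths
isText = [false false false true];    % which fields stay as strings

last  = cumsum(cw);
first = last - cw + 1;

out = cell(numel(rows), numel(cw));
for r = 1:numel(rows)
    for c = 1:numel(cw)
        field = rows{r}(first(c):last(c));
        if isText(c)
            out{r, c} = field;              % keep blanks untouched
        else
            out{r, c} = str2double(field);  % numeric fields become doubles
        end
    end
end
```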
