How to read in txt file where certain columns are not separated by delimiters?

Hello,
I have some data in a .txt file that is formatted like this:
28.7 89.3 70 983 -99 -99 1008 150 150150100100 100100 50 50 25 25 25 25
The 9th field should read '150 150 100 100' and the 10th should read '100 100'. In some places there are no spaces between the numbers, so reading the file with textscan in the usual whitespace-delimited way does not work. I would like the row to read like this:
28.7 89.3 70 983 -99 -99 1008 150 150 150 100 100 100 100 50 50 25 25 25 25
where each space represents a different column.
Any suggestions? Thanks very much.
  1 Comment
Geoff on 24 Jul 2012
Does it always occur in the same position? If so, you can read out 3 characters for each of those fields and then convert them to integers. There are a number of ways to do that.
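One possible way to do the 3-character conversion is sscanf with a field width. This is only a sketch: it assumes the run-together field has already been pulled out of the line, and the variable names are made up for illustration.

```matlab
% Fused field taken from the sample line in the question.
fused = '150150100100';

% '%3d' tells sscanf to consume at most 3 characters per integer,
% so the string is read as consecutive 3-digit numbers.
nums = sscanf(fused, '%3d').';   % transpose to get a row vector
```

This relies on every value in the fused field being exactly 3 characters wide, which holds for the sample but may not hold for the whole file.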


Answers (2)

Image Analyst on 24 Jul 2012
You can read in the line as a string, use strfind() to locate the 8th space, take the next word, and insert the spaces you need, for example after every third character. You might find John D'Errico's allwords very useful: http://www.mathworks.com/matlabcentral/fileexchange/27184-allwords If you use allwords, just take the 9th word and insert spaces. Once you have fixed the line by inserting the needed spaces, you can use textscan() to parse it.
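A rough sketch of that idea follows. It swaps in the built-in strsplit and strjoin (available in newer MATLAB releases) in place of allwords, and it assumes that the fused words are the 9th and 10th and that each value in them is exactly 3 digits wide, as in the sample line. The variable names are illustrative only.

```matlab
str = '28.7 89.3 70 983 -99 -99 1008 150 150150100100 100100 50 50 25 25 25 25';

% Split the line into whitespace-separated words.
words = strsplit(strtrim(str));

% Assumption: words 9 and 10 are the run-together fields.
for k = [9 10]
    % Put a space after every group of 3 digits, then drop the trailing one.
    words{k} = strtrim(regexprep(words{k}, '(\d{3})', '$1 '));
end

% Reassemble the repaired line and parse it as plain numbers.
fixed = strjoin(words, ' ');
vals  = sscanf(fixed, '%f').';
```

After the repair, `fixed` is an ordinary space-delimited line, so textscan() or sscanf() can parse it without any width information.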

per isakson on 24 Jul 2012
Edited: per isakson on 24 Jul 2012
There must be a better way to read fixed-format text (Fortran style). It should be possible this way, I think. However, it is error prone: I had to use a width of "3" in one position where I counted "4" (found by trial and error), and the lines get too long. Maybe every %u should be replaced by %d to account for negative values such as -99.
str = '28.7 89.3 70 983 -99 -99 1008 150 150150100100 100100 50 50 25 25 25 25';
% 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0
%
% 28.7 89.3 70 983 -99 -99 1008 150 150 150 100 100 100 100 50 50 25 25 25 25
%
cac = textscan( str, '%4f%5f%3d%4d%4d%4d%5d%4u%3u%3u%3u%3u%3u%3u%3u%3u%3u%3u%3u%3u' ...
, 'Delimiter', '', 'Whitespace', '' )
cac =
Columns 1 through 11
[28.7000] [89.3000] [70] [983] [-99] [-99] [1008] [150] [150] [150] [100]
Columns 12 through 20
[100] [100] [100] [50] [50] [25] [25] [25] [25]
>>
Edit: at first I thought this was an error, but it seems to return the correct result.
It is also possible to
  1. read the full line as one string,
  2. use the known widths to chop it into columns, and
  3. convert the pieces to numbers.
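Those three steps could be sketched as below. The widths are an assumption: they were counted from the sample line exactly as it is displayed in this thread, so the real file, whose spacing may differ, will likely need different values. The names `txt`, `widths`, and `fields` are illustrative.

```matlab
% Step 1: the full line as one string (from a file: txt = fgetl(fid)).
txt = '28.7 89.3 70 983 -99 -99 1008 150 150150100100 100100 50 50 25 25 25 25';

% Assumed column widths, counted from the line as shown above.
widths = [4 5 3 4 4 4 5 4 4 3 3 3 4 3 3 3 3 3 3 3];
assert(sum(widths) == numel(txt), 'Widths do not cover the whole line.');

% Step 2: chop the character row into one piece per column.
fields = mat2cell(txt, 1, widths);

% Step 3: convert each piece; str2double ignores surrounding blanks.
vals = str2double(fields);
```

Compared with a long textscan format string, keeping the widths in a numeric vector makes them easier to check and adjust when a column turns out to be miscounted.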
