Parsing a text file by character

Ammar on 8 Apr 2014
Commented: Alberto on 10 Apr 2014
I would like to parse a 331-character string into cells, but the number of characters per cell is not the same for every cell.
For example, if I have text like the following
24359_435934009____________90909
where _=space
And I want to put it into a matrix such that
characters 1-5 go into column 1 (i.e. 24359)
character 6 goes into column 2 (i.e. NaN)
characters 7-10 go into column 3
etc
The data comes as a .txt file.

Answers (2)

Alberto on 8 Apr 2014
If there are several lines, you can extract them using this:
fid = fopen(fileName);
B = textscan(fid, '%f %f %f', 'Delimiter', ' ', 'MultipleDelimsAsOne', 1);
fclose(fid);
It extracts three columns; if you need more columns, change the '%f %f %f' part. But this only works if there are 3 numbers in each row.
If the rows contain different numbers of values, you should parse each line with regexp.
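As a rough sketch of the regexp route, the snippet below pulls every run of digits out of a single line. It assumes the values are unsigned integers separated by spaces, and it uses the sample line from the question with the underscores turned back into spaces. Note that it splits on whitespace, so it does not respect fixed character positions.

```matlab
% Sample line from the question; '_' stands for a space there.
textLine = strrep('24359_435934009____________90909', '_', ' ');

% Find every run of digits; 'match' returns them as a cell array of strings.
tokens = regexp(textLine, '\d+', 'match');

% Convert the cell array of strings to a numeric row vector.
values = str2double(tokens);
```

For a whole file, the same two calls could go inside a loop that reads each line with fgetl until it stops returning a character array.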

Ammar on 9 Apr 2014
Edited: Ammar on 9 Apr 2014
That did not seem to work. I have attached example data. Each line should be 331 characters which includes spaces.
I want to parse the data into a matrix.
Column 1 should be characters 1-8 (so cell 1,1 should be 79006444)
Column 2 should be characters 9-18
Column 3 should be character 19
Column 4 should be characters 20-21
Column 5 should be character 22
Column 6 should be character 23
Column 7 should be character 24
Column 8 should be characters 25-27
.....
Fields that contain only spaces need to be replaced with NaN.
  1 Comment
Alberto on 10 Apr 2014
The extraction mode depends on what you know about the structure of the file. I think this should work:
B = textscan(fid, '%f %f %f %f %f %f %f %f', 'Delimiter', ' ', 'MultipleDelimsAsOne', 1);
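If the fields are defined by character position rather than by a separator, another option is to slice each line by index. This is only a sketch: the start and stop positions are the first eight columns listed in the comment above (the rest of the 331-character layout is not given), and the sample line is made up apart from the ID 79006444.

```matlab
% Field boundaries for columns 1-8, taken from the column list above.
starts = [1  9 19 20 22 23 24 25];
stops  = [8 18 19 21 22 23 24 27];

% Hypothetical 27-character sample line; only the ID comes from the thread.
textLine = ['79006444', sprintf('%10d', 123), ' ', '45', ' ', '6', ' ', sprintf('%3d', 7)];

% Slice each field by position and convert it to a number.
row = nan(1, numel(starts));
for k = 1:numel(starts)
    field = textLine(starts(k):stops(k));
    row(k) = str2double(field);   % blank or non-numeric fields become NaN
end
```

Each line has to be at least as long as the last stop position, otherwise the indexing fails, so short lines would need padding with spaces first. To fill a matrix, run the loop once per line read with fgetl and store each row.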
