Loading ASCII tables into MATLAB as strings to be processed

I am working on a general script to automate plotting of selected columns and rows from large ASCII data sets.
My issue is that I am having great difficulty loading the data into MATLAB in a way that allows searching through the data set. For example, a data set may look like this:
  • 1055+018 1 2001 Oct 19 5 6 7
  • 1055+018 2 2002 Nov 21 8 9 10
  • 1055+018 2 2002 Dec 12 11 12 13
  • 5055+018 1 2000 Jan 10 14 15 16
  • 5055+018 1 2001 Feb 11 17 18 19
  • 5055+018 2 2002 Mar 12 20 21 22
I am attempting to write the code so that the user can enter which item from column 1 they want, and then the necessary rows and columns based on information found in the second row; however, because of the format of the ASCII data sets, I cannot figure out a way to load the data correctly, even as strings, so that the whole data set is read. Usually the load either fails because of the alphabetic month names or cuts off information, depending on which loading function I am using.
Does anyone have any suggestions on what to do?

Accepted Answer

per isakson on 20 Jul 2012
Edited: per isakson on 20 Jul 2012
I assume that your data is in a text file. This code will read the file
fid = fopen( 'cssm.txt' );
% '1055+018' is read as two integers (1055 and +018); the month is read as a string
cac = textscan( fid, '%d%d%d%d%s%d%d%d%d' );
fclose( fid );
where cssm.txt contains
1055+018 1 2001 Oct 19 5 6 7
1055+018 2 2002 Nov 21 8 9 10
...
The result is returned in the cell array cac:
>> cac
cac =
Columns 1 through 7
[6x1 int32] [6x1 int32] [6x1 int32] [6x1 int32] {6x1 cell} [6x1 int32] [6x1 int32]
Columns 8 through 9
[6x1 int32] [6x1 int32]
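
To get from this result to the row selection described in the question, one option is to read the source name as a single string (%s in place of the leading %d%d) and build a logical mask over the rows. The following is only a minimal sketch and not part of the original answer: the names wantedSource, wantedGroup, and rows are made up for illustration, and the sample text is parsed from a string rather than from cssm.txt so that the snippet runs on its own.

```matlab
% Sample rows from the question, joined by newlines so textscan can parse
% them directly from a string instead of a file.
raw = strtrim(sprintf('%s\n', ...
    '1055+018 1 2001 Oct 19 5 6 7', ...
    '1055+018 2 2002 Nov 21 8 9 10', ...
    '1055+018 2 2002 Dec 12 11 12 13', ...
    '5055+018 1 2000 Jan 10 14 15 16', ...
    '5055+018 1 2001 Feb 11 17 18 19', ...
    '5055+018 2 2002 Mar 12 20 21 22'));

% Eight fields: name (string), group, year, month (string), four integers.
cac = textscan(raw, '%s %d %d %s %d %d %d %d');

% Hypothetical user choices: which source and which value of column 2.
wantedSource = '5055+018';
wantedGroup  = 1;

% Logical mask of the rows that satisfy both conditions.
rows = strcmp(cac{1}, wantedSource) & (cac{2} == wantedGroup);

selectedYears  = cac{3}(rows);   % years of the matching rows
selectedMonths = cac{4}(rows);   % month names of the matching rows
```

The same mask can index any other column of cac, so the columns to plot can be chosen independently of the rows.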

More Answers (3)

Star Strider on 20 Jul 2012
Edited: Star Strider on 20 Jul 2012
This worked for me when I copy-pasted your data to a test routine:
Data = {'1055+018 1 2001 Oct 19 5 6 7'
'1055+018 2 2002 Nov 21 8 9 10'
'1055+018 2 2002 Dec 12 11 12 13'
'5055+018 1 2000 Jan 10 14 15 16'
'5055+018 1 2001 Feb 11 17 18 19'
'5055+018 2 2002 Mar 12 20 21 22'};
for k1 = 1:size(Data,1)
InputCell(k1,:) = textscan( Data{k1,:}, '%d%d %d %d %3c %d %d %d %d')
end
I got back:
InputCell =
[1055] [18] [1] [2001] 'Oct' [19] [ 5] [ 6] [ 7]
[1055] [18] [2] [2002] 'Nov' [21] [ 8] [ 9] [10]
[1055] [18] [2] [2002] 'Dec' [12] [11] [12] [13]
[5055] [18] [1] [2000] 'Jan' [10] [14] [15] [16]
[5055] [18] [1] [2001] 'Feb' [11] [17] [18] [19]
[5055] [18] [2] [2002] 'Mar' [12] [20] [21] [22]
Is that the sort of result you want? I don't use ‘textscan’ that much so others may have better solutions, but this might suggest a way for you to at least read your files. I refer you to the ‘textscan’ documentation for details.

Brendan Reardon on 31 Jul 2012
My apologies for the delay, and thank you very much for the assistance! The code I used was:
fid = fopen('data.txt');
table = textscan( fid, '%d%d%s%s%s%s%s%s%s%s%s');
fclose (fid);
My issue before was my lack of understanding regarding the column formatting.
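
As an illustration of the column formatting point, the sketch below parses one sample row from the question with two different format strings. It assumes the behaviour reported in the answers above, namely that %d stops at the '+' so the name splits into 1055 and 18; the variable names are chosen here for illustration only.

```matlab
row = '1055+018 1 2001 Oct 19 5 6 7';

% Nine specifiers: %d stops at the '+', so the name becomes two integers.
asInts = textscan(row, '%d%d%d%d%s%d%d%d%d');

% Eight specifiers: %s keeps the whole name as one string.
asText = textscan(row, '%s%d%d%s%d%d%d%d');

firstInt  = asInts{1};      % 1055
secondInt = asInts{2};      % 18
name      = asText{1}{1};   % '1055+018'
month     = asText{4}{1};   % 'Oct'
```

Each specifier consumes exactly one whitespace-delimited field (or part of one, as with the name), so the number and type of specifiers must match the layout of every row.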

Brendan Reardon on 31 Jul 2012
Everything so far has worked great; however, I am running into trouble once again. The actual data tables look like this, for example:
0836+710 1 2003 Jan 8 0.060 12.82 210.4 2.43 1.00
0836+710 2 2003 Jan 8 0.045 7.89 217.4 3.19 1.00
0836+710 3 a 2003 Jan 8 0.089 3.01 215.6 0.91 1.00
Every now and then an "a" appears because the publisher of the data included it for commenting purposes. I originally planned to treat it like any other column, one that would simply be mostly empty; however, because of the formatting, MATLAB reads it as part of column 3 and shifts everything over for that specific row. Here is an example of the output:
>> table{4}
ans =
'2003'
'2003'
'a'
MATLAB reads the "years" column this way because the notation was added between tabs. The next column reads
>> table{5}
ans =
'Jan'
'Jan'
'2003'
My original approach was to use isletter to find the notations and pull them out of the years column; however, that is unsuccessful because years is a cell array, as shown below:
>> years = table{4}
years =
'2003'
'2003'
'a'
>> isletter(years)
ans =
0
0
0
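
The all-zero result above arises because isletter is given the cell array itself rather than the strings inside it. A possible workaround, sketched here with the same three values, is to apply isletter to the contents of each cell with cellfun; the names isNote, notes, and yearsOnly are illustrative.

```matlab
years = {'2003'; '2003'; 'a'};

% Given the cell array itself, isletter returns false for every element.
direct = isletter(years);

% Apply isletter to the contents of each cell instead; the isempty guard
% keeps empty strings from counting as notes.
isNote = cellfun(@(s) ~isempty(s) && all(isletter(s)), years);

notes     = years(isNote);    % entries that are purely alphabetic
yearsOnly = years(~isNote);   % the remaining entries
```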
Does anyone have suggestions for forcing MATLAB to read these "a" entries as a separate column? I have tried changing the format specifiers (such as %s%d and %f) to no avail.
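
One possible way to approach this, offered as a sketch rather than a tested solution for the real files, is to split each line into whitespace-separated tokens and insert an empty placeholder whenever the third token is not a letter. It assumes the note, when present, is always the third field and is purely alphabetic; the names rowsText, fields, and nCols are made up for illustration.

```matlab
% The three sample rows from the post; in practice each line could be
% read from the file one at a time with fgetl.
rowsText = {
    '0836+710 1 2003 Jan 8 0.060 12.82 210.4 2.43 1.00'
    '0836+710 2 2003 Jan 8 0.045 7.89 217.4 3.19 1.00'
    '0836+710 3 a 2003 Jan 8 0.089 3.01 215.6 0.91 1.00'};

nCols  = 11;                            % 10 data fields plus the note column
fields = cell(numel(rowsText), nCols);

for k = 1:numel(rowsText)
    % Split on runs of whitespace (spaces or tabs).
    tok = regexp(strtrim(rowsText{k}), '\s+', 'split');

    % If the third token is not purely alphabetic, the note is absent:
    % insert an empty placeholder so the later columns stay aligned.
    if ~all(isletter(tok{3}))
        tok = [tok(1:2), {''}, tok(3:end)];
    end

    fields(k, :) = tok;
end

notes = fields(:, 3);               % note column, empty where absent
years = str2double(fields(:, 4));   % numeric years, now aligned
```

Every row then has the same number of columns, and numeric columns can be converted with str2double once the alignment is fixed.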
