Import Block of Mixed Data from Text File into Table or Cell Array

This example reads a block of mixed text and numeric data from a text file, and then imports the block of data into a table or a cell array.

Data File Overview

The sample file bigfile.txt contains comment lines beginning with ##. The data is arranged in five tab-delimited columns: the first column contains text indicating timestamps; the second, third, and fourth columns contain numeric data indicating temperature, humidity, and wind speed; the last column contains descriptive text. Display the contents of the file bigfile.txt.

type('bigfile.txt')
## A	ID = 02476
## YKZ Timestamp Temp Humidity Wind Weather
06-Sep-2013 01:00:00	6.6	89	4	clear
06-Sep-2013 05:00:00	5.9	95	1	clear
06-Sep-2013 09:00:00	15.6	51	5	mainly clear
06-Sep-2013 13:00:00	19.6	37	10	mainly clear
06-Sep-2013 17:00:00	22.4	41	9	mostly cloudy
06-Sep-2013 21:00:00	17.3	67	7	mainly clear
## B	ID = 02477
## YVR Timestamp Temp Humidity Wind Weather
09-Sep-2013 01:00:00	15.2	91	8	clear
09-Sep-2013 05:00:00	19.1	94	7	n/a
09-Sep-2013 09:00:00	18.5	94	4	fog
09-Sep-2013 13:00:00	20.1	81	15	mainly clear
09-Sep-2013 17:00:00	20.1	77	17	n/a
09-Sep-2013 18:00:00	20.0	75	17	n/a
09-Sep-2013 21:00:00	16.8	90	25	mainly clear
## C	ID = 02478
## YYZ Timestamp Temp Humidity Wind Weather

Import Block of Data as Table

To import the data as a table, use readtable with import options.

Create an import options object for the file using the detectImportOptions function. Specify the location of the data using the DataLines property. For example, lines 3 through 8 contain the first block of data. Optionally, you can specify the names of the variables using the VariableNames property. Finally, import the first block of data using readtable with the opts object.

opts = detectImportOptions('bigfile.txt'); 
opts.DataLines = [3 8];
opts.VariableNames = {'Timestamp','Temp',...
                      'Humidity','Wind','Weather'};
T_first = readtable('bigfile.txt',opts) 
T_first=6×5 table
         Timestamp          Temp    Humidity    Wind         Weather     
    ____________________    ____    ________    ____    _________________

    06-Sep-2013 01:00:00     6.6       89         4     {'clear'        }
    06-Sep-2013 05:00:00     5.9       95         1     {'clear'        }
    06-Sep-2013 09:00:00    15.6       51         5     {'mainly clear' }
    06-Sep-2013 13:00:00    19.6       37        10     {'mainly clear' }
    06-Sep-2013 17:00:00    22.4       41         9     {'mostly cloudy'}
    06-Sep-2013 21:00:00    17.3       67         7     {'mainly clear' }
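
The import options object can be adjusted further before the import. As a minimal sketch, assuming you want Weather as categorical data rather than text, the setvartype function changes the type that readtable uses for a variable. The name T_cat is illustrative and not part of the original example.

```matlab
% Sketch: import the Weather column as categorical instead of text.
opts = detectImportOptions('bigfile.txt');
opts.DataLines = [3 8];                          % first block of data
opts.VariableNames = {'Timestamp','Temp',...
                      'Humidity','Wind','Weather'};
opts = setvartype(opts,'Weather','categorical'); % override the detected type
T_cat = readtable('bigfile.txt',opts);           % T_cat is a hypothetical name
weatherKinds = categories(T_cat.Weather)         % distinct weather descriptions
```

A categorical Weather variable can be convenient for grouping or counting observations by weather description.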

Read the second block by updating the DataLines property to the location of the second block.

opts.DataLines = [11 17];
T_second = readtable('bigfile.txt',opts)
T_second=7×5 table
         Timestamp          Temp    Humidity    Wind        Weather     
    ____________________    ____    ________    ____    ________________

    09-Sep-2013 01:00:00    15.2       91         8     {'clear'       }
    09-Sep-2013 05:00:00    19.1       94         7     {'n/a'         }
    09-Sep-2013 09:00:00    18.5       94         4     {'fog'         }
    09-Sep-2013 13:00:00    20.1       81        15     {'mainly clear'}
    09-Sep-2013 17:00:00    20.1       77        17     {'n/a'         }
    09-Sep-2013 18:00:00      20       75        17     {'n/a'         }
    09-Sep-2013 21:00:00    16.8       90        25     {'mainly clear'}
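
The DataLines values above are entered by hand. As a hedged sketch, the block boundaries could instead be derived from the positions of the ## comment lines. This assumes the readlines function is available (R2020b or later) and that every non-comment, non-empty line belongs to a data block; the names dataLines and T_blocks are illustrative.

```matlab
% Sketch: locate each data block from the ## comment lines, then import it.
txt = readlines('bigfile.txt');                   % one string per line of the file
isData = ~startsWith(txt,"##") & strlength(strtrim(txt)) > 0;
d = diff([0; double(isData(:)); 0]);              % +1 at a block start, -1 just past a block end
dataLines = [find(d == 1), find(d == -1) - 1];    % one [first last] row per block

opts = detectImportOptions('bigfile.txt');
opts.VariableNames = {'Timestamp','Temp',...
                      'Humidity','Wind','Weather'};
T_blocks = cell(size(dataLines,1),1);             % hypothetical container for the tables
for k = 1:size(dataLines,1)
    opts.DataLines = dataLines(k,:);              % select block k
    T_blocks{k} = readtable('bigfile.txt',opts);
end
```

For the file shown above, dataLines should contain the same two ranges used in the example, [3 8] and [11 17].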

Import Block of Data as Cell Array

You can import the data as a cell array using the readcell function with detectImportOptions, or by using the textscan function. First import the block of data using the readcell function, and then perform the same import by using textscan.

To perform the import using the readcell function, create an import options object for the file using the detectImportOptions function. Specify the location of the data using the DataLines property. Then, perform the import operation using the readcell function and the import options object opts.

opts = detectImportOptions('bigfile.txt'); 
opts.DataLines = [3 8]; % first block of data
C = readcell('bigfile.txt',opts)
C=6×5 cell
  Columns 1 through 4

    {[06-Sep-2013 01:00:00]}    {[ 6.6000]}    {[89]}    {[ 4]}
    {[06-Sep-2013 05:00:00]}    {[ 5.9000]}    {[95]}    {[ 1]}
    {[06-Sep-2013 09:00:00]}    {[15.6000]}    {[51]}    {[ 5]}
    {[06-Sep-2013 13:00:00]}    {[19.6000]}    {[37]}    {[10]}
    {[06-Sep-2013 17:00:00]}    {[22.4000]}    {[41]}    {[ 9]}
    {[06-Sep-2013 21:00:00]}    {[17.3000]}    {[67]}    {[ 7]}

  Column 5

    {'clear'        }
    {'clear'        }
    {'mainly clear' }
    {'mainly clear' }
    {'mostly cloudy'}
    {'mainly clear' }
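
Because each element of C is a separate cell, columns usually need to be unpacked before they can be used in computations. The following is a minimal sketch of that step, assuming C is the 6-by-5 cell array returned by readcell above; the variable names are illustrative.

```matlab
% Sketch: unpack columns of the cell array C into typed arrays.
timestamps = vertcat(C{:,1});     % 6x1 datetime
temp = cell2mat(C(:,2));          % 6x1 double
weather = C(:,5);                 % 6x1 cell array of character vectors

[maxTemp,idx] = max(temp);        % warmest reading in the block
fprintf('Warmest reading: %.1f at %s (%s)\n', ...
        maxTemp,char(timestamps(idx)),weather{idx});
```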

To perform the import using the textscan function, specify the size of the block using N and the format of the data fields using formatSpec. For example, use '%s' for text variables, '%D' for date and time variables, or '%C' for categorical variables. Use fopen to open the file. The function then returns a file identifier, fileID. Next, read from the file by using the textscan function.

N = 6;
formatSpec = '%D %f %f %f %s';
fileID = fopen('bigfile.txt');

The last field uses '%s' because Weather holds multi-character text; '%c' reads only a single character per field. Read the first block and display the contents of the third variable, Humidity.

C_first = textscan(fileID,formatSpec,N,'CommentStyle','##','Delimiter','\t')
C_first=1×5 cell
  Columns 1 through 4

    {6x1 datetime}    {6x1 double}    {6x1 double}    {6x1 double}

  Column 5

    {6x1 cell}

C_first{3}
ans = 6×1

    89
    95
    51
    37
    41
    67

Update the block size N, and read the second block. Display the contents of the fifth variable, Weather.

N = 7;
C_second = textscan(fileID,formatSpec,N,'CommentStyle','##','Delimiter','\t')
C_second=1×5 cell
  Columns 1 through 4

    {7x1 datetime}    {7x1 double}    {7x1 double}    {7x1 double}

  Column 5

    {7x1 cell}

C_second{5}
ans = 7×1 cell
    {'clear'       }
    {'n/a'         }
    {'fog'         }
    {'mainly clear'}
    {'n/a'         }
    {'n/a'         }
    {'mainly clear'}

Close the file.

fclose(fileID);
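
The textscan steps can also be collected into a loop. The following is a hedged sketch rather than a definitive implementation: it assumes the Weather column is read as text with '%s', checks that fopen succeeded, uses onCleanup so that the file is closed even if an error occurs, and converts each block to a table. The names closer, blockSizes, and T_blocks are illustrative.

```matlab
% Sketch: read both blocks with textscan and convert each result to a table.
fileID = fopen('bigfile.txt');
assert(fileID ~= -1,'Could not open bigfile.txt');
closer = onCleanup(@() fclose(fileID));    % closes the file when closer is cleared

formatSpec = '%D %f %f %f %s';             % assumption: Weather is read as text
names = {'Timestamp','Temp','Humidity','Wind','Weather'};
blockSizes = [6 7];                        % rows per block, as in the example
T_blocks = cell(size(blockSizes));
for k = 1:numel(blockSizes)
    C = textscan(fileID,formatSpec,blockSizes(k), ...
                 'CommentStyle','##','Delimiter','\t');
    T_blocks{k} = table(C{:},'VariableNames',names);
end
clear closer                               % run the cleanup and close the file
```

Because the file position is kept between calls, each textscan call continues where the previous one stopped, and the comment lines between blocks are skipped by the CommentStyle option.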
