Importing Text Data

The MATLAB® Import Wizard

The easiest way to import data into your MATLAB application is to use the Import Wizard. You do not need to know the format of the data to use this tool. You simply specify the file that contains the data and the Import Wizard processes the file contents automatically.

For more information, see Using the Import Wizard.

Using Import Functions with Text Data

To import text data from the command line or in an M-file, you must use one of the MATLAB import functions. Your choice of function depends on how the data in the text file is formatted.

The text data must be formatted in a uniform pattern of rows and columns, using a text character, called a delimiter or column separator, to separate each data item. The delimiter can be a space, comma, semicolon, tab, or any other character. The individual data items can be alphabetic or numeric characters or a mix of both.

The text file can also contain one or more lines of text, called header lines, or can use text headers to label each column or row. The following example illustrates a tab-delimited text file with header text and row and column headers.

To find out how your data is formatted, view it in a text editor. After you determine the format, find the sample in the table below that most closely resembles the format of your data. Then read the topic referred to in the table for information on how to import that format.

Table 6-1. ASCII Data File Formats

Data Format Sample          File Extension              Description
--------------------------  --------------------------  ----------------------------------------
1 2 3 4 5                   .txt, .dat, or other        See Importing Numeric Text Data or
6 7 8 9 10                                              Using the Import Wizard.

1; 2; 3; 4; 5               .txt, .dat, .csv, or other  See Importing Delimited ASCII Data Files
6; 7; 8; 9; 10                                          or Using the Import Wizard.
or
1, 2, 3, 4, 5
6, 7, 8, 9, 10

Ann Type1 12.34 45 Yes      .txt, .dat, or other        See Importing Mixed Alphabetic and
Joe Type2 45.67 67 No                                   Numeric Data.

Grade1 Grade2 Grade3        .txt, .dat, or other        See Importing Numeric Data with Text
91.5   89.2   77.3                                      Headers or Using the Import Wizard.
88.0   67.8   91.0
67.3   78.1   92.5

If you are familiar with the MATLAB import functions but are not sure when to use each one, see the following table, which compares their features.

Table 6-2. ASCII Data Import Function Features

Function  Data Type                 Delimiters     Number of Return Values     Notes
--------  ------------------------  -------------  --------------------------  ----------------------------------------
csvread   Numeric data              Commas only    One                         Primarily used with spreadsheet data.
                                                                               See Working with Spreadsheets.

dlmread   Numeric data              Any character  One                         Flexible and easy to use.

fscanf    Alphabetic and numeric;   Any character  One                         Part of the low-level file I/O routines.
          both types are returned                                              Requires fopen to obtain a file
          in a single variable                                                 identifier and fclose after the read.

load      Numeric data              Spaces only    One                         Easy to use. Use the functional form of
                                                                               load to name the output variable.

textread  Alphabetic and numeric    Any character  Multiple; one output        Flexible, powerful, and easy to use.
                                                   variable per specifier      Use a format string to specify conversions.

textscan  Alphabetic and numeric    Any character  Multiple values returned    More flexible than textread, with more
                                                   in one cell array           format options.
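
The fscanf entry in the table notes that the function needs a file identifier from fopen and a matching fclose. The following is a minimal sketch of that open, read, close sequence, not a definitive recipe. The file name fscanf_demo.txt and its contents are hypothetical and are created by the sketch itself.

```matlab
% Write a small sample file so the sketch is self-contained.
% The file name fscanf_demo.txt is hypothetical.
fid = fopen('fscanf_demo.txt', 'w');
fprintf(fid, '1 2 3\n4 5 6\n');
fclose(fid);

% Open the file for reading and check that fopen succeeded.
fid = fopen('fscanf_demo.txt', 'r');
if fid == -1
    error('Could not open fscanf_demo.txt');
end

% Read every number as floating point into a 3-by-N array.
% fscanf fills its output in column order, so each line of the
% file becomes one column.
vals = fscanf(fid, '%f', [3 Inf]);
fclose(fid);

% Transpose so that the rows match the lines of the file.
vals = vals';
```

Because fscanf fills its output column by column, the size argument [3 Inf] followed by a transpose is one common way to recover the row layout of the file.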

Importing Numeric Text Data

If your data file contains only numeric data, you can use many of the MATLAB import functions (listed in ASCII Data Import Function Features), depending on how the data is delimited. If the data is rectangular, that is, each row has the same number of elements, the simplest command to use is the load command. (You can also use the load function to import MAT-files, the MATLAB binary format for saving the workspace.)

For example, the file named my_data.txt contains two rows of numbers delimited by space characters:

1 2 3 4 5
6 7 8 9 10

When you use load as a command, it imports the data and creates a variable in the workspace with the same name as the filename, minus the file extension:

load my_data.txt;
whos
   Name         Size        Bytes   Class

   my_data      2x5          80    double array

my_data

my_data =
     1     2     3     4     5
     6     7     8     9    10

If you want to name the workspace variable something other than the filename, use the functional form of load. In the following example, the data from my_data.txt is loaded into the workspace variable A:

A = load('my_data.txt');
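
As a self-contained sketch of the functional form, the following creates my_data.txt with the contents shown earlier, loads it into A, and checks the dimensions before the data is used. The variable names nrows and ncols are introduced here for illustration only.

```matlab
% Create the sample file described in the text.
fid = fopen('my_data.txt', 'w');
fprintf(fid, '1 2 3 4 5\n6 7 8 9 10\n');
fclose(fid);

% Functional form: the data goes into a variable of your choosing.
A = load('my_data.txt');

% Confirm the shape of the imported array.
[nrows, ncols] = size(A);
fprintf('Loaded %d rows and %d columns\n', nrows, ncols);
```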

Importing Delimited ASCII Data Files

If your data file uses a character other than a space as a delimiter, you have a choice of several import functions you can use. (For a complete list, see ASCII Data Import Function Features.) The simplest function to use is dlmread.

For example, consider a file named ph.dat whose contents are separated by semicolons:

7.2;8.5;6.2;6.6
5.4;9.2;8.1;7.2

To read the entire contents of this file into an array named A, enter

A = dlmread('ph.dat', ';');

You specify the delimiter used in the data file as the second argument to dlmread. Even though the last item in each row is not followed by a delimiter, dlmread still processes the file correctly. dlmread also ignores space characters between data elements, so the preceding dlmread command works even if the contents of ph.dat are

7.2;   8.5;       6.2;6.6
5.4;   9.2   ;8.1;7.2
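
The following sketch writes the first version of ph.dat, reads it in full, and then reads only part of it. The partial read uses the four-argument form dlmread(filename, delimiter, R, C), where R and C are zero-based row and column offsets of the upper-left corner of the data to import. The variable name lastTwo is an illustrative assumption.

```matlab
% Create ph.dat with the semicolon-delimited contents shown above.
fid = fopen('ph.dat', 'w');
fprintf(fid, '7.2;8.5;6.2;6.6\n');
fprintf(fid, '5.4;9.2;8.1;7.2\n');
fclose(fid);

% Read the whole file into a 2-by-4 array.
A = dlmread('ph.dat', ';');

% Read starting at row offset 1 and column offset 2 (both zero-based),
% that is, the last two values of the second line.
lastTwo = dlmread('ph.dat', ';', 1, 2);
```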

Importing Numeric Data with Text Headers

To import an ASCII data file that contains text headers, use the textscan function, specifying the headerlines parameter. textscan accepts a set of predefined parameters that control various aspects of the conversion. (For a complete list of these parameters, see the textscan reference page.) Using the headerlines parameter, you can specify the number of lines at the head of the file that textscan should ignore.

For example, the file grades.dat contains formatted numeric data with a one-line text header:

Grade1 Grade2 Grade3
  78.8   55.9   45.9
  99.5   66.8   78.0
  89.5   77.0   56.7

To import this data, first open the file, and then use this textscan command to read the contents:

fid = fopen('grades.dat', 'r');
grades = textscan(fid, '%f %f %f', 3, 'headerlines', 1);

grades{:}
ans =
   78.8000
   99.5000
   89.5000

ans =
   55.9000
   66.8000
   77.0000

ans =
   45.9000
   78.0000
   56.7000

fclose(fid);
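
Because textscan returns one cell per column, a common next step is to combine the cells into an ordinary numeric matrix. The sketch below does this for grades.dat. It creates the file first so that it runs on its own, and it omits the repeat count so that textscan reads to the end of the file. The names G and colMeans are assumptions made for illustration.

```matlab
% Create grades.dat with the one-line header shown above.
fid = fopen('grades.dat', 'w');
fprintf(fid, 'Grade1 Grade2 Grade3\n');
fprintf(fid, '  78.8   55.9   45.9\n');
fprintf(fid, '  99.5   66.8   78.0\n');
fprintf(fid, '  89.5   77.0   56.7\n');
fclose(fid);

% Skip the header line and read three numeric columns to end of file.
fid = fopen('grades.dat', 'r');
grades = textscan(fid, '%f %f %f', 'headerlines', 1);
fclose(fid);

% Each cell holds one column vector of equal length, so horizontal
% concatenation yields a 3-by-3 double matrix.
G = [grades{:}];

% Column-wise mean of each grade.
colMeans = mean(G);
```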

Importing Mixed Alphabetic and Numeric Data

If your data file contains a mix of alphabetic and numeric ASCII data, use the textscan or textread function to import the data. textscan returns its output in a single cell array, while textread returns its output in separate variables and you can specify the class of each variable. The textscan function offers better performance than textread, making it a better choice when reading large files.

This example uses textread to import the file mydata.dat that contains a mix of alphabetic and numeric data:

Sally    Type1 12.34 45 Yes
Larry    Type2 34.56 54 Yes
Tommy    Type1 67.89 23 No

To read the entire contents of the file mydata.dat into the workspace, specify the name of the data file and the format string as arguments to textread. In the format string, you include conversion specifiers that define how you want each data item to be interpreted. For example, specify %s for string data, %f for floating-point data, and so on. (For a complete list of format specifiers, see the textread reference page.)

For each conversion specifier in your format string, you must specify a separate output variable. textread processes each data item in the file as specified in the format string and puts the value in the output variable. The number of output variables must match the number of conversion specifiers in the format string.

In this example, textread reads the file mydata.dat. The third argument, 3, tells textread to apply the format string three times, once for each line in the file:

[names, types, x, y, answer] = ...
   textread('mydata.dat', '%s %s %f %d %s', 3)
names = 
    'Sally'
    'Larry'
    'Tommy'

types = 
    'Type1'
    'Type2'
    'Type1'

x =
   12.3400
   34.5600
   67.8900

y =
    45
    54
    23

answer = 
    'Yes'
    'Yes'
    'No'

If your data uses a character other than a space as a delimiter, you must use the textread parameter 'delimiter' to specify the delimiter. For example, if the file mydata.dat used a semicolon as a delimiter, you would use this command:

[names, types, x, y, answer] = ...
   textread('mydata.dat', '%s %s %f %d %s', 'delimiter', ';')

For more information about these optional parameters, see the textread reference page.
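
For comparison, here is a sketch of the same import using textscan, which returns all columns in a single cell array instead of separate output variables. The sketch writes mydata.dat itself so that it can run on its own. The name C is an assumption made for illustration. Note that textscan returns %d fields as int32.

```matlab
% Create mydata.dat with the space-delimited contents shown above.
fid = fopen('mydata.dat', 'w');
fprintf(fid, 'Sally    Type1 12.34 45 Yes\n');
fprintf(fid, 'Larry    Type2 34.56 54 Yes\n');
fprintf(fid, 'Tommy    Type1 67.89 23 No\n');
fclose(fid);

% textscan needs a file identifier rather than a file name.
fid = fopen('mydata.dat', 'r');
C = textscan(fid, '%s %s %f %d %s');
fclose(fid);

% Unpack the columns of interest from the single cell array.
names  = C{1};   % cell array of strings
x      = C{3};   % double column vector
y      = C{4};   % int32 column vector
answer = C{5};   % cell array of strings
```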

Importing from XML Documents

With the xmlread function, you can read from a given URL or file, generating a Document Object Model (DOM) node to represent the parsed document.

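MATLAB also provides these other XML functions: xmlwrite, which serializes a DOM node to an XML file, and xslt, which transforms an XML document using an XSLT stylesheet.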

For more information, see the reference pages for these functions.
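
As a rough sketch of working with the DOM node that xmlread returns, the following writes a small XML file, parses it, and extracts numeric values from its elements. The file name sample.xml, the element names readings and ph, and the variable names are all hypothetical. The methods called on the node are the standard Java DOM methods that MATLAB exposes on the returned object.

```matlab
% Write a tiny XML file; the name and contents are hypothetical.
fid = fopen('sample.xml', 'w');
fprintf(fid, '<?xml version="1.0"?>\n');
fprintf(fid, '<readings><ph>7.2</ph><ph>8.5</ph></readings>\n');
fclose(fid);

% Parse the file into a Document Object Model node.
doc = xmlread('sample.xml');

% Get the name of the root element as a MATLAB character array.
root = doc.getDocumentElement();
rootName = char(root.getNodeName());

% Collect the text of every <ph> element as a number.
% DOM node lists are indexed from zero.
items = doc.getElementsByTagName('ph');
vals = zeros(1, items.getLength());
for k = 0:items.getLength()-1
    vals(k+1) = str2double(char(items.item(k).getTextContent()));
end
```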

  

