MATLAB Examples

Define Import Options for Tables

Typically, you can import tables using the readtable function. However, sometimes importing tabular data requires additional control over the import process. For example, you might want to select the variables to import or handle rows with missing or error-causing data. Control the import process by creating an import options object. The object has properties that you can adjust based on your import needs.

Contents

Create Import Options

Create an import options object for a sample data set, airlinesmall.csv.

opts = detectImportOptions('airlinesmall.csv');

The detectImportOptions function creates a SpreadsheetImportOptions object for spreadsheet files and a DelimitedTextImportOptions object for text files.

Customize Table-Level Import Options

Set property values to define import options. Some options apply to the entire table, and some apply to specific variables. For example, rules to manage missing or error-causing data are defined by the table-wide MissingRule and ImportErrorRule properties.

opts.ImportErrorRule = 'omitrow';
opts.MissingRule = 'fill';

Setting ImportErrorRule to 'omitrow' removes rows with data that cause import errors. Setting MissingRule to 'fill' replaces missing values with values that are defined by the FillValue property. For instance, missing numeric values become NaN.

Customize Variable-Level Import Options

To get and set options for specific variables use the getvaropts, setvartype, and setvaropts functions. For example, view the current options for the variables named FlightNum, Origin, Dest, and ArrDelay, using the getvaropts function.

getvaropts(opts,{'FlightNum','Origin','Dest','ArrDelay'})
ans = 

  1x4 <a href="matlab:helpPopup matlab.io.VariableImportOptions" style="font-weight:bold">VariableImportOptions</a> array with properties:

   Variable Options:
                          (1) |      (2) |      (3) |        (4)  
            Name: 'FlightNum' | 'Origin' |   'Dest' | 'ArrDelay'
            Type:    'double' |   'char' |   'char' |   'double'
       FillValue:         NaN |       '' |       '' |        NaN
  TreatAsMissing:          {} |       {} |       {} |         {}
       QuoteRule:    'remove' | 'remove' | 'remove' |   'remove'

Change the data types for the variables using the setvartype function:

  • Since the values in the variable FlightNum are identifiers for the flight and not numerical values, change its data type to char.
  • Since the variables Origin and Dest designate a finite set of repeating text values, change their data type to categorical.
 opts = setvartype(opts,{'FlightNum','Origin','Dest','ArrDelay'},...
                        {'char','categorical','categorical','single'});

Change other properties using the setvaropts function:

  • For the FlightNum variable, remove any leading white spaces from the text by setting the WhiteSpaceRule property to trimleading.
  • For the ArrDelay variable, replace fields containing 0 or NA with the value specified in FillValue property by setting the TreatAsMissing property.
 opts = setvaropts(opts,'FlightNum','WhitespaceRule','trimleading');
 opts = setvaropts(opts,'ArrDelay','TreatAsMissing',{'0','NA'});

Import Table

Specify the variables to get, import them using readtable, and display the first 10 rows of the table.

opts.SelectedVariableNames = {'FlightNum','Origin','Dest','ArrDelay'};
T = readtable('airlinesmall.csv',opts);
T(1:10,:)
ans = 

    FlightNum    Origin    Dest    ArrDelay
    _________    ______    ____    ________

    '1503'       LAX       SJC      8      
    '1550'       SJC       BUR      8      
    '1589'       SAN       SMF     21      
    '1655'       BUR       SJC     13      
    '1702'       SMF       LAX      4      
    '1729'       LAX       SJC     59      
    '1763'       SAN       SFO      3      
    '1800'       SEA       LAX     11      
    '1831'       LAX       SJC      3      
    '1864'       SFO       LAS      2