This is machine translation

Translated by Microsoft
Mouseover text to see original. Click the button below to return to the English version of the page.

Note: This page has been translated by MathWorks. Click here to see
To view all translated materials including this page, select Country from the country navigator on the bottom of this page.

Import Mixed Data from Text Files

This example shows how to use the readtable function to import mixed data from a text file into a table. Then, it shows how to modify and analyze the data in the table.

Sample File Overview

The sample file, outages.csv, contains data representing electric utility outages in the US. These are the first few lines of the file:

SouthWest,2002-01-20 11:49,672,2902379,2002-01-24 21:58,winter storm
SouthEast,2002-01-30 01:18,796,336436,2002-02-04 11:20,winter storm
SouthEast,2004-02-03 21:17,264.9,107083,2004-02-20 03:37,winter storm
West,2002-06-19 13:39,391.4,378990,2002-06-19 14:27,equipment fault

The file contains six columns. The first line in the file lists column titles for the data. These are the column titles, along with a description of the data in that column:

  • Region: Text value for one of five regions where each electrical outage occurred

  • OutageTime: Date and time at which the outage started, formatted as year-month-day hour:minute

  • Loss: Numeric value indicating the total power loss for the outage

  • Customers: Integer value indicating the number of customers impacted

  • RestorationTime: Date and time at which power was restored, formatted as year-month-day hour:minute

  • Cause: Category for the cause of the power outage, provided as text.

Specify Format of Data Fields

Create a character vector of format specifiers to describe the data in the text file. You can then pass the format specifiers to the readtable function to import the data. Because outages.csv contains six columns of data, create a character vector that contains six format specifiers, such as '%f' for a floating-point number, '%C' for a categorical value, and '%D' for a date and time value.

formatSpec = '%C%{yyyy-MM-dd HH:mm}D%f%f%{yyyy-MM-dd HH:mm}D%C';

formatSpec tells readtable to read the first and last columns in the file as categorical data, the second and fifth columns as formatted date and time data, and the third and fourth columns as floating-point values. For the %{yyyy-MM-dd HH:mm}D specifiers, the text between the curly braces describes the format of the date and time data.

Read Text File

Call readtable to read the file. Use the Delimiter name-value pair argument to specify the delimiter. The default delimiter is a comma. Use the Format name-value pair argument along with the formatSpec value to describe the format of the data fields in the file.

T = readtable('outages.csv','Delimiter',',', ...

readtable returns a table containing the outage data.

View the first five rows and first four variables of the table.

ans =

  5x4 table

     Region         OutageTime        Loss     Customers 
    _________    ________________    ______    __________

    SouthWest    2002-02-01 12:18    458.98    1.8202e+06
    SouthEast    2003-01-23 00:49    530.14    2.1204e+05
    SouthEast    2003-02-07 21:15     289.4    1.4294e+05
    West         2004-04-06 05:44    434.81    3.4037e+05
    MidWest      2002-03-16 06:18    186.44    2.1275e+05

The type of data contained in the table is mixed. The first and last variables are categorical arrays, the second and fifth variables are datetime arrays, and the remaining variables are numeric data.

Modify Imported Data

Modify the format of the datetime columns in T.

T.OutageTime.Format = 'dd-MMM-yyyy HH:mm:ss';
T.RestorationTime.Format = 'dd-MMM-yyyy HH:mm:ss';

View the first five rows and first four variables of the table.

ans =

  5x4 table

     Region           OutageTime          Loss     Customers 
    _________    ____________________    ______    __________

    SouthWest    01-Feb-2002 12:18:00    458.98    1.8202e+06
    SouthEast    23-Jan-2003 00:49:00    530.14    2.1204e+05
    SouthEast    07-Feb-2003 21:15:00     289.4    1.4294e+05
    West         06-Apr-2004 05:44:00    434.81    3.4037e+05
    MidWest      16-Mar-2002 06:18:00    186.44    2.1275e+05

Append to Imported Data

Calculate the duration of each electrical outage and append the data to the table.

T.Duration = T.RestorationTime - T.OutageTime;

View the first five rows of the data in the Duration column of T.

ans = 

  5x1 duration array


Sort Imported Data

Sort the table by the OutageTime variable. Then, view the first five rows and first four variables of the sorted table.

T = sortrows(T,'OutageTime','ascend');
ans =

  5x4 table

     Region           OutageTime          Loss     Customers 
    _________    ____________________    ______    __________

    SouthWest    01-Feb-2002 12:18:00    458.98    1.8202e+06
    MidWest      05-Mar-2002 17:53:00    96.563    2.8666e+05
    MidWest      16-Mar-2002 06:18:00    186.44    2.1275e+05
    MidWest      26-Mar-2002 01:59:00    388.04    5.6422e+05
    MidWest      20-Apr-2002 16:46:00     23141           NaN

See Also

Related Topics

Was this topic helpful?