MATLAB Examples

Import Large Data Using DatabaseDatastore Object

This example shows how to create a DatabaseDatastore object for accessing collections of data stored in a relational database. After creating a DatabaseDatastore object, you can preview data, read data in chunks, and read every record in the data set.

To analyze large data, you can run algorithms on large data sets using a tall array.

Alternatively, you can write a MapReduce algorithm that defines the chunking and reduction of the data.

Contents

Create DatabaseDatastore Object

Using a JDBC driver, create a database connection to a Microsoft® SQL Server® database with Windows® authentication. Specify a blank user name and password. Here, the code assumes that you are connecting to database toy_store, database server dbtb04, and port number 54317.

conn = database('toy_store','','','Vendor','Microsoft SQL Server',...
    'Server','dbtb04','PortNumber',54317,'AuthType','Windows');

Create a DatabaseDatastore object using the database connection and SQL query. This SQL query retrieves all data from the table.

sqlquery = 'select * from airlinesmall';

dbds = databaseDatastore(conn,sqlquery);

Preview Data in DatabaseDatastore Object

Preview the first eight records in the data set returned by executing sqlquery.

preview(dbds)
ans = 

    Year    Month    DayofMonth    DayOfWeek    DepTime    CRSDepTime    ArrTime    CRSArrTime    UniqueCarrier    FlightNum    TailNum    ActualElapsedTime    CRSElapsedTime    AirTime    ArrDelay    DepDelay    Origin    Dest     Distance    TaxiIn    TaxiOut    Cancelled    CancellationCode    Diverted    CarrierDelay    WeatherDelay    NASDelay    SecurityDelay    LateAircraftDelay
    ____    _____    __________    _________    _______    __________    _______    __________    _____________    _________    _______    _________________    ______________    _______    ________    ________    ______    _____    ________    ______    _______    _________    ________________    ________    ____________    ____________    ________    _____________    _________________

    1990    9        22            6            1801       1750          2005       1938          'NW'             209          'NA'       124                  108               'NA'        27          11         'PHL'     'DTW'     453        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1990    9        11            2             908        910          1613       1554          'NW'             248          'NA'       245                  224               'NA'        19          -2         'PHX'     'DTW'    1671        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1990    9         2            7             NaN       1805           NaN       1900          'NW'             284          'NA'       NaN                   55               'NA'       NaN         NaN         'JAN'     'MEM'     189        'NA'      'NA'       1            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1990    9        29            6            1434       1435          1615       1630          'NW'             305          'NA'       221                  235               'NA'       -15          -1         'MSP'     'LAX'    1536        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1990    9         3            1             925        755          1258       1144          'NW'             350          'NA'       153                  169               'NA'        74          90         'MSP'     'BOS'    1124        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1990    9        22            6             900        900          1241       1222          'AA'              11          'NA'       401                  382               'NA'        19           0         'BOS'     'LAX'    2611        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1990    9        20            4            1338       1335          1853       1907          'AA'              62          'NA'       255                  272               'NA'       -14           3         'ORD'     'SJU'    2072        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1990    9         3            1             710        711           837        847          'AA'             101          'NA'       147                  156               'NA'       -10          -1         'DTW'     'DFW'     987        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             

Read Data in DatabaseDatastore Object

Read the first 10 records.

dbds.ReadSize = 10;

read(dbds)
ans = 

    Year    Month    DayofMonth    DayOfWeek    DepTime    CRSDepTime    ArrTime    CRSArrTime    UniqueCarrier    FlightNum    TailNum    ActualElapsedTime    CRSElapsedTime    AirTime    ArrDelay    DepDelay    Origin    Dest     Distance    TaxiIn    TaxiOut    Cancelled    CancellationCode    Diverted    CarrierDelay    WeatherDelay    NASDelay    SecurityDelay    LateAircraftDelay
    ____    _____    __________    _________    _______    __________    _______    __________    _____________    _________    _______    _________________    ______________    _______    ________    ________    ______    _____    ________    ______    _______    _________    ________________    ________    ____________    ____________    ________    _____________    _________________

    1987    10       28            3            1140       1140          1212       1215          'US'             262          'NA'       32                    35               'NA'        -3          0          'CLE'     'PIT'    105         'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10        9            5            1155       1155          1250       1300          'US'             282          'NA'       55                    65               'NA'       -10          0          'ROC'     'LGA'    254         'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10       22            4             715        715           807        803          'US'             304          'NA'       52                    48               'NA'         4          0          'DTW'     'PIT'    201         'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10       16            5            1553       1555          1641       1640          'US'             323          'NA'       48                    45               'NA'         1         -2          'BUF'     'PIT'    186         'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10       30            5            1821       1815          1956       1955          'US'             346          'NA'       95                   100               'NA'         1          6          'IND'     'EWR'    644         'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10       12            1            1300       1300          1529       1528          'US'             370          'NA'       89                    88               'NA'         1          0          'STL'     'PIT'    553         'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10        7            3             810        810           904        900          'US'             387          'NA'       54                    50               'NA'         4          0          'ELM'     'PIT'    208         'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10       19            1             733        735           827        831          'US'             407          'NA'       54                    56               'NA'        -4         -2          'ROC'     'CLE'    245         'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10       15            4             828        830           916        921          'US'             424          'NA'       48                    51               'NA'        -5         -2          'ORF'     'PHL'    211         'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10        4            7            1750       1735          1837       1816          'US'             449          'NA'       47                    41               'NA'        21         15          'PIT'     'CLE'    105         'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             

Read the DatabaseDatastore object two more times by using counter n. Read 10 records at a time.

n = 0;

while(hasdata(dbds) && n~=2)
     read(dbds)
     n = n+1;
end
ans = 

    Year    Month    DayofMonth    DayOfWeek    DepTime    CRSDepTime    ArrTime    CRSArrTime    UniqueCarrier    FlightNum    TailNum    ActualElapsedTime    CRSElapsedTime    AirTime    ArrDelay    DepDelay    Origin    Dest     Distance    TaxiIn    TaxiOut    Cancelled    CancellationCode    Diverted    CarrierDelay    WeatherDelay    NASDelay    SecurityDelay    LateAircraftDelay
    ____    _____    __________    _________    _______    __________    _______    __________    _____________    _________    _______    _________________    ______________    _______    ________    ________    ______    _____    ________    ______    _______    _________    ________________    ________    ____________    ____________    ________    _____________    _________________

    1987    10       16            5             959       1000          1212       1215          'US'             472          'NA'        73                   75               'NA'       -3          -1          'BNA'     'PIT'     462        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10       17            6            2020       2020          2100       2057          'US'             490          'NA'        40                   37               'NA'        3           0          'PIT'     'ERI'     109        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10        6            2            1132       1135          1426       1419          'US'             516          'NA'       114                  104               'NA'        7          -3          'MSP'     'PIT'     726        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10       24            6             944        945          1211       1213          'US'             535          'NA'        87                   88               'NA'       -2          -1          'STL'     'PIT'     553        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10       18            7             833        835          1003       1011          'US'             571          'NA'        90                   96               'NA'       -8          -2          'PIT'     'ATL'     526        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10       26            1            2356       2355           730        721          'US'             604          'NA'       274                  266               'NA'        9           1          'SFO'     'PIT'    2254        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10       29            4            1056       1055          1208       1215          'US'             627          'NA'        72                   80               'NA'       -7           1          'EWR'     'CLE'     404        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10        1            4            2304       2255          2340       2329          'US'             655          'NA'        36                   34               'NA'       11           9          'PBI'     'RSW'     103        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10       30            5            1329       1329          1434       1436          'US'             683          'NA'        65                   67               'NA'       -2           0          'LGA'     'BUF'     292        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10        3            6            1040       1040          1125       1120          'AS'              51          'NA'        45                   40               'NA'        5           0          'OME'     'OTZ'     183        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             


ans = 

    Year    Month    DayofMonth    DayOfWeek    DepTime    CRSDepTime    ArrTime    CRSArrTime    UniqueCarrier    FlightNum    TailNum    ActualElapsedTime    CRSElapsedTime    AirTime    ArrDelay    DepDelay    Origin    Dest     Distance    TaxiIn    TaxiOut    Cancelled    CancellationCode    Diverted    CarrierDelay    WeatherDelay    NASDelay    SecurityDelay    LateAircraftDelay
    ____    _____    __________    _________    _______    __________    _______    __________    _____________    _________    _______    _________________    ______________    _______    ________    ________    ______    _____    ________    ______    _______    _________    ________________    ________    ____________    ____________    ________    _____________    _________________

    1987    10       23            5            1855       1855          2158       2205          'AS'              66          'NA'       123                  130               'NA'        -7           0         'JNU'     'SEA'     909        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10       30            5            1055       1055          1302       1315          'AS'              91          'NA'       187                  200               'NA'       -13           0         'SEA'     'ANC'    1449        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10       28            3             NaN       1850           NaN       2050          'AS'             116          'NA'       NaN                  120               'NA'       NaN         NaN         'PDX'     'BUR'     817        'NA'      'NA'       1            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10       26            1            1600       1600          1649       1640          'AS'             172          'NA'        49                   40               'NA'         9           0         'LAX'     'SAN'     109        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10        6            2             745        745           833        825          'AS'             189          'NA'        48                   40               'NA'         8           0         'TUS'     'PHX'     110        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10       31            6            1350       1350          1612       1610          'AS'             395          'NA'       202                  200               'NA'         2           0         'SEA'     'ANC'    1449        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10       12            1            1253       1200          1359       1310          'AS'             747          'NA'        66                   70               'NA'        49          53         'BUR'     'SFO'     326        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10       19            1             650        645           852        815          'CO'              63          'NA'       182                  150               'NA'        37           5         'ORD'     'DEN'     888        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10       10            6            1640       1640          1712       1724          'CO'              89          'NA'        32                   44               'NA'       -12           0         'MCO'     'TPA'      80        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10        2            5            2030       2030          2127       2133          'CO'             111          'NA'        57                   63               'NA'        -6           0         'IAH'     'CRP'     201        'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             

Reset DatabaseDatastore Object

Reset the DatabaseDatastore object to the state where no data has been read from it. Resetting allows rereading from the same DatabaseDatastore object.

reset(dbds)

Read Every Record in DatabaseDatastore Object

Read every record in the DatabaseDatastore object in increments of 50,000 records at a time.

dbds.ReadSize = 50000;
data = readall(dbds);

Display the first three records of the full data set.

data(1:3,:)
ans = 

    Year    Month    DayofMonth    DayOfWeek    DepTime    CRSDepTime    ArrTime    CRSArrTime    UniqueCarrier    FlightNum    TailNum    ActualElapsedTime    CRSElapsedTime    AirTime    ArrDelay    DepDelay    Origin    Dest     Distance    TaxiIn    TaxiOut    Cancelled    CancellationCode    Diverted    CarrierDelay    WeatherDelay    NASDelay    SecurityDelay    LateAircraftDelay
    ____    _____    __________    _________    _______    __________    _______    __________    _____________    _________    _______    _________________    ______________    _______    ________    ________    ______    _____    ________    ______    _______    _________    ________________    ________    ____________    ____________    ________    _____________    _________________

    1987    10       28            3            1140       1140          1212       1215          'US'             262          'NA'       32                   35                'NA'        -3         0           'CLE'     'PIT'    105         'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10        9            5            1155       1155          1250       1300          'US'             282          'NA'       55                   65                'NA'       -10         0           'ROC'     'LGA'    254         'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             
    1987    10       22            4             715        715           807        803          'US'             304          'NA'       52                   48                'NA'         4         0           'DTW'     'PIT'    201         'NA'      'NA'       0            'NA'                0           'NA'            'NA'            'NA'        'NA'             'NA'             

Close DatabaseDatastore Object and Database Connection

close(dbds)