tall

Create tall array

Syntax

t = tall(ds)
t = tall(A)

Description

t = tall(ds) creates a tall array on top of datastore ds.

  • If ds is a datastore for tabular data (so that the read and readall methods of datastore return tables), then t is a tall table. Tabular data is data that is arranged in a rectangular fashion with each row having the same number of entries.

  • Otherwise, t is a tall cell array.

t = tall(A) converts the in-memory array A into a tall array. The underlying data type of t is the same as class(A).
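
The distinction between the class of the tall array itself and the class of the data it holds can be checked with a short sketch. The variable names are illustrative, and the call to mapreducer(0) is an assumption made here so that evaluation stays in the local MATLAB session; it is not required by tall.

```matlab
% Sketch only: variable names are illustrative, not part of the tall API.
mapreducer(0);                 % assumption: evaluate locally, without a parallel pool

A  = int8([1 2 3; 4 5 6]);     % small in-memory array
tA = tall(A);                  % convert it to a tall array

wrapperClass    = class(tA);                    % class of the tall container
underlyingClass = gather(classUnderlying(tA));  % class of the data stored inside

fprintf('class(tA) = %s, underlying class = %s\n', wrapperClass, underlyingClass);
```

Here class reports the container type, while classUnderlying reports the type of the stored data, which should match class(A).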

Examples

Convert a datastore into a tall array.

First, create a datastore for the data set using datastore(location), where location is either a full or a relative file location. The location argument can specify:

  • A single file, such as 'airlinesmall.csv'

  • Several files with the same extension, such as '*.csv'

  • An entire folder of files, such as 'C:\MyData'

datastore also has several options to specify file and text format properties when you create the datastore.

Create a datastore for the airlinesmall.csv data set. Treat 'NA' values as missing data so that they are replaced with NaN values. Select a small subset of the variables to work with.

varnames = {'ArrDelay', 'DepDelay', 'Origin', 'Dest'};
ds = datastore('airlinesmall.csv', 'TreatAsMissing', 'NA', ...
    'SelectedVariableNames', varnames);

Use tall to create a tall array for the data in the datastore. Since the data in ds is tabular, the result is a tall table. If the data is not tabular, then tall creates a tall cell array instead.

T = tall(ds)
T =

  Mx4 tall table

    ArrDelay    DepDelay    Origin    Dest 
    ________    ________    ______    _____

     8          12          'LAX'     'SJC'
     8           1          'SJC'     'BUR'
    21          20          'SAN'     'SMF'
    13          12          'BUR'     'SJC'
     4          -1          'SMF'     'LAX'
    59          63          'LAX'     'SJC'
     3          -2          'SAN'     'SFO'
    11          -1          'SEA'     'LAX'
    :           :           :         :
    :           :           :         :

You can use many common MATLAB® operators and functions to work with tall arrays. For a list of supported functions, see the MATLAB documentation on functions that support tall arrays.
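
As a rough illustration of working with a tall table through ordinary operators, the following sketch computes two summary values from the airline data. The 15-minute threshold and the variable names are assumptions chosen for the example, and mapreducer(0) is used only to keep evaluation in the local session.

```matlab
% Sketch only: threshold and variable names are illustrative assumptions.
mapreducer(0);                 % assumption: evaluate locally, without a parallel pool

ds = datastore('airlinesmall.csv', 'TreatAsMissing', 'NA', ...
    'SelectedVariableNames', {'ArrDelay', 'DepDelay'});
T = tall(ds);

% These operations are deferred; nothing is read from disk yet.
avgDelayDeferred = mean(T.ArrDelay, 'omitnan');  % mean arrival delay, ignoring NaN
numLateDeferred  = sum(T.ArrDelay > 15);         % flights more than 15 minutes late

% gather triggers the evaluation and returns in-memory values.
[avgDelay, numLate] = gather(avgDelayDeferred, numLateDeferred);
```

Both results are scalars, so bringing them into memory is safe regardless of how large the underlying data set is.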

Convert a datastore into a tall table, calculate its size using a deferred calculation, and then perform the calculation and return the result in memory.

First, create a datastore for the airlinesmall.csv data set. Treat 'NA' values as missing data so that they are replaced with NaN values. Set the text format of a few columns so that they are read as a cell array of character vectors. Convert the datastore into a tall table.

ds = datastore('airlinesmall.csv', 'TreatAsMissing', 'NA');
ds.SelectedFormats{strcmp(ds.SelectedVariableNames, 'TailNum')} = '%s';
ds.SelectedFormats{strcmp(ds.SelectedVariableNames, 'CancellationCode')} = '%s';
T = tall(ds)
T =

  Mx29 tall table

    Year    Month    DayofMonth    DayOfWeek    DepTime    CRSDepTime    ArrTime    CRSArrTime    UniqueCarrier    FlightNum    TailNum    ActualElapsedTime    CRSElapsedTime    AirTime    ArrDelay    DepDelay    Origin    Dest     Distance    TaxiIn    TaxiOut    Cancelled    CancellationCode    Diverted    CarrierDelay    WeatherDelay    NASDelay    SecurityDelay    LateAircraftDelay
    ____    _____    __________    _________    _______    __________    _______    __________    _____________    _________    _______    _________________    ______________    _______    ________    ________    ______    _____    ________    ______    _______    _________    ________________    ________    ____________    ____________    ________    _____________    _________________

    1987    10       21            3             642        630           735        727          'PS'             1503         'NA'        53                   57               NaN         8          12          'LAX'     'SJC'    308         NaN       NaN        0            'NA'                0           NaN             NaN             NaN         NaN              NaN              
    1987    10       26            1            1021       1020          1124       1116          'PS'             1550         'NA'        63                   56               NaN         8           1          'SJC'     'BUR'    296         NaN       NaN        0            'NA'                0           NaN             NaN             NaN         NaN              NaN              
    1987    10       23            5            2055       2035          2218       2157          'PS'             1589         'NA'        83                   82               NaN        21          20          'SAN'     'SMF'    480         NaN       NaN        0            'NA'                0           NaN             NaN             NaN         NaN              NaN              
    1987    10       23            5            1332       1320          1431       1418          'PS'             1655         'NA'        59                   58               NaN        13          12          'BUR'     'SJC'    296         NaN       NaN        0            'NA'                0           NaN             NaN             NaN         NaN              NaN              
    1987    10       22            4             629        630           746        742          'PS'             1702         'NA'        77                   72               NaN         4          -1          'SMF'     'LAX'    373         NaN       NaN        0            'NA'                0           NaN             NaN             NaN         NaN              NaN              
    1987    10       28            3            1446       1343          1547       1448          'PS'             1729         'NA'        61                   65               NaN        59          63          'LAX'     'SJC'    308         NaN       NaN        0            'NA'                0           NaN             NaN             NaN         NaN              NaN              
    1987    10        8            4             928        930          1052       1049          'PS'             1763         'NA'        84                   79               NaN         3          -2          'SAN'     'SFO'    447         NaN       NaN        0            'NA'                0           NaN             NaN             NaN         NaN              NaN              
    1987    10       10            6             859        900          1134       1123          'PS'             1800         'NA'       155                  143               NaN        11          -1          'SEA'     'LAX'    954         NaN       NaN        0            'NA'                0           NaN             NaN             NaN         NaN              NaN              
    :       :        :             :            :          :             :          :             :                :            :          :                    :                 :          :           :           :         :        :           :         :          :            :                   :           :               :               :           :                :
    :       :        :             :            :          :             :          :             :                :            :          :                    :                 :          :           :           :         :        :           :         :          :            :                   :           :               :               :           :                :

The display of the tall table indicates that MATLAB® does not yet know how many rows of data are in the table.

Calculate the size of the tall table. Since calculating the size of a tall array requires a full pass through the data, MATLAB does not immediately calculate the value. Instead, like most operations with tall arrays, the result is an unevaluated tall array whose values and size are currently unknown.

s = size(T)
s =

  1x2 tall double row vector

    ?    ?

Use the gather function to perform the deferred calculation and return the result in memory. The result returned by size is a trivially small 1-by-2 vector, which fits in memory.

sz = gather(s)
Evaluating tall expression using the Local MATLAB Session:
- Pass 1 of 1: Completed in 2 sec
Evaluation completed in 2 sec
sz = 

      123523          29
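
When several deferred results are needed, they can be passed to gather in a single call so that they are evaluated together. The sketch below uses a small in-memory array converted to a tall array; the names are illustrative, and mapreducer(0) is an assumption to keep evaluation local.

```matlab
% Sketch only: variable names are illustrative.
mapreducer(0);                 % assumption: evaluate locally, without a parallel pool

tX = tall((1:1000)');          % tall column vector holding 1, 2, ..., 1000

% Each call below returns an unevaluated tall scalar.
loDeferred    = min(tX);
hiDeferred    = max(tX);
totalDeferred = sum(tX);

% One gather call evaluates all three deferred results.
[lo, hi, total] = gather(loDeferred, hiDeferred, totalDeferred);
```

Gathering related results in one call gives MATLAB the chance to combine the work instead of reading the data once per result.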

If you use gather on an unreduced tall array, then the result might not fit in memory. If you are unsure whether the result returned by gather can fit in memory, use gather(head(X)) or gather(tail(X)) to bring only a small portion of the calculation result into memory.
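
A minimal sketch of this preview pattern follows. It assumes a tall array built from an in-memory vector so that the example is self-contained; the names and the choice of five rows are illustrative.

```matlab
% Sketch only: variable names and the row count are illustrative.
mapreducer(0);                 % assumption: evaluate locally, without a parallel pool

tX = tall((1:1000)');
tY = tX .^ 2;                  % unreduced result: as many rows as tX

% Bring only a few rows into memory instead of the whole result.
firstRows = gather(head(tY, 5));   % first five rows
lastRows  = gather(tail(tY, 5));   % last five rows
```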

Create an in-memory array of random numbers, and then convert it into a tall array. Creating tall arrays from in-memory arrays is useful for debugging or prototyping new programs.

A = rand(100,4);
tA = tall(A)
tA =

  100x4 tall double matrix

    0.8147    0.1622    0.6443    0.0596
    0.9058    0.7943    0.3786    0.6820
    0.1270    0.3112    0.8116    0.0424
    0.9134    0.5285    0.5328    0.0714
    0.6324    0.1656    0.3507    0.5216
    0.0975    0.6020    0.9390    0.0967
    0.2785    0.2630    0.8759    0.8181
    0.5469    0.6541    0.5502    0.8175
    :         :         :         :
    :         :         :         :
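
One way to use this for prototyping is to run the same calculation on the in-memory array and on its tall counterpart, then compare the results. The sketch below does this for a column-wise maximum; the names are illustrative, and mapreducer(0) is an assumption to keep evaluation local.

```matlab
% Sketch only: variable names are illustrative.
mapreducer(0);                 % assumption: evaluate locally, without a parallel pool

A  = rand(100, 4);
tA = tall(A);

inMemoryResult = max(A, [], 1);            % ordinary in-memory calculation
tallResult     = gather(max(tA, [], 1));   % same calculation through the tall array

resultsMatch = isequal(inMemoryResult, tallResult);
fprintf('Tall and in-memory results match: %d\n', resultsMatch);
```

Once the tall version agrees with the in-memory version on a small sample, the same code can be pointed at a tall array backed by a datastore.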

Input Arguments

ds: Input datastore, specified as a datastore object. Use the datastore function to create a datastore object for your data set.

Example: ds = datastore('airlinesmall.csv') specifies a single file.

Example: ds = datastore('*.csv') specifies a collection of .csv files.

Example: ds = datastore('C:\MyData') specifies a folder of files.

Example: ds = datastore('hdfs:///data/') specifies a data set in an HDFS file system.

A: In-memory variable, specified as an array.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | logical | table | string | cell | categorical | datetime | duration | calendarDuration
Complex Number Support: Yes

Output Arguments

t: Tall array.

  • When converting a datastore, t is a tall table if the datastore contains tabular data. Otherwise, t is a tall cell array.

  • When converting an in-memory array, the underlying data type of t is the same as class(A).

Tips

  • See Extend Tall Arrays with Other Products for information on how to use tall arrays with:

    • Statistics and Machine Learning Toolbox™

    • Parallel Computing Toolbox™

    • MATLAB® Distributed Computing Server™

    • Database Toolbox™

    • MATLAB Compiler™

Introduced in R2016b
