write

Write tall array to disk for checkpointing

Syntax

write(location,tA)

Description

write(location,tA) calculates the values in tall array tA and then writes the array to files in the folder specified by location. The data is stored in an efficient binary format suitable for reading back using datastore(location).

Examples

Write a tall array to disk, and then recover it by creating a new datastore for the written files. This process is useful for saving your work or sharing a tall array with a colleague.

Create a datastore for the airlinesmall.csv data set. Select only the Month, Year, and UniqueCarrier variables, and treat 'NA' values as missing data. Convert the datastore into a tall table.

ds = datastore('airlinesmall.csv');
ds.TreatAsMissing = 'NA';
ds.SelectedVariableNames = {'Month','Year','UniqueCarrier'};
tt = tall(ds)
tt =

  M×3 tall table 

    Month    Year    UniqueCarrier
    _____    ____    _____________

    10       1987    'PS'         
    10       1987    'PS'         
    10       1987    'PS'         
    10       1987    'PS'         
    10       1987    'PS'         
    10       1987    'PS'         
    10       1987    'PS'         
    10       1987    'PS'         
    :        :       :
    :        :       :

Sort the data in descending order by year and extract the top 25 rows. The resulting tall table is unevaluated.

tt_new = topkrows(tt,25,'Year')
tt_new =

  M×3 tall table 

    Month    Year    UniqueCarrier
    _____    ____    _____________

    ?        ?       ?            
    ?        ?       ?            
    ?        ?       ?            
    :        :       :
    :        :       :

Save the results to a new folder named ExampleData on the C:\ disk. (You might want to specify a different write location, especially if you are not using a Windows® computer.) The write function evaluates the tall array prior to writing the files, so there is no need to use the gather function prior to saving the data.

location = 'C:\ExampleData';
write(location,tt_new)
Writing tall data to folder C:\ExampleData
Evaluating tall expression using the Local MATLAB Session:
- Pass 1 of 1: Completed in 0 sec
Evaluation completed in 0 sec
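
If C:\ is not a suitable location, for example on a non-Windows computer, one option is to build the path from the system temporary folder. The following is a minimal sketch under that assumption; the folder name ExampleData is taken from this example, and the choice of tempdir is an illustration rather than a requirement of write.

```matlab
% Build a platform-independent write location under the temporary folder.
% tempdir and fullfile are standard MATLAB functions; the choice of the
% temporary folder is an assumption made for this sketch.
location = fullfile(tempdir,'ExampleData');

% Show the resolved path before passing it to write(location,tt_new).
disp(location)
```

The rest of the example works unchanged with this value of location.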

Clear tt and ds from your workspace. To recover the tall table that was written to disk, first create a new datastore that references the same folder. Then convert the datastore into a tall table. Since the tall table was evaluated before being written to disk, the display now includes a preview of the values.

clear tt ds
ds2 = datastore(location);
tt2 = tall(ds2)
tt2 =

  M×3 tall table 

    Month    Year    UniqueCarrier
    _____    ____    _____________

    1        2008    'WN'         
    1        2008    'WN'         
    1        2008    'WN'         
    1        2008    'WN'         
    1        2008    'WN'         
    1        2008    'WN'         
    1        2008    'WN'         
    1        2008    'WN'         
    :        :       :
    :        :       :

Input Arguments

location

Folder location to write data, specified as a character vector or string. location can specify a full or relative path. The specified folder can be either of these options:

  • Existing empty folder that contains no other files

  • New folder that write creates

Additional considerations apply for Hadoop® and Apache Spark™:

  • If the folder is not available locally, then the full path of the folder must be an internationalized resource identifier (IRI) of the form hdfs:///path_to_file. For more information, see Read Remote Data.

  • Before writing to HDFS™, set the HADOOP_HOME, HADOOP_PREFIX, or MATLAB_HADOOP_INSTALL environment variable to the folder where Hadoop is installed.

  • Before writing to Apache Spark, set the SPARK_HOME environment variable to the folder where Apache Spark is installed.
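
The environment variables named above can be set from within MATLAB by using setenv. This is a sketch only: both installation paths below are hypothetical placeholders, so replace them with the folders where Hadoop and Apache Spark are installed on your system.

```matlab
% Hypothetical installation folders (assumptions for illustration only).
hadoopDir = '/usr/local/hadoop';
sparkDir  = '/usr/local/spark';

% Set the variables for the current MATLAB session.
setenv('HADOOP_HOME',hadoopDir);
setenv('SPARK_HOME',sparkDir);

% Confirm the values that MATLAB now reports.
disp(getenv('HADOOP_HOME'))
disp(getenv('SPARK_HOME'))
```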

Example: location = 'hdfs:///some/output/folder'

Example: location = '../../dir/data'

Example: location = 'C:\Users\MyName\Desktop'

Data Types: char | string
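
For a local path, the folder requirement above (an empty existing folder, or a folder that does not exist yet) can be checked before calling write. The following is a minimal sketch, not part of the write function itself; the folder name ExampleDataCheck, the variable isUsable, and the error identifier are hypothetical names chosen for illustration.

```matlab
% Hypothetical output folder; substitute your own path.
location = fullfile(tempdir,'ExampleDataCheck');

if exist(location,'dir') ~= 7
    % The folder does not exist yet, so write can create it.
    isUsable = true;
else
    % The folder exists: it must contain nothing other than
    % the '.' and '..' entries that dir reports.
    listing = dir(location);
    names = {listing.name};
    isUsable = all(ismember(names,{'.','..'}));
end

if ~isUsable
    error('Example:folderNotEmpty', ...
        'Folder %s already contains files. Choose an empty or new folder.', ...
        location);
end
```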

tA

Input array, specified as a tall array.

Tips

  • Use the write function to create checkpoints or snapshots of your data as you work, especially when working with huge data sets. This practice allows you to reconstruct tall arrays directly from files on disk rather than reexecuting all of the commands that produced the tall array.
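
One possible checkpointing pattern is sketched below: reuse a snapshot if it already exists, and otherwise compute the tall array and write it. The folder name tallCheckpoint is a hypothetical choice, and the sketch assumes that an existing checkpoint folder was produced by an earlier, complete call to write.

```matlab
% Hypothetical checkpoint folder under the system temporary folder.
checkpointDir = fullfile(tempdir,'tallCheckpoint');

if exist(checkpointDir,'dir') ~= 7
    % No snapshot yet: build the tall table and write it to disk.
    ds = datastore('airlinesmall.csv');
    ds.TreatAsMissing = 'NA';
    ds.SelectedVariableNames = {'Month','Year','UniqueCarrier'};
    tt = topkrows(tall(ds),25,'Year');
    write(checkpointDir,tt)   % write evaluates tt before saving
end

% Reconstruct the tall table from the files on disk rather than
% reexecuting the commands that produced it.
tt = tall(datastore(checkpointDir));
```

Later steps can then start from tt without repeating the earlier computation.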

Introduced in R2016b