Importing CDF Files

Overview

The Common Data Format (CDF) was created by the National Space Science Data Center (NSSDC) to provide a self-describing data storage and manipulation format that matches the structure of scientific data and applications, such as statistical and numerical methods, visualization, and management. For more information about this format, see the CDF Web site.

MATLAB® provides two ways to access CDF files: a set of high-level functions and a package of low-level functions that provide direct access to the routines in the CDF C API library. The high-level functions provide a simpler interface for accessing CDF files. However, if you require more control over the import operation, such as data subsetting for large data sets, use the low-level functions.

High-Level CDF Import Functions

MATLAB includes high-level functions that you can use to get information about the contents of a Common Data Format (CDF) file and then read data from the file. The following sections provide more information.

Getting Information about the Contents of a CDF File

To get information about the contents of a CDF file, such as the names of variables in the CDF file, use the cdfinfo function. The cdfinfo function returns a structure containing general information about the file and detailed information about the variables and attributes in the file.

In the following example, the Variables field is a 6-by-6 cell array with one row for each of the six variables in the file. Taking a closer look at the contents of this field, you can see that the first variable, Time, is made up of 24 records containing CDF epoch data. The next two variables, Longitude and Latitude, each have only one associated record containing int8 data. For details about how to interpret the data returned in the Variables field, see cdfinfo.

    Note: Because cdfinfo creates temporary files, make sure that your current working directory is writable before attempting to use the function.

info = cdfinfo('example.cdf')

info = 

              Filename: 'example.cdf'
           FileModDate: '19-May-2010 12:03:11'
              FileSize: 1310
                Format: 'CDF'
         FormatVersion: '2.7.0'
          FileSettings: [1x1 struct]
              Subfiles: {}
             Variables: {6x6 cell}
      GlobalAttributes: [1x1 struct]
    VariableAttributes: [1x1 struct]

vars = info.Variables

vars = 

    'Time'                [1x2 double]    [24]    'epoch'     'T/'        'Full'
    'Longitude'           [1x2 double]    [ 1]    'int8'      'F/FT'      'Full'
    'Latitude'            [1x2 double]    [ 1]    'int8'      'F/TF'      'Full'
    'Data'                [1x3 double]    [ 1]    'double'    'T/TTT'     'Full'
    'multidimensional'    [1x4 double]    [ 1]    'uint8'     'T/TTTT'    'Full'
    'Temperature'         [1x2 double]    [10]    'int16'     'T/TT'      'Full'
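
The cell array in the Variables field can also be walked programmatically. The following is a minimal sketch, assuming example.cdf is on the MATLAB path and that the columns hold, in order, the variable name, dimensions, number of records, and data type, as the output above suggests. The local names (numVars, name, numRecords, dataType) are illustrative only.

```matlab
% List each variable in the file with its record count and data type.
info = cdfinfo('example.cdf');
vars = info.Variables;
numVars = size(vars,1);            % one row per variable

for k = 1:numVars
    name       = vars{k,1};        % column 1: variable name
    numRecords = vars{k,3};        % column 3: number of records
    dataType   = vars{k,4};        % column 4: data type
    fprintf('%-18s %3d record(s) of type %s\n',name,numRecords,dataType);
end
```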

Reading Data from a CDF File

To read all of the data in the CDF file, use the cdfread function. The function returns the data in a cell array. The columns of data correspond to the variables; the rows correspond to the records associated with a variable.

data = cdfread('example.cdf');

whos data
  Name       Size            Bytes  Class    Attributes

  data       24x6             16512  cell      

To read the data associated with one or more particular variables, use the 'Variable' parameter. Specify the names of the variables as text strings in a cell array. Variable names are case sensitive. The following example reads the Longitude and Latitude variables from the file.

var_long_lat = cdfread('example.cdf','Variable',{'Longitude','Latitude'});

whos var_long_lat
  Name              Size            Bytes  Class    Attributes

  var_long_lat      1x2               128  cell               
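
The list of names passed to the 'Variable' parameter does not have to be typed by hand. As a sketch, assuming the fourth column of info.Variables holds the data type as in the cdfinfo output shown earlier, the following selects every int8 variable and reads only those. The names isInt8, int8Names, and int8Data are illustrative.

```matlab
% Select variables by data type, then read only those variables.
info = cdfinfo('example.cdf');

isInt8 = strcmp(info.Variables(:,4),'int8');   % logical mask over the rows
int8Names = info.Variables(isInt8,1)';         % 1-by-n cell array of names

int8Data = cdfread('example.cdf','Variable',int8Names);
disp(int8Names)
```

For the example file, the mask should pick out Longitude and Latitude, so the result has the same shape as var_long_lat above.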

Speeding Up Read Operations

The cdfread function offers two ways to speed up read operations when working with large data sets:

  • Reducing the number of elements in the returned cell array

  • Returning CDF epoch values as MATLAB serial date numbers rather than as MATLAB cdfepoch objects

To reduce the number of elements in the returned cell array, specify the 'CombineRecords' parameter. By default, cdfread creates a cell array with a separate element for every variable and every record in each variable, padding the records dimension to create a rectangular cell array. For example, reading all the data from the example file produces an output cell array, 24-by-6, where the columns represent variables and the rows represent the records for each variable. When you set the 'CombineRecords' parameter to true, cdfread creates a separate element for each variable but saves time by putting all the records associated with a variable in a single cell array element. Thus, reading the data from the example file with 'CombineRecords' set to true produces a 1-by-6 cell array, as shown below.

data_combined = cdfread('example.cdf','CombineRecords',true);

whos
  Name                Size            Bytes  Class    Attributes

  data               24x6             16512  cell               
  data_combined       1x6              2544  cell               

When combining records, note that the dimensions of the data in the cell change. For example, if a variable has 20 records, each of which is a scalar value, the data in the cell array for the combined element contains a 20-by-1 vector of values. If each record is a 3-by-4 array, the cell array element contains a 20-by-3-by-4 array. For combined data, cdfread adds a dimension to the data, the first dimension, that is the index into the records.
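
The extra leading dimension can be checked directly. This sketch assumes that cdfread returns the variables in the same order as info.Variables, and it uses the Temperature variable, which the cdfinfo output lists with 10 records. The names idx, temperature, numRecords, and firstRecord are illustrative.

```matlab
% Inspect the combined data for one variable.
info = cdfinfo('example.cdf');
data_combined = cdfread('example.cdf','CombineRecords',true);

% Locate the Temperature column by name (assumes cdfread keeps the
% variable order reported by cdfinfo).
idx = find(strcmp(info.Variables(:,1),'Temperature'));
temperature = data_combined{idx};

numRecords = size(temperature,1);            % first dimension indexes records
firstRecord = squeeze(temperature(1,:,:));   % drop the record dimension
disp(size(temperature))
```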

Another way to speed up read operations is to read CDF epoch values as MATLAB serial date numbers. By default, cdfread creates a MATLAB cdfepoch object for each CDF epoch value in the file. If you specify the 'ConvertEpochToDatenum' parameter, setting it to true, cdfread returns CDF epoch values as MATLAB serial date numbers. For more information about working with MATLAB cdfepoch objects, see Representing CDF Time Values.

data_datenums = cdfread('example.cdf','ConvertEpochToDatenum',true);

whos
  Name                Size            Bytes  Class    Attributes

  data               24x6             16512  cell                
  data_combined       1x6              2544  cell                
  data_datenums      24x6             13536  cell    
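
Once the epoch values arrive as serial date numbers, they are ordinary doubles and can be gathered into a numeric vector. The following is a sketch only: it assumes the first column of the result holds the Time variable, matching the order in the cdfinfo output, and the names timeNums and timeStamps are illustrative.

```matlab
% Collect the Time values as a numeric vector and convert to datetime.
data_datenums = cdfread('example.cdf','ConvertEpochToDatenum',true);

% Column 1 is assumed to be Time; each cell holds one serial date number.
timeNums = cell2mat(data_datenums(:,1));
timeStamps = datetime(timeNums,'ConvertFrom','datenum');
disp(timeStamps(1:3))
```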

Representing CDF Time Values

CDF represents time differently than MATLAB does. CDF represents date and time as the number of milliseconds since 1-Jan-0000, a value called an epoch in CDF terminology. To represent CDF dates, MATLAB uses an object called a CDF epoch object. MATLAB can also represent a date and time as a datetime value or as a serial date number, which is the number of days since 0-Jan-0000. To access the time information in a CDF epoch object, use the object's todatenum method to convert it to a serial date number, which you can then convert to a datetime value.

For example, this code extracts the date information from a CDF epoch object:

  1. Extract the date information from the CDF epoch object returned in the cell array, data. Use the todatenum method of the CDF epoch object to get the date information, which is returned as a MATLAB serial date number.

    m_date = todatenum(data{1});

  2. Optionally, convert the MATLAB serial date number to a datetime value.

    datetime(m_date,'ConvertFrom','datenum')
    ans =
    
    01-Jan-2001 00:00:00
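
The relationship between the two representations can be worked out by hand from the definitions above: divide the epoch by the number of milliseconds in a day, then add 1 because the CDF count starts at 1-Jan-0000 while the serial date number starts one day earlier. This is an arithmetic sketch, not a description of how todatenum is implemented, and the epoch value used is a hypothetical one chosen to fall at midnight on 1-Jan-2001.

```matlab
% Convert a CDF epoch (milliseconds since 1-Jan-0000) to a serial date number.
msPerDay = 24*60*60*1000;        % 86400000 milliseconds in one day
epoch_ms = 63145526400000;       % hypothetical epoch: midnight, 1-Jan-2001

% CDF counts from 1-Jan-0000; serial date numbers count from 0-Jan-0000.
m_date = epoch_ms/msPerDay + 1;

datetime(m_date,'ConvertFrom','datenum')
```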

Using the CDF Library Low-Level Functions to Import Data

To import (read) data from a Common Data Format (CDF) file, you can use the MATLAB low-level CDF functions. The MATLAB functions correspond to dozens of routines in the CDF C API library. For a complete list of all the MATLAB low-level CDF functions, see cdflib.

This section does not attempt to describe all features of the CDF library or explain basic CDF programming concepts. To use the MATLAB CDF low-level functions effectively, you must be familiar with the CDF C interface. Documentation about CDF, version 3.3.0, is available at the CDF Web site.

The following example shows how to use low-level functions to read data from a CDF file.

  1. Open the sample CDF file. For information about creating a new CDF file, see Exporting to CDF Files.

    cdfid = cdflib.open('example.cdf');
    
  2. Get some information about the contents of the file, such as the number of variables in the file, the number of global attributes, and the number of attributes with variable scope.

    info = cdflib.inquire(cdfid)
    
    info = 
    
         encoding: 'IBMPC_ENCODING'
         majority: 'ROW_MAJOR'
           maxRec: 23
          numVars: 6
        numvAttrs: 1
        numgAttrs: 3
    
  3. Get information about the individual variables in the file. Variable ID numbers start at zero.

    info = cdflib.inquireVar(cdfid,0)
    
    info = 
    
               name: 'Time'
           datatype: 'cdf_epoch'
        numElements: 1
               dims: []
        recVariance: 1
        dimVariance: [] 
    
    info = cdflib.inquireVar(cdfid,1)
    
    info = 
    
               name: 'Longitude'
           datatype: 'cdf_int1'
        numElements: 1
               dims: [2 2]
        recVariance: 0
        dimVariance: [1 0]

  4. Read the data in a variable into the workspace. The first variable contains CDF Epoch time values. The low-level interface returns these as double values.

    data_time = cdflib.getVarRecordData(cdfid,0,0)
    
    data_time =
    
      6.3146e+013
    
    % Convert the epoch value to a time vector:
    % [year; month; day; hour; minute; second; millisecond]
    timeVec = cdflib.epochBreakdown(data_time)
    
    timeVec =
    
            2001
               1
               1
               0
               0
               0
               0

  5. Read a global attribute from the file.

    % Determine which attributes are global.
    info = cdflib.inquireAttr(cdfid,0)
    
    info = 
    
             name: 'SampleAttribute'
            scope: 'GLOBAL_SCOPE'
        maxgEntry: 4
         maxEntry: -1
    
    % Read the value of the attribute. Note you must use the 
    % cdflib.getAttrgEntry function for global attributes.
    value = cdflib.getAttrgEntry(cdfid,0,0)
    
    value =
    
    This is a sample entry.
    
  6. Close the CDF file.

    cdflib.close(cdfid);
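
The steps above can be combined into a short script that lists every variable and then reads only a few records of one of them, which is the kind of subsetting the low-level functions are suited to. This is a sketch under stated assumptions: it uses only the cdflib functions shown in the steps above, it assumes the Time variable has ID 0 as reported by cdflib.inquireVar, and the names fileInfo, varInfo, numToRead, and timeVecs are illustrative.

```matlab
% List all variables, then read the first three Time records.
cdfid = cdflib.open('example.cdf');
fileInfo = cdflib.inquire(cdfid);

% Variable ID numbers are zero-based.
for varNum = 0:fileInfo.numVars-1
    varInfo = cdflib.inquireVar(cdfid,varNum);
    fprintf('%d: %s (%s)\n',varNum,varInfo.name,varInfo.datatype);
end

% Record numbers are also zero-based; maxRec is the last valid record.
numToRead = 3;
timeVecs = zeros(7,numToRead);   % one 7-element time vector per column
for recNum = 0:numToRead-1
    epochValue = cdflib.getVarRecordData(cdfid,0,recNum);
    timeVecs(:,recNum+1) = cdflib.epochBreakdown(epochValue);
end

cdflib.close(cdfid);
disp(timeVecs)
```

Reading record by record keeps memory use proportional to the records requested rather than to the size of the file.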
    