Working with Non-ASCII Characters in HDF5 Files

To enable sharing of HDF5 files across multiple locales, MATLAB® supports the use of non-ASCII characters in HDF5 files. This example shows you how to:

  • Create HDF5 files containing dataset and attribute names that have non-ASCII characters using the high-level functions.

  • Create variable-length string datasets containing non-ASCII characters using the low-level functions.

Create Dataset and Attribute Names Containing Non-ASCII Characters

Create an HDF5 file containing a dataset name and an attribute name that contain non-ASCII characters. To check that the names appear as expected, write data to the dataset and display the file information.

Create a dataset with a name (/数据集) that includes non-ASCII characters.

dsetName = ['/' char([25968 25454 38598])];
dsetDims = [5 2];
h5create('outfile.h5',['/grp1' dsetName],dsetDims,...
                                'TextEncoding','UTF-8');

Write data to the dataset.

dataToWrite = rand(dsetDims);
h5write('outfile.h5',['/grp1' dsetName],dataToWrite);

Create an attribute name (屬性名稱) that includes non-ASCII characters and assign a value to the attribute.

attrName = char([23660 24615 21517 31281]);
h5writeatt('outfile.h5','/',attrName,'I am an attribute',...
                                      'TextEncoding','UTF-8');

Display information about the file and check if the attribute name and dataset name appear correctly.

h5disp('outfile.h5')
HDF5 outfile.h5 
Group '/' 
    Attributes:
        '屬性名稱':  'I am an attribute'
    Group '/grp1' 
        Dataset '数据集' 
            Size:  5x2
            MaxSize:  5x2
            Datatype:   H5T_IEEE_F64LE (double)
            ChunkSize:  []
            Filters:  none
            FillValue:  0.000000
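
The same check can also be done programmatically rather than by reading the h5disp output. The following is a minimal sketch, not part of the original example: it writes to a hypothetical scratch file (tmpFile) so that outfile.h5 is left untouched, and it assumes that h5info decodes names as UTF-8 by default in your release. If it does not, the comparison may return false.

```matlab
% Hypothetical scratch file, used only for this check.
tmpFile = [tempname '.h5'];

dsetName = char([25968 25454 38598]);        % dataset name 数据集
attrName = char([23660 24615 21517 31281]);  % attribute name 屬性名稱

% Create the dataset and the attribute with UTF-8 encoded names.
h5create(tmpFile,['/grp1/' dsetName],[5 2],'TextEncoding','UTF-8');
h5writeatt(tmpFile,'/',attrName,'I am an attribute',...
                                'TextEncoding','UTF-8');

% Read the file structure back and compare the stored names.
info = h5info(tmpFile);
namesMatch = isequal(info.Attributes(1).Name,attrName) && ...
             isequal(info.Groups(1).Datasets(1).Name,dsetName);
disp(namesMatch)

% Remove the scratch file.
delete(tmpFile);
```

Comparing against the char vectors built from code points avoids depending on how the command window renders the characters.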

Create Variable-Length String Data Containing Non-ASCII Characters

Create a variable-length string dataset to store data containing non-ASCII characters using the low-level functions. Write the data to the dataset. Check if the data is written correctly.

Create data containing non-ASCII characters.

dataToWrite = {char([12487 12540 12479]) 'hello' ...
                   char([1605 1585 1581 1576 1575]); ...
               'world' char([1052 1080 1088])    ...
                   char([954 972 963 956 959 962])};
disp(dataToWrite)
    'データ'    'hello'    'مرحبا' 
    'world'    'Мир'      'κόσμος'
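
As an aside, the strings above need different numbers of bytes once encoded as UTF-8, even when they have the same number of characters, which is one reason a variable-length string type is convenient here. The sketch below is an illustration added to this example, not a step from the original. The names charCounts and byteCounts are made up for it.

```matlab
% Same data as above, built from Unicode code points.
dataToWrite = {char([12487 12540 12479]) 'hello' ...
                   char([1605 1585 1581 1576 1575]); ...
               'world' char([1052 1080 1088])    ...
                   char([954 972 963 956 959 962])};

% Number of characters in each string.
charCounts = cellfun(@numel,dataToWrite);

% Number of bytes each string occupies when encoded as UTF-8.
byteCounts = cellfun(@(s) numel(unicode2native(s,'UTF-8')),dataToWrite);

disp(charCounts)   % [3 5 5; 5 3 6]
disp(byteCounts)   % [9 5 10; 5 6 12]
```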

To write this data into a file, create an HDF5 file, then define a group and a dataset within that group.

Create the HDF5 file. The H5F_ACC_TRUNC flag overwrites the outfile.h5 file created in the previous section.

fileName = 'outfile.h5';
fileID = H5F.create(fileName,'H5F_ACC_TRUNC',...
                     'H5P_DEFAULT', 'H5P_DEFAULT');

To create a group whose name contains non-ASCII characters, first configure a link creation property list that uses UTF-8 character encoding.

lcplID = H5P.create('H5P_LINK_CREATE'); 
H5P.set_char_encoding(lcplID,H5ML.get_constant_value('H5T_CSET_UTF8'));
plist = 'H5P_DEFAULT';

Then, create the group (グループ).

grpName = char([12464 12523 12540 12503]);
grpID = H5G.create(fileID,grpName,lcplID,plist,plist);

Create a dataset that contains variable-length string data with non-ASCII characters. First, configure its data type.

typeID = H5T.copy('H5T_C_S1');
H5T.set_size(typeID,'H5T_VARIABLE');
H5T.set_cset(typeID,H5ML.get_constant_value('H5T_CSET_UTF8'));
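
To confirm that the data type is configured as intended before using it, you can query it with H5T.is_variable_str and H5T.get_cset. The following is a minimal sketch that repeats the three configuration calls so it runs on its own. The variables isVarLen and isUtf8 are introduced only for this check.

```matlab
% Configure a variable-length UTF-8 string type, as above.
typeID = H5T.copy('H5T_C_S1');
H5T.set_size(typeID,'H5T_VARIABLE');
utf8 = H5ML.get_constant_value('H5T_CSET_UTF8');
H5T.set_cset(typeID,utf8);

% Query the type: is it a variable-length string, and is its
% character set UTF-8?
isVarLen = H5T.is_variable_str(typeID);
isUtf8 = (H5T.get_cset(typeID) == utf8);
fprintf('variable-length: %d, UTF-8: %d\n',isVarLen,isUtf8);

% Close the type used for this check.
H5T.close(typeID);
```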

Now create the dataset by specifying its name, data type, and dimensions.

dsetName = 'datasetUtf8';
dataDims = [2 3];
h5DataDims = fliplr(dataDims);
h5MaxDims = h5DataDims;
spaceID = H5S.create_simple(2,h5DataDims,h5MaxDims);
dsetID = H5D.create(grpID,dsetName,typeID,spaceID,...
             'H5P_DEFAULT','H5P_DEFAULT','H5P_DEFAULT');
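
The call to fliplr is needed because the HDF5 library uses C-style (row-major) dimension ordering, while MATLAB uses column-major ordering. The sketch below, added here for illustration, creates a dataspace on its own and reads its dimensions back with H5S.get_simple_extent_dims to show the round trip. The names h5Dims and matlabDims are introduced only for this sketch.

```matlab
dataDims = [2 3];                % MATLAB size: 2 rows, 3 columns
h5DataDims = fliplr(dataDims);   % HDF5 (C-style) order

% Create a dataspace with fixed dimensions.
spaceID = H5S.create_simple(2,h5DataDims,h5DataDims);

% Query the dataspace. The dimensions come back in HDF5 order.
[numDims,h5Dims] = H5S.get_simple_extent_dims(spaceID);

% Flip again to recover the MATLAB size.
matlabDims = fliplr(h5Dims);
disp(isequal(matlabDims,dataDims))

H5S.close(spaceID);
```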

Write the data to the dataset.

H5D.write(dsetID,'H5ML_DEFAULT','H5S_ALL',...
               'H5S_ALL','H5P_DEFAULT',dataToWrite);

Read the data back.

dataRead = h5read('outfile.h5',['/' grpName '/' dsetName])
dataRead =

  2×3 cell array

    {'データ'}    {'hello'}    {'مرحبا' }
    {'world'}    {'Мир'  }    {'κόσμος'}

Check if data in the file matches the written data.

isequal(dataRead,dataToWrite)
ans =

  logical

   1

Close the open identifiers.

H5D.close(dsetID);
H5S.close(spaceID);
H5T.close(typeID);
H5G.close(grpID);
H5P.close(lcplID);
H5F.close(fileID);
