HDF Compound Data and "strings"

plasmageek on 4 Feb 2025
Answered: Prathamesh on 17 Feb 2025
Hi,
I'm migrating code to MATLAB. The previous language wrote its data to an HDF file using a compound datatype. This is not how I would choose to do it in MATLAB; I love the higher-level functions and almost never touch the lower-level ones. However, I'd like the new code to be backwards compatible with existing data, which means I also need the compound datatype. It includes strings. I know HDF supports this, but it's unclear whether MATLAB supports writing it to an HDF file.
The error message I get is:
Error using hdf5lib2
The class of input data must be integer instead of char when the HDF5 class is H5T_STD_I8LE.
Error in H5D.write (line 100)
H5ML.hdf5lib2('H5Dwrite', varargin{:});
Error in hdfhell2p0 (line 53)
H5D.write(dsetid, cid, 'H5S_ALL', 'H5S_ALL', 'H5P_DEFAULT', params);
I can only assume it's upset by the char arrays/strings. I've tried different ways of creating the strings. I found one post that said they had to be characters, not strings, so I converted them. I found another that said they had to be uniform in length, so I set a specific length. That one doesn't make sense to me since I'm only writing a single value, but hey, grasping at straws.
If I remove the strings/characters, it works.
So, is there a way to keep the strings and if so, what am I missing?
%%make up a mixed structure
params.floc = 'C:\Folder1';
params.fname = 'fname.txt';
params.value = 10;
%%actual code
outname = 'test09.h5';
plist = 'H5P_DEFAULT';
fNames = fieldnames(params);
for ii = 1:numel(fNames)
    temp = params.(fNames{ii});
    if ~isnan(str2double(temp))
        params.(fNames{ii}) = str2double(temp);
        Vsize(ii) = 8;
        Vclass{ii} = 'H5T_NATIVE_DOUBLE';
    elseif ischar(temp)
        temp = char(pad(params.(fNames{ii}),15,'right'));
        params.(fNames{ii}) = temp;
        Vsize(ii) = strlength(temp);
        Vclass{ii}= 'H5T_NATIVE_CHAR';
    end
end
H5F.create(outname, 'H5F_ACC_TRUNC', 'H5P_DEFAULT', 'H5P_DEFAULT');
fid = H5F.open(outname,'H5F_ACC_RDWR',plist);
H5G.create(fid,'/SETTINGS',plist,plist,plist);
gid = H5G.open(fid,'/SETTINGS');
cid = H5T.create('H5T_COMPOUND', sum(Vsize));
for ii = 1:numel(fNames)
    if ii == 1
        H5T.insert(cid, fNames{ii}, 0, H5T.copy(Vclass{ii}));
    else
        if strcmp(Vclass{ii},'H5T_NATIVE_CHAR')
            string_type = H5T.array_create(H5T.copy('H5T_NATIVE_CHAR'), 1, 20);
            H5T.insert(cid, fNames{ii}, sum(Vsize(1:(ii-1))), string_type)
        else
            H5T.insert(cid, fNames{ii}, sum(Vsize(1:(ii-1))), H5T.copy(Vclass{ii}));
        end
    end
end
dsid = H5S.create_simple(1,1,[]);
dsetid = H5D.create(fid, '/SETTINGS/PASS', cid, dsid, 'H5P_DEFAULT');
H5D.write(dsetid, cid, 'H5S_ALL', 'H5S_ALL', 'H5P_DEFAULT', params);
H5D.close(dsetid)
H5S.close(dsid)
H5T.close(cid)
H5G.close(gid)
H5F.close(fid)

Answers (1)

Prathamesh on 17 Feb 2025
I understand that, after moving the code to MATLAB, H5D.write fails on the string fields with a datatype mismatch: the HDF5 type expects integer data but receives char data. Converting strings to character arrays and padding them to a uniform length did not help, and the write succeeds once the string fields are removed.
As far as I can tell, these are the changes that might solve the issue:
  1. String Datatype: Each string member is created from H5T_C_S1 instead of H5T_NATIVE_CHAR. HDF5 treats H5T_NATIVE_CHAR as an 8-bit integer (the H5T_STD_I8LE named in the error message), which is why char data is rejected.
  2. Null-Termination for Strings: The string length, maxLength, is set to include space for the null terminator, so that each string is handled as a fixed-length string.
  3. Datatype Size: For each string, the datatype size is set to the full length including the null terminator, and member offsets are accumulated from these sizes.
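The size and offset bookkeeping described above can be checked on its own, without touching the HDF5 library. The following is only a minimal sketch using the example values from the question; the variable names offsets and totalSize are introduced here for illustration and do not appear in the original code.

```matlab
% Example struct from the question (illustrative values only).
params.floc  = 'C:\Folder1';
params.fname = 'fname.txt';
params.value = 10;

fNames = fieldnames(params);
Vsize  = zeros(1, numel(fNames));
for ii = 1:numel(fNames)
    temp = params.(fNames{ii});
    if ischar(temp)
        Vsize(ii) = length(temp) + 1;  % characters plus one byte for the null terminator
    else
        Vsize(ii) = 8;                 % a double occupies 8 bytes
    end
end

% Byte offset of each member inside the compound type (hypothetical helper
% variables): every member starts where the previous ones end.
offsets   = [0, cumsum(Vsize(1:end-1))];
totalSize = sum(Vsize);                % the size that would be passed to H5T.create

disp(Vsize);      % 11 10 8
disp(offsets);    % 0 11 21
disp(totalSize);  % 29
```

Printing these three values before creating the datatype makes it easy to spot a size that does not match the data actually being written.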
Here is the modified code that takes care of the issues mentioned above:
params.floc = 'C:\Folder1';
params.fname = 'fname.txt';
params.value = 10;
outname = 'test09.h5';
plist = 'H5P_DEFAULT';
fNames = fieldnames(params);
Vsize = zeros(1, numel(fNames));
Vclass = cell(1, numel(fNames));
for ii = 1:numel(fNames)
    temp = params.(fNames{ii});
    if isnumeric(temp)
        Vsize(ii) = 8; % Size for double
        Vclass{ii} = 'H5T_NATIVE_DOUBLE';
    elseif ischar(temp)
        maxLength = length(temp) + 1; % Include null terminator
        temp = char(pad(params.(fNames{ii}), maxLength - 1, 'right'));
        params.(fNames{ii}) = temp;
        Vsize(ii) = maxLength;
        Vclass{ii} = 'H5T_NATIVE_CHAR';
    end
end
H5F.create(outname, 'H5F_ACC_TRUNC', 'H5P_DEFAULT', 'H5P_DEFAULT');
fid = H5F.open(outname, 'H5F_ACC_RDWR', plist);
H5G.create(fid, '/SETTINGS', plist, plist, plist);
gid = H5G.open(fid, '/SETTINGS');
cid = H5T.create('H5T_COMPOUND', sum(Vsize));
offset = 0;
for ii = 1:numel(fNames)
    if strcmp(Vclass{ii}, 'H5T_NATIVE_CHAR')
        % Use a fixed-length C string type for char fields
        string_type = H5T.copy('H5T_C_S1');
        H5T.set_size(string_type, Vsize(ii)); % Set size for null-terminated strings
        H5T.insert(cid, fNames{ii}, offset, string_type);
    else
        H5T.insert(cid, fNames{ii}, offset, H5T.copy(Vclass{ii}));
    end
    offset = offset + Vsize(ii); % Next member starts after this one
end
dsid = H5S.create_simple(1, 1, []);
dsetid = H5D.create(fid, '/SETTINGS/PASS', cid, dsid, 'H5P_DEFAULT');
H5D.write(dsetid, cid, 'H5S_ALL', 'H5S_ALL', 'H5P_DEFAULT', params);
H5D.close(dsetid);
H5S.close(dsid);
H5T.close(cid);
H5G.close(gid);
H5F.close(fid);
I have added comments wherever required. Since I don’t have the artifacts and the files which you are using, I was unable to verify the solution at my end, but this might work for you.
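If the write goes through, one way to check the result is to inspect the dataset with the high-level functions. This is only a sketch: it assumes the file test09.h5 and the dataset /SETTINGS/PASS from the code above already exist, and it has not been run against your data.

```matlab
outname = 'test09.h5';        % file name used in the code above
dsetPath = '/SETTINGS/PASS';  % dataset path used in the code above

if isfile(outname)
    % Show the compound datatype as stored, including member names and sizes.
    h5disp(outname, dsetPath);

    % h5read returns a compound dataset as a struct with one field per member.
    data = h5read(outname, dsetPath);
    disp(fieldnames(data));
else
    disp('File not found; run the writing code first.');
end
```

Comparing the h5disp output for a new file with that of a file written by the previous code should show whether the member names, string sizes, and offsets match.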

Release

R2024b
