Possible bug in H5D.write, truncation of VLEN strings
Hello,
I have discovered a potential bug, or at least some flaky behavior, when using the low-level HDF5 write function H5D.write. When I try to write a long string as a variable-length string, it appears to get truncated at 512 bytes (511 characters plus the terminating null). I can write the same string just fine as a fixed-length string.
The minimal script below reproduces the problem. I see this on R2012a on both Linux and Mac. Am I missing a parameter or function call that controls the VLEN buffer size, or is something improperly hard-coded in the underlying MEX function?
Cheers, Souheil
-------------
% Create a long string
str = repmat('Hello from matlab. ',[1 1000]);
fprintf('Size of string = %d\n',length(str));
% Create an HDF5 file
filename = 'vlen_string_bug.h5';
fid = H5F.create(filename,'H5F_ACC_TRUNC','H5P_DEFAULT','H5P_DEFAULT');
% Write to a dataset as a variable length string
VLstr_type = H5T.copy('H5T_C_S1');
H5T.set_size(VLstr_type,'H5T_VARIABLE');
space = H5S.create_simple(1, 1, []);
dset = H5D.create(fid, 'VLstr', VLstr_type, space, 'H5P_DEFAULT');
fprintf('Size of VLEN_BUF before = %d\n',H5D.vlen_get_buf_size(dset, VLstr_type, space));
H5D.write(dset, VLstr_type, 'H5S_ALL', 'H5S_ALL', 'H5P_DEFAULT', {str});
fprintf('Size of VLEN_BUF after = %d\n',H5D.vlen_get_buf_size(dset, VLstr_type, space));
H5T.close(VLstr_type);
H5S.close(space);
H5D.close(dset);
% Write to a dataset as a fixed length string
Fstr_type = H5T.copy('H5T_C_S1');
H5T.set_size(Fstr_type, length(str));
space = H5S.create_simple(1, 1, []);
dset = H5D.create(fid, 'Fstr', Fstr_type, space, 'H5P_DEFAULT');
H5D.write(dset, Fstr_type, 'H5S_ALL', 'H5S_ALL', 'H5P_DEFAULT', str);
H5T.close(Fstr_type);
H5S.close(space);
H5D.close(dset);
% Close the file
H5F.close(fid);
% Read the strings back in using the high-level read function
t = h5read(filename,'/VLstr');
vlstr = t{1};
fprintf('Size of VLEN string on disk = %d\n',length(vlstr));
t = h5read(filename,'/Fstr');
fstr = t{1};
fprintf('Size of fixed string on disk = %d\n',length(fstr));
Accepted Answer
More Answers (1)
per isakson
on 15 Sep 2012
Edited: per isakson
on 15 Sep 2012
However, the HDF5 User's Guide, page 228, says:
[...] a length and data buffer must be allocated.
I don't see how.
This is not much of an answer. However, could it be that 512 is a default value that needs to be replaced by an appropriate value?
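One way to test the 512-byte guess is to write variable-length strings of several lengths around that boundary and read each one back through the low-level interface. The sketch below reuses only the H5 calls from the original script plus H5D.read; the file name vlen_probe.h5, the dataset names, and the chosen lengths are assumptions made for illustration, and no particular output is implied.

```matlab
% Probe string lengths around the suspected 512-byte limit.
% 'vlen_probe.h5' and the 'len_<n>' dataset names are hypothetical.
filename = 'vlen_probe.h5';
fid = H5F.create(filename, 'H5F_ACC_TRUNC', 'H5P_DEFAULT', 'H5P_DEFAULT');

% Variable-length C string type, built the same way as in the original script
vl_type = H5T.copy('H5T_C_S1');
H5T.set_size(vl_type, 'H5T_VARIABLE');
space = H5S.create_simple(1, 1, []);

for n = [100 510 511 512 513 1024 19000]
    str = repmat('x', 1, n);
    dset = H5D.create(fid, sprintf('len_%d', n), vl_type, space, 'H5P_DEFAULT');
    H5D.write(dset, vl_type, 'H5S_ALL', 'H5S_ALL', 'H5P_DEFAULT', {str});
    % Read the value back through the same low-level interface
    out = H5D.read(dset, vl_type, 'H5S_ALL', 'H5S_ALL', 'H5P_DEFAULT');
    fprintf('wrote %5d chars, read back %5d\n', n, length(out{1}));
    H5D.close(dset);
end

H5S.close(space);
H5T.close(vl_type);
H5F.close(fid);
```

If the read-back length stops growing at 511 while the written length keeps increasing, that would point to a fixed buffer on the write or read path rather than anything specific to the test string.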
4 Comments
Oleg Komarov
on 15 Sep 2012
Edited: Oleg Komarov
on 15 Sep 2012
I found a description of the fields returned by H5F.get_mdc_config at http://www.hdfgroup.org/HDF5/doc/RM/RM_H5F.html#File-SetMdcConfig, and maybe the properties set_initial_size and initial_size are relevant to the buffer.
However, I am unsure where to set those properties: at the file, dataset, or property list level (H5F, H5D, H5P)...
I think it would be faster if you submitted a technical support request to TMW or to the HDFgroup.
Post any solution here (I am curious as well).
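For anyone who wants to look at the two fields named in the comment above, a minimal sketch is to open the file written by the original script and print its metadata cache configuration. It assumes vlen_string_bug.h5 already exists in the current folder; whether these settings have any bearing on the VLEN truncation is not established in this thread.

```matlab
% Inspect the metadata cache configuration of the file from the original script.
% Assumes 'vlen_string_bug.h5' already exists in the current folder.
fid = H5F.open('vlen_string_bug.h5', 'H5F_ACC_RDONLY', 'H5P_DEFAULT');
config = H5F.get_mdc_config(fid);

% Show every field, then the two mentioned in the comment above
disp(config);
fprintf('set_initial_size = %d\n', config.set_initial_size);
fprintf('initial_size     = %d\n', config.initial_size);

H5F.close(fid);
```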
per isakson
on 16 Sep 2012
Edited: per isakson
on 16 Sep 2012
Here is a link to hdf-forum. A few MATLAB-related questions have been answered there. I cannot really contribute.