Save using -append behaves differently when replacing objects vs replacing arrays
Hi all,
I'm trying to save some variables to a .mat file, appending to that file if the variable is new, overwriting if it's already there. I've found different behaviour if the variable is an object vs if it's a simple array. Here's a demonstration:
Firstly, arrays
p = rand(1000,3);
save('test.mat','p')
whos '-file' test.mat
d1 = dir('test.mat');
p = rand(10,3);
save('test.mat','p','-append')
whos '-file' test.mat
d2 = dir('test.mat');
fprintf('Saved .mat file:\n1000 pt triangulation: %d bytes\n 10 pt triangulation: %d bytes.\n',...
d1.bytes, d2.bytes)
Results in:
Saved .mat file:
1000 pt triangulation: 22907 bytes
10 pt triangulation: 430 bytes.
Now, objects
p = delaunayTriangulation(rand(1000,3));
save('test.mat','p')
whos '-file' test.mat
d1 = dir('test.mat');
p = delaunayTriangulation(rand(10,3));
save('test.mat','p','-append')
whos '-file' test.mat
d2 = dir('test.mat');
fprintf('Saved .mat file:\n1000 pt triangulation: %d bytes\n 10 pt triangulation: %d bytes.\n',...
d1.bytes, d2.bytes)
Results in:
Saved .mat file:
1000 pt triangulation: 72833 bytes
10 pt triangulation: 73319 bytes.
As you can see, the first -append with arrays did exactly as expected. The large variable p was overwritten by the smaller variable with the same name. The resulting filesize was reduced (as is expected).
The second -append worked differently. Here, it seems that the original large p object stayed in the file, even after it was replaced with a much smaller object with the same name. Overall, the file size actually increased. This is surprising. Is this a bug? A feature with dubious utility? It's an annoyance at the moment. The only workaround I've found so far is to reload the whole contents of the .mat file into memory and re-save it every time I want to replace ANY of the object variables in that file. This would be fine with small files, but I'm dealing with files of a few hundred MB containing a suite of 100 MB objects, so it's really cumbersome to reload everything when I only need to save one variable.
Thanks, Sven.
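For readers following along, the reload-everything workaround described in the question might look roughly like the sketch below. The file name and the variables a and p are hypothetical stand-ins, and plain numeric arrays are used instead of triangulation objects to keep the example small; the pattern itself (load into a struct, replace one field, rewrite the file with -struct) is the same for objects.

```matlab
% Hypothetical file name and variables, for illustration only.
fname = fullfile(tempdir, 'append_demo.mat');

a = rand(1000,3);            % a variable that should be left unchanged
p = rand(1000,3);            % the variable that will be replaced
save(fname, 'a', 'p');       % initial file holding both variables

% Load every variable in the file into the fields of a struct,
% replace the one field of interest, then rewrite the whole file.
S = load(fname);
S.p = rand(10,3);
save(fname, '-struct', 'S'); % no -append: the file is written from scratch

whos('-file', fname)         % list what the rewritten file contains
```

Because the file is rewritten from scratch, no stale copy of the old p is left behind, but the cost is that every variable passes through memory, which is exactly the overhead the question complains about.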
Accepted Answer
Walter Roberson
on 3 Feb 2016
save -append has always been defined to patch out the old data and add the new data to the end of the .mat file without necessarily reducing the file size at all. It is possible that there is an optimization for the case of a single numeric array or the case of a numeric array that happens to be the last thing in the file, but that has never been guaranteed. At no time has save -append been defined as needing to remove the old data and "dropping down" whatever follows to fill the hole. The -append flag is for fast updating of a .mat file, not for space efficiency.
If you need to update variables inside a .mat file, you should consider using a -v7.3 file and accessing it through matfile().
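A minimal sketch of that suggestion, assuming a hypothetical file name and two numeric variables p and q (neither is from the original question): the file is created in -v7.3 format, then a single variable is replaced through a matfile object without loading the others. Whole variables of any class can be assigned this way, while indexed (partial) access is limited to the types matfile supports, such as numeric arrays. Note that nothing here guarantees the file on disk shrinks after the replacement.

```matlab
% Hypothetical file name and variables, for illustration only.
fname = fullfile(tempdir, 'matfile_demo.mat');
if exist(fname, 'file')
    delete(fname);                 % start from a clean file
end

p = rand(1000,3);
q = rand(5,5);
save(fname, 'p', 'q', '-v7.3');    % matfile is most efficient with -v7.3 files

% Open the file for writing without loading its contents.
m = matfile(fname, 'Writable', true);

m.p = rand(10,3);                  % replace the whole variable p; q is not loaded
m.q(1,1) = 0;                      % partial write into q (numeric arrays only)

sz = size(m, 'p');                 % query the stored size without loading p
disp(sz)
```

The design trade-off is the one the answer describes: this route targets fast, selective updates of large files rather than minimal file size.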
2 Comments
Simon Matte
on 16 Sep 2020
I have the same issue.
Bless you for coming back to this and re-summarizing the situation.
I'm glad I'm not the only one to run into this problem and be confused by the misleading documentation...