Code covered by the BSD License  

4.9 | 9 ratings | 51 Downloads (last 30 days) | File Size: 8.65 KB | File ID: #34564

Fast serialize/deserialize

12 Jan 2012

These functions can serialize most MATLAB data structures into a byte vector and vice versa.


File Information
Description

This is an optimized rewrite of Tim Hutt's Serialize/Deserialize functions (it is up to 10x faster on arcane data structures) and supports a few additional data types.

Known limitations:
* Java objects cannot be serialized
* Arrays with more than 255 dimensions have their last dimensions clamped
* Handles to nested/scoped functions can only be deserialized when their parent functions
  support the BCILAB argument reporting protocol (e.g., by using arg_define).
* New MATLAB objects need to be reasonably friendly to serialization; either they support
  construction from a struct, or they support loadobj(struct), or all their important properties
  can be set via set(obj,'name',value)

It has been tested relatively extensively but if you catch a bug, let me know!
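
As a minimal sketch of the round trip described above, assuming hlp_serialize.m and hlp_deserialize.m from this submission are on the MATLAB path (the example data is made up for illustration):

```matlab
% Round-trip sketch; assumes hlp_serialize.m and hlp_deserialize.m are on the path.
data = struct('name', 'trial_01', ...            % hypothetical example data
              'samples', rand(3, 4), ...
              'mixed', {{1, 'two', [3 4 5]}});   % double braces: one cell-valued field

bytes    = hlp_serialize(data);      % data structure -> byte vector
restored = hlp_deserialize(bytes);   % byte vector -> data structure

assert(isequal(data, restored));     % the round trip should preserve the contents
```

Comparing the input and output with isequal after a round trip is a cheap way to check that a particular data structure falls within the supported types.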

Acknowledgements

Serialize/Deserialize inspired this file.

Required Products MATLAB
MATLAB release MATLAB 7.11 (R2010b)
Comments and Ratings (11)
08 May 2014 Ralph Coleman

In deserialize_object, you should distinguish between graphical handles and handles to objects defined with classdef. I modified the code in the following way to get it to work:
    if ishandle(v)
        set(v,fn{1},conts.(fn{1}));
    else
        v.(fn{1}) = conts.(fn{1});
    end

20 Mar 2014 Jan

Unfortunately, I cannot save my save file. Using '-v7.3', it takes about one minute to save (the file size is about 300MB). When I try to serialize the data, after lots of warnings "Calling STRUCT on an object ..." I get an OUT OF MEMORY error.

My Save object consists of several (handle) objects and is, as I said, about 300MB in size when saved with '-v7.3'.

Is there anything I can do? (My computer runs Win7 32-bit with 3GB of RAM.)

Thanks in advance!

Jan

01 Mar 2014 Andrew Champion

Great tool, but there is a bug I encountered that may also be the underlying cause of Mahdi's problems. In deserialization of horizontal string cell arrays, hlp_deserialize.m:246, the string splitting loop should be

for 1:length(splits)-1

not how it currently is written,

for 1:length(lengths)

As it's currently written it will not properly deserialize multidimensional horizontal string cell arrays.

09 Feb 2014 Chethan Pandarinath

03 Dec 2013 Joao Hespanha

Great tools. It can really improve the saving speed.

I made some changes to the code posted here to make it compatible with the new table and categorical classes. I've not tested it enough to post it, but please let me know if someone wants to try it.

26 Oct 2013 Mahdi Kefayati

First off, this code is amazing! It helped serialize a cell array that would otherwise take ~26GB into merely ~1GB, which subsequently saved me a lot of time saving and loading the file.

I was wondering if it is possible to improve the performance of this function by using parallelism.

Also, it seems that I am facing a bug. What I serialize is a cell array similar to what is generated by textscan when similar field grouping is enabled:
offers =

[15445674x1 double] {15445674x3 cell} [15445674x81 double] {15445674x1 cell} [15445674x6 double]

When I serialize and then deserialize this cell array, strings in some, but not all, of the columns of one of the cell arrays disappear. That is:
>> offers{2}(1:5,:)

ans =

'N' 'BUCHAN_BUCHANG2' 'HYDRO'
'N' 'MGSES_CT1' 'SCLE90'
'N' 'GRSES_UNIT2' 'GSREH'
'N' 'OKLA_OKLA_G1_J03' 'CLLIG'
'N' 'DOWGEN_DOW_G37' 'SCLE90'

However:
>> offers_d=hlp_deserialize(hlp_serialize(offers));
>> offers_d{2}(1:5,:)

ans =

'N' [] []
'N' [] []
'N' [] []
'N' [] []
'N' [] []

Any thoughts on what is the culprit?

28 Aug 2013 Emmanuel Farhi

Absolutely incredible gain in read/write speed for MAT files. However, these MAT files cannot be given to other users/applications without the deserialize tool.

16 Jul 2013 James

21 Apr 2013 Yair Altman

This is an extremely important utility. Too bad it's not part of the standard Matlab language.

It is enormously useful for data persistence (saving) performance, especially for class objects that would otherwise require the slow save('-v7') or the even slower save('-v7.3') - using this utility, we can easily save using the much faster save('-v6'). On my system, the run-time and file-size of saving non-numeric data were reduced by 1-3 orders of magnitude! The interesting part is that the performance and file-size improvements were across-the-board, using any supported option for saving the data (save -v6/-v7/-v7.3 or the savefast utility on FEX).

This utility could be improved for saving Java objects, by checking (using a simple try-catch) whether the object implements a writeObject() method into a ByteArrayOutputStream. See http://docs.oracle.com/javase/6/docs/api/java/io/Serializable.html for more details.
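
The save workflow described in this comment can be sketched roughly as follows. This is an illustration only: it assumes hlp_serialize.m and hlp_deserialize.m are on the path, and the data and the temporary file name are made up for the example.

```matlab
% Sketch only; assumes hlp_serialize/hlp_deserialize are on the MATLAB path.
data  = struct('id', 7, 'notes', {{'first', 42}});   % hypothetical example data
fname = [tempname '.mat'];                            % temporary file for the demo

bytes = hlp_serialize(data);          % flatten the data to a byte vector first
save(fname, '-v6', 'bytes');          % -v6 writes uncompressed, which is fast for a plain byte vector

loaded   = load(fname);               % load returns a struct with the field 'bytes'
restored = hlp_deserialize(loaded.bytes);
delete(fname);                        % clean up the temporary file

assert(isequal(data, restored));
```

Because only a plain byte vector reaches save, the file no longer contains the original variables directly; whoever reads it needs hlp_deserialize to recover them.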

24 May 2012 Moritz

Extremely nice, exactly what I needed, except: Being compatible with objects having no no-args constructor.

Anyway, a little change helped, which might be useful to implement in general:

In hlp_deserialize.m, in the func deserialize_object(m, pos):
replace the construction v = feval(cls) by:
    constr = str2func(cls);
    args = cell(1, nargin(constr));
    v = constr(args{:});

This calls the constructor with empty arguments, exactly as many as it needs. Sure, the constructor can still crash, but this would happen anyway if there is no no-args constructor. Many constructors only set member variables, and since these are overwritten in the following lines anyway, we just set them to empty...

25 Jan 2012 Moti Zilberman

Very nice, useful and clean couple of functions - the OOP support in particular has been a lifesaver.

You might want to incorporate support for saveobj to complement the loadobj support. What I did, which seems to work, consists of two changes:
* In deserialize_object, call deserialize_value instead of deserialize_struct.
* In serialize_object, first try to do hlp_serialize(saveobj(v)), and if that fails fall back (via try..catch) to the current behavior of serialize_struct.

Then any object which implements the saveobj/loadobj duo following the MATLAB-established API should work with these functions just as well.

Two miscellaneous comments:
1. In hlp_deserialize.m, lines 378 and 380 there is a use of "disp" and "disp_once" where I think warning and warn_once are meant - the code unfortunately doesn't run as downloaded due to this mixup.
2. What is warn_once, and why all the duplication and try..catch blocks? IMHO, if it's necessary, just include it with your code (as a subfunction even), or alternatively just use plain old warning, because the failed invocations + exception handling code carry a slight computational cost.

Updates
26 Jan 2012

Included the improvements suggested by Moti Zilberman. Also corrected a bug involving sparse scalars (which could not be serialized before under some circumstances).
