Code covered by the BSD License  


5.0 | 20 ratings | 140 Downloads (last 30 days) | File Size: 6.33 KB | File ID: #31272 | Version: 1.4



Jan Simon


01 May 2011 (Updated )

MD5 or SHA hash for array, struct, cell or file


File Information

DATAHASH - Checksum for Matlab array, struct, cell or file
Hash = DataHash(Data, Opt)
  Data: Array of built-in types (U)INT8/16/32/64, SINGLE, DOUBLE (real or complex)
        CHAR, LOGICAL, CELL, STRUCT (scalar or array, nested), function_handle.
  Opt: Options struct:
        Opt.Method: 'SHA-1', 'SHA-256', 'SHA-384', 'SHA-512', 'MD2', 'MD5'.
        Opt.Format: 'hex', 'HEX', 'double', 'uint8', 'base64'.
        Opt.Input:
            'file': Data is a file name.
            'bin': Only the contents of Data are considered.
                    Data must be numerical or a CHAR.
            'array': Default, contents and type of Data are considered.
                    Nested CELLs and STRUCTs possible.
  Hash: String or numeric vector.
Default: MD5, hex:
  DataHash([]) % 240f5f01f052bd89f38da2165dcf25c7
SHA-1, Base64:
  S.a = uint8([]);
  S.b = {{1:10}, struct('q', uint64(415))};
  Opt.Format = 'base64';
  Opt.Method = 'SHA-1';
  DataHash(S, Opt) % ZMe4eUAp0G9TDrvSW0/Qc0gQ9/A

This function uses James Tursa's smart and fast TYPECASTX, if installed.
For Matlab 6.5, installing TYPECASTX is obligatory to run DataHash.

Michael Kleder's "Compute Hash" works very similarly, but does not accept structs, cells or files.
"CalcMD5" uses a faster C-Mex to create only the MD5 sum for an array or file.

Tested: Matlab 6.5, 7.7, 7.8, 7.13, WinXP, Win7/64, Java: 1.3.1_01, 1.6.0_04.

UPDATED 30-Mar-2015: The hash for "array" mode now also includes the number of dimensions. Earlier versions returned the same hash for zeros(1,1) and zeros(1,1,0). Therefore the new hash values differ from older versions in "array" mode.

Bug reports and enhancement requests are welcome.


Typecast And Typecastx C Mex Functions, Calc Md5, and Serialize/Deserialize inspired this file.

This file inspired Cache Results, Cachedcall, and Lynx Matlab Toolbox.

MATLAB release MATLAB 7.13 (R2011b)
Other requirements James Tursa's TYPECASTX is recommended. For Matlab 6.5 it is obligatory.
Comments and Ratings (38)
15 Oct 2015 Jan Simon


@Uilke: Wow, this is a bug in the documentation. 240f... is the expected output. The results have been changed in the version submitted on 30-Mar-2015, but I forgot to update the help text accordingly.

Comment only
15 Oct 2015 Uilke Stelwagen

Tried datahash([]) on Win7 64-bit with ML 7.13 (R2011b) and 8.6 (R2015b); both return 240f5f01f052bd89f38da2165dcf25c7 instead of the 7de5637fd217d0e44e0082f4d79b3e73 mentioned in the help.
Any ideas why?

Comment only
18 Sep 2015 Matt Raum

@Matt Raum: I found a workaround using the undocumented serialization function getByteStreamFromArray


gets me what I need.

Comment only
18 Sep 2015 Matt Raum

Jan -- Great work on this, very handy! I'd like to be able to use this to hash an instance of a custom class but I'm getting the error below:

Warning: Type of variable not considered: MyClass
> In DataHash>CoreHash_ at 336
In DataHash at 253

Can you help?

18 Sep 2015 Matt Raum  
13 Sep 2015 Jan Simon


@Ivan Cordon: The effect is explained in the NOTES of the help section and was discussed on 26 May 2012:
To get the same result as in the online generator, only the binary contents of the data must be considered, not the Matlab class, and only the 8-bit ASCII part of the 16-bit Matlab CHAR:
Opt.Method='SHA-256'; Opt.Input='bin';
DataHash(uint8('ivan'), Opt)
Then you get cd0b9452... also.

Or a little bit shorter with the newest version:
Opt.Method='SHA-256'; Opt.Input='ascii';
DataHash('ivan', Opt)

Comment only
05 Sep 2015 Ivan Cordon

Dear Jan,

The program works for me, but I seem to be getting a wrong result. I am passing the Opt struct with these parameters:

Method: 'SHA-256'

and calling the function as follows:

Hash = DataHash('ivan', Opt)

The result I am getting is:


whereas the different online generators give me this:


Do you know why I am getting this difference?


13 Aug 2015 Jan Simon


@Haitham: Do you want to obtain the clear text from the hash value? This is possible only if the input has no more bits than the hash value. For longer messages the result is not unique. And even for short messages, hashing algorithms are not designed for a reverse analysis. It is possible, e.g. by a brute force attack, but this problem is not covered by my submission.

Comment only
01 Aug 2015 Haitham

Dear simon , the code is working great , i would like to know how i can recompute the hash value?
The data hash generated 558f68181d2b0c9d57d41ce7aa36b71d9
i would like to get the original value before hashing it.
thanks a lot...

Comment only
17 Jun 2015 Jan Simon


@Haitham: Look at the examples for a valid method to call this function. The error message tells you that you did not provide input arguments, or too many of them. Check the command or contact me under the address given in the help section (THISYEAR is "2010", sorry, a typo).

Comment only
17 Jun 2015 Haitham

Dear jan
thanks for the file , but i am getting error while running the code it self.

Error using DataHash>Error_L (line 466)
*** DataHash: 1 or 2 inputs required.

Error in DataHash (line 177)
Error_L('BadNInput', '1 or 2 inputs required.');

Comment only
12 Jun 2015 Alberto Gutierrez  
13 Apr 2015 Daniel Golden

Great submission, Jan. Consider putting the code on GitHub so that users can more easily comment and contribute.

16 Mar 2015 Jan Simon


Unfortunately there are some collisions (same hash for different data). E.g. the scalar 0 and the empty double array [1 x 1 x 0] get the same hash. I'm going to publish a new version which fixes these bugs, but produces different hashes as a consequence. In addition, sparse arrays are considered.

The submission CalcMD5 is updated as well, such that (nested) cells and structs are also considered. Because the overhead of calling Java is omitted, it is dramatically faster than DataHash: I get a speedup factor > 100 for nested structs.

Comment only
12 Mar 2015 Aslak Grinsted


11 Nov 2014 Noam Greenboim

Well written, good running time

13 Sep 2014 Giorgio

Well written!

04 Jul 2014 José Crespo Barrios  
07 May 2014 Andreas Bonelli  
22 Jan 2014 Christos Mitsakos  
16 Jan 2014 Clemens  
09 Jul 2013 Sergio Zlotnik  
13 Feb 2013 Nath


A function handle referring to a struct containing the function will create an infinite loop. Is there any workaround?


d= dynamicprops();
d.f= @(varargin) struct2cell(d);
DataHash(d.f) % infinite loop

Comment only
09 Feb 2013 Arindam Bose


Hi, does anyone have any idea how to decrypt an MD5 code?

01 Nov 2012 Igor


31 Oct 2012 David


Hi, I'm just wondering if this function works for p-files? Lines 144-147 tell the function to stop if the file is not an m-file. When I comment these lines out, it seems to provide the correct hash for the p-file.

02 Aug 2012 Jan Simon


Thanks, Jan, a good idea. I will include this in the next version.

Comment only
24 Jul 2012 Jan Achterhold

Great function! Just as a little improvement, I added support for Java and MATLAB objects by calling their hashCode function. Just insert

elseif (isobject(Data) || isjava(Data)) && ...
      ismethod(Data, 'hashCode')
   Engine = CoreHash(Data.hashCode, Engine);

into the CoreHash() and CoreHash_() functions (in the main if branches).

Regards, Jan

26 May 2012 Jan Simon


The examples use 8-bit CHAR strings, while Matlab uses 16 bits per CHAR. In addition DataHash includes information about the type and dimension of the input, as described in the help text. Because Mido, for example, asked last September for a version that considers only the actual data, I included this feature last year. Note the conversion to 8 bits:
Opt.Input = 'bin'; Opt.Method = 'MD5';
DataHash(uint8('The quick brown fox jumps over the lazy dog.'), Opt)
% >> e4d909c290d0fb1ca068ffaddf22cbd0
as expected.
But DataHash fails for the empty string and binary input! Thanks for finding this bug, it will be fixed soon.
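
The 8-bit versus 16-bit point can be cross-checked without DataHash. The following is only a sketch, assuming a Matlab session with Java available; it calls java.security.MessageDigest directly and is not the code inside DataHash. The variable names are made up for illustration.

```matlab
% Sketch: MD5 of the 8-bit bytes of a string via Java, without DataHash.
msg    = 'The quick brown fox jumps over the lazy dog.';
engine = java.security.MessageDigest.getInstance('MD5');
engine.update(uint8(msg));                      % one byte per character
digest = typecast(engine.digest(), 'uint8');    % Java returns signed int8
hash8  = sprintf('%02x', double(digest));       % lower-case hex string
disp(hash8)   % e4d909c290d0fb1ca068ffaddf22cbd0, the value quoted above

% The same text stored as 16-bit CHAR data occupies twice as many bytes,
% so it is a different byte sequence and hashes to a different value.
bytes16 = typecast(uint16(msg), 'uint8');
engine.reset();
engine.update(bytes16);
hash16  = sprintf('%02x', double(typecast(engine.digest(), 'uint8')));
fprintf('%d bytes vs. %d bytes\n', numel(uint8(msg)), numel(bytes16));
```

Comparing hash8 with a public test vector is a quick way to confirm that a mismatch comes from the input encoding and not from the hashing engine.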

Comment only
23 May 2012 Kotya Karapetyan

It would be nice to be able to check the returned values against some public results. For example Wikipedia mentions MD5("") = d41d8cd98f00b204e9800998ecf8427e and MD5("The quick brown fox jumps over the lazy dog.")
= e4d909c290d0fb1ca068ffaddf22cbd0, and I don't know how to get the same from DataHash to check it.

Comment only
26 Apr 2012 Jan Simon


@Oyvind: I have also posted an MD5 calculator as a MEX function, but it does not allow the direct accumulation of the hash for nested cells or structs. You could XOR the partial hashes, but this will not be trivial.
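
To make the XOR idea concrete, here is a minimal sketch that hashes each part separately and combines the digests with bitxor. It uses Java's MessageDigest only as a stand-in for the MEX function, and the variable "parts" is a hypothetical example input, not something from the submission.

```matlab
% Sketch: combine per-element MD5 digests by XOR (illustration only).
parts = {uint8('alpha'), uint8('beta'), uint8('gamma')};   % hypothetical data
total = zeros(16, 1, 'uint8');                  % an MD5 digest has 16 bytes
for k = 1:numel(parts)
    engine = java.security.MessageDigest.getInstance('MD5');
    engine.update(parts{k});
    partial = typecast(engine.digest(), 'uint8');   % 16 x 1 uint8
    total   = bitxor(total, partial);               % accumulate by XOR
end
combined = sprintf('%02x', double(total));
disp(combined)
```

One reason this is not trivial: XOR is commutative, so the combined value does not depend on the order of the parts, and two identical parts cancel each other. A real implementation would need to mix in position or structure information.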

Comment only
24 Apr 2012 Oyvind


Is it possible to run DataHash without Java? (I am stuck with the limitation of Excel 2003 using a compiled version of DataHash that requires the JVM.)

Comment only
19 Sep 2011 Jan Simon


@Mido: Now you can use the Opt.Input='bin' method to create the hash for the raw data.

Comment only
07 Sep 2011 Jan Simon


@Mido: DataHash considers the class and dimensions of the inputs, otherwise UINT8([0,0]) and UINT16(0) would have the same hash.
I'm going to add an option so that only the contents of the input are considered. But even then the MD5 of 'I am not happy' will not be '59b4...', because Matlab uses 16 bits for a CHAR. What you are looking for is the hash of UINT8('I am not happy').
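
The UINT8([0,0]) versus UINT16(0) case can be shown in a few lines. This is a sketch under an assumption: the "header" built from class name and dimensions is a hypothetical layout chosen for illustration, not the byte format DataHash actually uses.

```matlab
% Sketch: two different variables with identical raw bytes.
a = uint8([0, 0]);
b = uint16(0);
rawA = typecast(a, 'uint8');        % two zero bytes
rawB = typecast(b, 'uint8');        % two zero bytes as well
sameBytes = isequal(rawA, rawB);    % true: contents alone cannot tell them apart

% Hypothetical header (class name + dimensions), NOT DataHash's real layout.
headA = [uint8(class(a)), typecast(uint64(size(a)), 'uint8')];
headB = [uint8(class(b)), typecast(uint64(size(b)), 'uint8')];
sameWithHeader = isequal([headA, rawA], [headB, rawB]);   % false
```

Feeding the header plus the raw bytes to the hash engine gives the two variables different inputs, which is the effect the class and dimension information is meant to achieve.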
You can find a fast Md5 tool here:

Comment only
04 Sep 2011 Mido Mido

It does not work with me!!
When i tried:

Opt.Format = 'HEX';
Opt.Method = 'MD5';
DataHash('I am not happy', Opt)

it gives me : "7C23124A8F69D72A65C0E86A4B9075CF"
although the correct is:

The same also for SHA-1!!

Could any one help?

Comment only
20 Jul 2011 Martin


09 Jun 2011 Aaron


Extremely well made function.
Very easy to use.
Works correctly.

03 May 2011 Francois Rongère

Very useful for my simulations parameters sets !

Thank you Jan,



12 Sep 2011 1.2

Binary mode added to consider only the contents of the data.

27 Jun 2012 1.3

Accept empty input for binary mode now.

30 Mar 2015 1.4

Fixed bugs: Strings and empty array for "binary" mode. For "array" mode [] and zeros(1,1,0) had the same hash before.

15 Oct 2015 1.4

In the version of 30-Mar-2015 the results were changed, but the help section was not adjusted. This is fixed now.
Struct arrays are processed much faster now and the checksum differs from earlier versions.
