Code covered by the BSD License  


Rating: 5.0 (13 ratings) | 99 downloads (last 30 days) | File size: 5.66 KB | File ID: #31272

DataHash

by

 

01 May 2011 (Updated )

MD5 or SHA hash for array, struct, cell or file


File Information
Description

DATAHASH - Checksum for Matlab array, struct, cell or file

Hash = DataHash(Data, Opt)
  Data: Array of built-in types (U)INT8/16/32/64, SINGLE, DOUBLE (real or complex)
        CHAR, LOGICAL, CELL, STRUCT (scalar or array, nested), function_handle.
      
  Opt: Options struct:
        Opt.Method: 'SHA-1', 'SHA-256', 'SHA-384', 'SHA-512', 'MD2', 'MD5'.
        Opt.Format: 'hex', 'HEX', 'double', 'uint8', 'base64'.
        Opt.Input:
            'file': Data is a file name.
            'bin': Only the contents of Data are considered.
                    Data must be numerical or a CHAR.
            'array': Default, contents and type of Data are considered.
                    Nested CELLs and STRUCTs possible.

  Hash: String or numeric vector.

EXAMPLES:
Default: MD5, hex:
  DataHash([]) % 7de5637fd217d0e44e0082f4d79b3e73
SHA-1, Base64:
  S.a = uint8([]);
  S.b = {{1:10}, struct('q', uint64(415))};
  Opt.Format = 'base64';
  Opt.Method = 'SHA-1';
  DataHash(S, Opt) % ZMe4eUAp0G9TDrvSW0/Qc0gQ9/A
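
The Opt.Format values listed above can be illustrated outside Matlab. The following Python sketch renders one raw digest in each of the named formats. The helper name format_digest is hypothetical, and the stripping of Base64 padding is an assumption inferred from the 27-character SHA-1 example above. The digests themselves will not match DataHash in its default 'array' mode, which also hashes the type and dimensions of the input.

```python
import base64
import hashlib


def format_digest(digest: bytes, fmt: str):
    """Render raw digest bytes in one of the formats named in the help text.

    Hypothetical helper for illustration; not part of DataHash.
    """
    if fmt == "hex":
        return digest.hex()
    if fmt == "HEX":
        return digest.hex().upper()
    if fmt == "uint8":
        return list(digest)
    if fmt == "double":
        return [float(b) for b in digest]
    if fmt == "base64":
        # Assumption: trailing '=' padding is dropped, as the example suggests.
        return base64.b64encode(digest).decode("ascii").rstrip("=")
    raise ValueError(f"unknown format: {fmt}")


digest = hashlib.md5(b"").digest()
print(format_digest(digest, "hex"))  # d41d8cd98f00b204e9800998ecf8427e
print(format_digest(digest, "HEX"))  # D41D8CD98F00B204E9800998ECF8427E
```

A SHA-1 digest has 20 bytes, so its unpadded Base64 form has 27 characters, which is consistent with the length of the example string above.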

This function uses James Tursa's smart and fast TYPECASTX, if installed:
  http://www.mathworks.com/matlabcentral/fileexchange/17476
For Matlab 6.5 installing TYPECASTX is obligatory to run DataHash.

Michael Kleder's "Compute Hash" works very similarly, but does not accept structs, cells or files:
  http://www.mathworks.com/matlabcentral/fileexchange/8944
"CalcMD5" uses a faster C-Mex to create only the MD5 sum for an array or file:
  http://www.mathworks.com/matlabcentral/fileexchange/25921

Tested: Matlab 6.5, 7.7, 7.8, 7.13, WinXP, Win7/64, Java: 1.3.1_01, 1.6.0_04.

Bug reports and enhancement requests are welcome.

Acknowledgements

Typecast And Typecastx C Mex Functions, Calc Md5, and Serialize/Deserialize inspired this file.

This file inspired Cache Results.

MATLAB release: MATLAB 7.8 (R2009a)
Other requirements: James Tursa's TYPECASTX is recommended. For Matlab 6.5 it is obligatory.
Comments and Ratings (22)
13 Sep 2014 Giorgio

Well written!

04 Jul 2014 José Crespo Barrios  
07 May 2014 Andreas Bonelli  
22 Jan 2014 Christos Mitsakos  
16 Jan 2014 Clemens  
09 Jul 2013 Sergio Zlotnik  
13 Feb 2013 Nath

A function handle referring to a struct that contains the function itself creates an infinite loop. Is there any workaround?

Example:

d= dynamicprops();
addprop(d,'f');
d.f= @(varargin) struct2cell(d);
DataHash(d.f) % infinite loop

09 Feb 2013 Arindam Bose

Hi, does anyone have any idea how to decrypt an MD5 code?

01 Nov 2012 Igor  
31 Oct 2012 David

Hi, I'm just wondering if this function works for p-files? Lines 144-147 tell the function to stop if the file is not an m-file. When I comment these lines out, it seems to provide the correct hash for the p-file.

02 Aug 2012 Jan Simon

Thanks, Jan, a good idea. I will include this in the next version.

24 Jul 2012 Jan Achterhold

Great function! Just as a little improvement, I added support for Java and MATLAB objects by calling their hashCode function. Just insert

elseif (isobject(Data) || isjava(Data)) ...
       && ismethod(Data, 'hashCode')
   Engine = CoreHash(Data.hashCode, Engine);

into the CoreHash() and CoreHash_() functions (in the main if branches).

Regards, Jan

26 May 2012 Jan Simon

The examples use 8-bit CHAR strings, while Matlab uses 16 bits per CHAR. In addition, DataHash includes information about the type and dimension of the input, as described in the help text. Because Mido, for example, asked last September for a version that considers only the actual data, I included this feature last year. Note the conversion to 8 bits:
Opt.Input = 'bin'; Opt.Method = 'MD5';
DataHash(uint8('The quick brown fox jumps over the lazy dog.'), Opt)
% >> e4d909c290d0fb1ca068ffaddf22cbd0
as expected.
But DataHash fails for the empty string and binary input! Thanks for finding this bug, it will be fixed soon.
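
The effect of the character width can be reproduced outside Matlab. The Python sketch below hashes the same sentence once with one byte per character and once with two bytes per character. The little-endian byte order for the 16-bit case is an assumption, and the sketch only covers the raw-bytes ('bin') case; it does not reproduce the extra type and dimension information that DataHash adds in 'array' mode.

```python
import hashlib

text = "The quick brown fox jumps over the lazy dog."

# 8-bit representation: one byte per character, like uint8(text) in Matlab.
md5_8bit = hashlib.md5(text.encode("ascii")).hexdigest()

# 16-bit representation: two bytes per character (little-endian assumed).
md5_16bit = hashlib.md5(text.encode("utf-16-le")).hexdigest()

print(md5_8bit)                      # e4d909c290d0fb1ca068ffaddf22cbd0
print(md5_8bit == md5_16bit)         # False
print(hashlib.md5(b"").hexdigest())  # d41d8cd98f00b204e9800998ecf8427e
```

Only the 8-bit variant matches the published reference values, which is why the conversion with uint8() is needed before comparing against them.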

23 May 2012 Kotya Karapetyan

It would be nice to be able to check the returned values against some public results. For example, Wikipedia mentions MD5("") = d41d8cd98f00b204e9800998ecf8427e and MD5("The quick brown fox jumps over the lazy dog.") = e4d909c290d0fb1ca068ffaddf22cbd0, and I don't know how to get the same from DataHash to check it.

26 Apr 2012 Jan Simon

@Oyvind: I have also posted an MD5 calculator as a MEX function. It does not allow direct accumulation of the hash for nested cells or structs, but you could XOR the partial hashes. This will not be trivial, though.
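
One possible reading of the XOR idea, sketched in Python rather than Matlab: hash each part separately and combine the 16-byte MD5 digests bytewise. The function names are hypothetical, and this is not how DataHash accumulates its hash. It also shows why the approach is not trivial: plain XOR ignores the order of the parts, and two identical parts cancel each other out.

```python
import hashlib
from functools import reduce


def md5_bytes(part: bytes) -> bytes:
    """Raw 16-byte MD5 digest of one part."""
    return hashlib.md5(part).digest()


def xor_digests(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two digests of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


def combined_hash(parts) -> str:
    """XOR the partial digests of all parts, starting from 16 zero bytes."""
    digests = (md5_bytes(p) for p in parts)
    return reduce(xor_digests, digests, bytes(16)).hex()


print(combined_hash([b"a", b"b"]) == combined_hash([b"b", b"a"]))  # True
print(combined_hash([b"a", b"a"]))  # 00000000000000000000000000000000
```

A usable scheme would have to mix position or field-name information into each part before hashing, so that reordered or repeated elements produce different results.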

24 Apr 2012 Oyvind

Hi,
Is it possible to run DataHash without Java? (I am stuck with the limitations of Excel 2003, using a compiled version of DataHash that requires the JVM.)
Oyvind

19 Sep 2011 Jan Simon

@Mido: Now you can use the Opt.Input='bin' method to create the hash for the raw data.

07 Sep 2011 Jan Simon

@Mido: DataHash considers the class and dimensions of the inputs, otherwise UINT8([0,0]) and UINT16(0) would have the same hash.
I'm going to add an option so that only the contents of the input are considered. But even then the MD5 of 'I am not happy' will not be '59b4...', because Matlab uses 16 bits for a CHAR; what you are looking for is the hash of UINT8('I am not happy').
You can find a fast Md5 tool here: http://www.mathworks.com/matlabcentral/fileexchange/25921-calcmd5
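
The point about class and dimensions can be made concrete with a small Python sketch. UINT8([0,0]) and UINT16(0) both occupy two zero bytes, so a hash of the raw bytes cannot tell them apart. The typed_hash helper below is hypothetical, and its layout (class name, then dimensions as 64-bit integers, then data) is an assumption chosen for illustration, not the encoding DataHash actually uses.

```python
import hashlib
import struct


def raw_hash(data: bytes) -> str:
    """MD5 of the data bytes only."""
    return hashlib.md5(data).hexdigest()


def typed_hash(class_name: str, dims, data: bytes) -> str:
    """MD5 over class name, dimensions and data (hypothetical layout)."""
    h = hashlib.md5()
    h.update(class_name.encode("ascii"))
    h.update(struct.pack(f"<{len(dims)}q", *dims))  # dims as int64 values
    h.update(data)
    return h.hexdigest()


two_uint8_zeros = bytes([0, 0])         # like uint8([0, 0]), size 1x2
one_uint16_zero = struct.pack("<H", 0)  # like uint16(0), size 1x1

print(raw_hash(two_uint8_zeros) == raw_hash(one_uint16_zero))  # True
print(typed_hash("uint8", (1, 2), two_uint8_zeros)
      == typed_hash("uint16", (1, 1), one_uint16_zero))        # False
```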

04 Sep 2011 Mido Mido

It does not work for me!!
When I tried:

Opt.Format = 'HEX';
Opt.Method = 'MD5';
DataHash('I am not happy', Opt)

it gives me: "7C23124A8F69D72A65C0E86A4B9075CF"
although the correct value is:
"59b469ea3ffbe72cf4983facf13cbe1f"

The same also for SHA-1!!

Could anyone help?

20 Jul 2011 Martin  
09 Jun 2011 Aaron

Extremely well made function.
Very easy to use.
Works correctly.

03 May 2011 Francois Rongère

Very useful for my simulation parameter sets!

Thank you Jan,

Regards,

François.

Updates
12 Sep 2011

Binary mode added to consider only the contents of the data.

27 Jun 2012

Accept empty input for binary mode now.
