Thread Subject:
machineformat and 'long' precision (32bit vs 64bit)

Subject: machineformat and 'long' precision (32bit vs 64bit)

From: Benjamin Kraus

Date: 10 Aug, 2011 19:20:26

Message: 1 of 6

I'm trying to write a native MATLAB script that will be used to read data out of a proprietary format data file. My goal is to make it work well across as many different platforms/OSs as possible. To this end, I've been looking into the documentation regarding the 'machineformat' input argument to 'fopen', 'fread', and 'fwrite'.

The five machine formats available are:
'n' or 'native' - The byte ordering that your system uses (default)
'b' or 'ieee-be' - Big-endian ordering
'l' or 'ieee-le' - Little-endian ordering
's' or 'ieee-be.l64' - Big-endian ordering, 64-bit data type
'a' or 'ieee-le.l64' - Little-endian ordering, 64-bit data type

I'm familiar with the difference between big-endian and little-endian, and in my testing I've been able to switch between the two and demonstrate a difference when reading a test data file. However, I haven't been able to detect any difference in behavior between the 32-bit and 64-bit machine formats. I want to make sure that this script works properly on both machine types, so I'm wondering if anyone knows how changing from a 32-bit machine format to a 64-bit machine format affects the 'fread' function.
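For reference, the byte-order difference described above can be reproduced with a few lines. This is only a minimal sketch: the four byte values and the temporary file are made up for illustration and are not the test file used in this thread.

```matlab
% Write four known bytes to a temporary file (illustrative data only).
fname = tempname;
fid = fopen(fname, 'w');
fwrite(fid, [1 35 69 103], 'uint8');            % bytes 0x01 0x23 0x45 0x67
fclose(fid);

% Read the same four bytes as one uint32 under each byte ordering.
fid = fopen(fname, 'r');
be = fread(fid, 1, '*uint32', 0, 'ieee-be');    % interpreted as 0x01234567
frewind(fid);
le = fread(fid, 1, '*uint32', 0, 'ieee-le');    % interpreted as 0x67452301
fclose(fid);
delete(fname);

fprintf('big-endian:    %s\n', dec2hex(double(be), 8));
fprintf('little-endian: %s\n', dec2hex(double(le), 8));
```

The machine format is passed here as the fifth argument to 'fread'; passing it to 'fopen' instead should give the same result for every later read on that file.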

I've tested the behavior of 'fread' on three machines: one running Ubuntu 10.10 (x64) with native machine format 'ieee-le.l64', one running Windows 7 (x64) with native machine format 'ieee-le', and one running Mac OS (x64) with native machine format 'ieee-le.l64'. I have been unable to detect any difference in behavior among the three test machines, either with the native format or when explicitly specifying a 32-bit or 64-bit format. Also, I'm not sure why Windows 7 (x64) uses the 32-bit format by default.

My only guess regarding how a 32-bit vs. 64-bit machine format might affect file reading involves the 'long' and 'ulong' data types. According to the documentation, their precision is system dependent: both are 32 bits on 32-bit machines and 64 bits on 64-bit machines. However, from my testing (on the same three machines), this doesn't appear to be the case. 'long' and 'ulong' both always appear to be 32 bits, regardless of the actual OS or the machine format specified.

All testing was done on MATLAB R2010b.

Does anyone know what I'm missing? I can include my test script if people are interested, but it is just a bunch of 'fread' calls specifying different machine formats.

- Ben

Subject: machineformat and 'long' precision (32bit vs 64bit)

From: dpb

Date: 10 Aug, 2011 19:35:38

Message: 2 of 6

On 8/10/2011 2:20 PM, Benjamin Kraus wrote:
> I'm trying to write a native MATLAB script that will be used to read
> data out of a proprietary format data file. My goal is to make it work
> well across as many different platforms/OSs as possible. To this end,
> I've been looking into the documentation regarding the 'machineformat'
> input argument to 'fopen', 'fread', and 'fwrite'.
>
> The five machine formats available are:
> 'n' or 'native' - The byte ordering that your system uses (default)
> 'b' or 'ieee-be' - Big-endian ordering
> 'l' or 'ieee-le' - Little-endian ordering
> 's' or 'ieee-be.l64' - Big-endian ordering, 64-bit data type
> 'a' or 'ieee-le.l64' - Little-endian ordering, 64-bit data type
>
> I'm familiar with the difference between big-endian and little-endian,
> and based on my testing I've been able to switch between these two types
> and demonstrate a difference when reading a test data file. However, I
> haven't been able to detect any difference in behavior between the
> 32-bit machine formats and 64-bit machine formats. I want to make sure
> that this script works properly on both machine types, so I'm wondering
> if anyone knows how changing from a 32-bit machine type to a 64-bit
> machine type affects the 'fread' function.
>
> I've tested the behavior of 'fread' on three machines. One running Ubuntu
> 10.10 (x64), with the native machine format 'ieee-le.l64', one running
> Windows 7 (x64) with native machine format 'ieee-le', and the last
> running Mac OS (x64) with native machine format 'ieee-le.l64'. I have
> been unable to detect any difference in behavior between all three test
> machines, both using the native format and specifying 32bit vs. 64bit.
> Also, I'm not sure why Windows 7 (x64) is using the 32 bit format by
> default.
>
> My only guess regarding how a 32bit vs. 64bit machine format may affect
> file reading is with the 'long' and 'ulong' data types. According to the
> documentation the precision is system dependent: they are both 32-bits
> on 32-bit machines, and 64-bits on 64-bit machines. However, from my
> testing (on the same three machines), this doesn't appear to be the
> case. 'long' and 'ulong' both always appear to be 32-bit, regardless of
> the actual OS or the machine format specified.
>
> All testing was done on Matlab R2010b.
>
> Does anyone know what I'm missing? I can include my test script if
> people are interested, but it is just a bunch of 'freads' specifying
> different machineformats.
>
> - Ben

Well, it also depends on how you output the data in (one assumes)
fwrite(). If you write a default array from Matlab, it will be a double
float which is 8-bytes.

If you try to read that and use the 'single' modifier you'll get trash.

If you do a cast "single" from a default array and write it w/ the
'single' in fwrite() then you definitely better see a difference.

Similar arguments hold for writing 32- and 64-bit integers (signed or
unsigned). Unless you create a file that has something other than
default long reals, you'll not see any difference (sort of a tautology,
there, granted. :) )

So, your test script would have to also show the generation of the data
as well as the reading.
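A minimal sketch of the mismatch described above, with made-up values. One assumption to flag: 'fwrite' called without a precision argument writes 8-bit unsigned integers, so the sketch passes 'double' explicitly to get 8 bytes per element on disk.

```matlab
fname = tempname;
x = [1 2 3];                              % default class is double

fid = fopen(fname, 'w');
fwrite(fid, x, 'double');                 % 3 values x 8 bytes = 24 bytes
fclose(fid);

fid = fopen(fname, 'r');
asSingle = fread(fid, inf, 'single');     % 24 bytes read as 6 four-byte floats
frewind(fid);
asDouble = fread(fid, inf, 'double');     % 24 bytes read as 3 eight-byte floats
fclose(fid);
delete(fname);

fprintf('read as single: %d values\n', numel(asSingle));
fprintf('read as double: %d values, round trip ok = %d\n', ...
        numel(asDouble), isequal(asDouble.', x));
```

Only the read whose precision matches the write recovers the original values; the 'single' read returns twice as many numbers, none of them meaningful.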

--

Subject: machineformat and 'long' precision (32bit vs 64bit)

From: Benjamin Kraus

Date: 11 Aug, 2011 00:03:13

Message: 3 of 6

dpb <none@non.net> wrote in message <j1umi8$hv3$1@speranza.aioe.org>...
> Well, it also depends on how you output the data in (one assumes)
> fwrite(). If you write a default array from Matlab, it will be a double
> float which is 8-bytes.
>
> If you try to read that and use the 'single' modifier you'll get trash.
>
> If you do a cast "single" from a default array and write it w/ the
> 'single' in fwrite() then you definitely better see a difference.
>
> Similar arguments hold for writing 32- and 64-bit integers (signed or
> unsigned). Unless you create a file that has something other than
> default long reals, you'll not see any difference (sort of a tautology,
> there, granted. :) )
>
> So, your test script would have to also show the generation of the data
> as well as the reading.

The test file I'm using is completely artificial data. I created a file by hand (using a hex editor) containing 32 bytes: 0123456789ABCDEF1032547698BADCFE0123456789ABCDEF1032547698BADCFE

However, the test file is irrelevant because of the way I'm testing it.

Here is part of my test script:
f = fopen('testfile','r');
bits = fread(f,inf,'ubit1'); % I used this to confirm the bits were what I intended.

% Re-read the same bytes from the start of the file under each precision.
frewind(f); datauint32 = fread(f,inf,'*uint32');
frewind(f); datauint64 = fread(f,inf,'*uint64');
frewind(f); dataulong = fread(f,inf,'*ulong');
fclose(f);

I then checked the class of the data returned ('uint32', 'uint64', and 'uint32' respectively). Note, when telling 'fread' to use 'ulong' precision, it returned a 'uint32'. I also compared the values returned in 'datauint32' to 'dataulong' and found a match (while the values in 'datauint64' were predictably different). In theory this should work for any test file, as the data doesn't really matter, just how it is being interpreted by MATLAB.

The rest of my test script repeats those calls with different values for 'machineformat'.
I expected that with a 64-bit machineformat, 'ulong' would match 'uint64', and that with a 32-bit machineformat, 'ulong' would match 'uint32'. However, in all cases I tested, 'ulong' was always returned as 32 bits. This seems contrary to the documentation, which says "long and ulong are 32 bits on 32-bit systems, and 64 bits on 64-bit systems".
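One way to check the width directly, independent of the class 'fread' returns, is to read a single 'ulong' and ask 'ftell' how many bytes were consumed. This is only a sketch: the 16 zero bytes are placeholder data, and the reported widths depend on the platform and release, so no expected output is given.

```matlab
% Placeholder file: 16 zero bytes, enough for one read of up to 8 bytes.
fname = tempname;
fid = fopen(fname, 'w');
fwrite(fid, zeros(1, 16), 'uint8');
fclose(fid);

formats = {'ieee-le', 'ieee-be', 'ieee-le.l64', 'ieee-be.l64'};
widths = zeros(1, numel(formats));
for k = 1:numel(formats)
    fid = fopen(fname, 'r', formats{k});
    fread(fid, 1, '*ulong');              % consume exactly one 'ulong'
    widths(k) = ftell(fid);               % file position = bytes consumed
    fclose(fid);
    fprintf('%-12s ulong = %d bytes\n', formats{k}, widths(k));
end
delete(fname);
```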

To be honest I haven't even looked at 'fwrite' yet, although I assume that once I figure out how 'fread' is working, 'fwrite' should work the same (except in reverse).

- Ben

Subject: machineformat and 'long' precision (32bit vs 64bit)

From: dpb

Date: 11 Aug, 2011 02:52:45

Message: 4 of 6

Benjamin Kraus wrote:
> dpb <none@non.net> wrote in message <j1umi8$hv3$1@speranza.aioe.org>...
> ...
> ... However, in all cases I tested, 'ulong' was always returned as
> 32bits. This seems contrary to the documentation, which says "long and
> ulong are 32 bits on 32-bit systems, and 64 bits on 64-bit systems".
>
> ...

You missed the part that says

"The following platform dependent formats are also supported but
  they are not guaranteed to be the same size on all platforms.

         MATLAB     C or Fortran       Description
...
         'long'     'long'             integer, 32 or 64 bits.
...
         'ulong'    'unsigned long'    unsigned integer, 32 bits or 64 bits.
...

If you reliably want 32- or 64-bits, use a format that is guaranteed to
be what you want; 'long' or 'ulong' isn't. What you get is what you
get; on those platform/OS combinations you tested they happened to be
32-bit; on another they may be 64.
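As a sketch of that advice: open the file with an explicit byte order and use only fixed-size precisions, so the result should not depend on the platform. The eight bytes below are the first eight of the test pattern quoted earlier in the thread; big-endian is an arbitrary choice for the example, not a statement about the real file format.

```matlab
hexstr = '0123456789ABCDEF';
bytes = hex2dec(reshape(hexstr, 2, []).');   % 8-by-1 vector of byte values

fname = tempname;
fid = fopen(fname, 'w');
fwrite(fid, bytes, 'uint8');
fclose(fid);

fid = fopen(fname, 'r', 'ieee-be');          % byte order fixed when opening
a = fread(fid, 2, '*uint32');                % 'uint32' is always 4 bytes
frewind(fid);
b = fread(fid, 1, '*uint64');                % 'uint64' is always 8 bytes
fclose(fid);
delete(fname);

fprintf('uint32 values: %s %s\n', dec2hex(double(a(1)), 8), dec2hex(double(a(2)), 8));
fprintf('uint64 class:  %s\n', class(b));
```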

--

Subject: machineformat and 'long' precision (32bit vs 64bit)

From: Benjamin Kraus

Date: 11 Aug, 2011 04:33:13

Message: 5 of 6

dpb <none@non.net> wrote in message <j1vg5u$8k7$1@speranza.aioe.org>...
> You missed the part that says
>
> "The following platform dependent formats are also supported but
> they are not guaranteed to be the same size on all platforms.

Interesting. I missed that part because it doesn't seem to be in the documentation anymore. I just checked the documentation for older versions of MATLAB, and the documentation for R2009a contains that disclaimer.

http://www.mathworks.com/help/releases/R2009a/techdoc/ref/fread.html

As of R2009b they reformatted the File I/O pages of the documentation and removed that disclaimer. The current documentation has a very different table. The only two 'precisions' that it lists as "system dependent" are 'ulong' and 'long'. The table gives a definitive size for the remaining precisions, and includes the note "long and ulong are 32 bits on 32-bit systems, and 64 bits on 64-bit systems." However, it seems that is not necessarily the case. Perhaps the old disclaimer should be restored.

http://www.mathworks.com/help/techdoc/ref/fread.html

As you suggest, I guess I will stick with the precisions that have a guaranteed size, and for safety's sake, I'll use the (much shorter) list from older versions of MATLAB.

I'm still curious what effect the "machineformat" has on 'fread' other than selecting between big-endian and little-endian.

- Ben

Subject: machineformat and 'long' precision (32bit vs 64bit)

From: dpb

Date: 11 Aug, 2011 13:16:42

Message: 6 of 6

On 8/10/2011 11:33 PM, Benjamin Kraus wrote:
> dpb <none@non.net> wrote in message <j1vg5u$8k7$1@speranza.aioe.org>...
>> You missed the part that says
>>
>> "The following platform dependent formats are also supported but
>> they are not guaranteed to be the same size on all platforms.
>
> Interesting, I missed that part because it doesn't seem to be in the
> documentation anymore. I just checked the documentation for older versions
> of MATLAB. The documentation for R2009a contains that disclaimer.
>
> http://www.mathworks.com/help/releases/R2009a/techdoc/ref/fread.html
>
> As of R2009b they reformatted the File I/O pages of the documentation
> and removed that disclaimer. The current documentation has a very
> different table. The only two 'precisions' that it lists as "system
> dependent" are 'ulong' and 'long'. The table gives a definitive size for
> the remaining precisions, and includes the note "long and ulong are 32
> bits on 32-bit systems, and 64 bits on 64-bit systems." However, it
> seems that is not necessarily the case. Perhaps the old disclaimer
> should be restored.
>
> http://www.mathworks.com/help/techdoc/ref/fread.html
>
> As you suggest, I guess I will stick with the precisions that have a
> guaranteed size, and for safety's sake, I'll use the (much shorter) list
> from older versions of MATLAB.
>
> I'm still curious what effect the "machineformat" has on 'fread' other
> than selecting between big-endian and little-endian.
...

That is definitely a peculiar omission it would seem. Worthy of a query
to official TMW support I would think.

I don't have a recent release, and I've not done a test to confirm, so I
_could_ be in error, but iirc the 'machineformat' sets what the default
real format will be in fread() when you use a numeric scan format.
It controls whether the data is IEEE or one of the supported
machine-specific formats that is not IEEE, and the byte order.

I think in order to test it you will have to write data that is in the
proper form for a float of the given type, or else you'll get NaN or
errors trying to store malformed floating point values. You can't just
look at an arbitrary bit stream.
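A sketch of how such a test might be set up, using made-up values. It writes well-formed doubles under one machine format and reads them back under a matching and a mismatched format. It only exercises the two IEEE byte orders and does not confirm or refute the recollection above about non-IEEE formats.

```matlab
fname = tempname;
vals = [1.5 -2.25 1e10];                  % arbitrary, exactly representable

fid = fopen(fname, 'w', 'ieee-be');       % write as big-endian IEEE doubles
fwrite(fid, vals, 'double');
fclose(fid);

fid = fopen(fname, 'r', 'ieee-be');       % matching format
same = fread(fid, inf, 'double');
fclose(fid);

fid = fopen(fname, 'r', 'ieee-le');       % mismatched byte order
swapped = fread(fid, inf, 'double');
fclose(fid);
delete(fname);

fprintf('matching format round trip:   %d\n', isequal(same.', vals));
fprintf('mismatched format round trip: %d\n', isequal(swapped.', vals));
```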

--
