# How is a binary floating point number converted to its decimal representation?

Tamura Kentai on 12 Jul 2019
Commented: Guillaume on 12 Sep 2019
Hi,
I'm sorry if this is the wrong place to ask. I don't actually know which domain the question belongs to; I only know that it involves math, logic and programming, areas in which MATLAB plays a very important part.
My question: computers use floating point formats, e.g. single and double precision, which store a value as [sign, exponent, significand] fields of [1, 11, 52] bits (for the 64-bit double type); see https://en.wikipedia.org/wiki/Double-precision_floating-point_format
How are floating point numbers, which the computer stores in binary, turned into decimal numbers, in particular the fractional digits to the right of the '.' (the radix point)?
For example, the binary number 1.111111 is 1.9844 in decimal. How does the computer present the .9844 to us? Many documents say that it is '2^-1 + 2^-2 + ... + 2^-6', but that is not what I am asking. What I am curious about is how the computer works out 0.5 + 0.25 + ... + 0.0156. Inside the computer everything is binary, only 0101..., and all arithmetic is done on those 0s and 1s, so it cannot really recognise the decimal values '0.5', '0.25', etc. Yet when we run simple commands in MATLAB such as bin2dec() or f_b2d, the answer appears immediately. I would like to know 'who' performs this binary to decimal-string conversion, and at which level. The compiler? Which field of study covers this?
Thank you very much.
Best regards.

Stephen Cobeldick on 12 Jul 2019
"For example, a floating number in binary, 1.111111 is in decimal = 1.9844..."
Nope, you are confusing binary numbers with binary floating point numbers.
The value 1.9844 as Double binary floating point would actually be:
0 01111111111 1111110000000001101000110110111000101110101100011100
^ Sign bit
  ^^^^^^^^^^^ Exponent
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Fraction
And as Single binary floating point would be
0 01111111 11111100000000011010010
^ Sign bit
  ^^^^^^^^ Exponent
           ^^^^^^^^^^^^^^^^^^^^^^^ Fraction
How the Sign, Exponent, and Fraction are defined is explained in the link that you give in your question.
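As a rough illustration of how the three fields combine, here is a minimal sketch that pulls them out of the stored bits and rebuilds the value. It assumes a normalised double (not zero, denormal, Inf or NaN), and the variable names are made up for this example:

```matlab
x = 1.9844;
u = typecast(x, 'uint64');                            % same 64 bits, viewed as an integer

s = double(bitshift(u, -63));                         % 1 sign bit
e = double(bitand(bitshift(u, -52), uint64(2047)));   % 11 exponent bits (biased by 1023)
f = double(bitand(u, uint64(2^52 - 1)));              % 52 fraction bits

% Normalised numbers carry an implicit leading 1 in front of the fraction.
rebuilt = (-1)^s * (1 + f / 2^52) * 2^(e - 1023);
```

Every step here is exact in double arithmetic, so `rebuilt == x` should hold for this input.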
Tamura Kentai on 12 Jul 2019
Hi, Stephen:
Thank you. But do you know how the computer turns the following
0 01111111111 1111110000000001101000110110111000101110101100011100
^ Sign bit
  ^^^^^^^^^^^ Exponent
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Fraction
into 1.9844 on the screen?
Best regards.

Guillaume on 12 Jul 2019
Edited: Guillaume on 12 Jul 2019
Yes, as Stephen said, be careful about the difference between binary numbers and IEEE storage. You can easily see the IEEE representation of a double number in MATLAB with:
>> n = 1.9844;
>> dec2bin(typecast(n, 'uint64'), 64) %typecast to uint64 doesn't change the bits, so we can extract the bit pattern
ans =
'0011111111111111110000000001101000110110111000101110101100011100'
As for your question, the computer is never aware of the decimal representation of the number, and doesn't care about it. All math is done on the binary representation, which is why people are sometimes surprised by the results they get (see why 0.3-0.2-0.1 is not 0): binary math differs from decimal math. Numbers are converted to their textual decimal representation only for display to the user/programmer. This is done by well-established library routines; if you want to know more, search for the source code of fprintf (in C). Similarly, the textual decimal representation of a number you enter is immediately converted to the binary representation.
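Both points can be checked from the command line. This is a small sketch, with variable names chosen only for illustration, that shows the leftover from 0.3-0.2-0.1 and a trip from binary to text and back using sprintf and str2double:

```matlab
d = 0.3 - 0.2 - 0.1;               % not exactly 0 in binary floating point
leftover = sprintf('%.17g', d);    % text form of the tiny nonzero result

% 17 significant digits are enough to round-trip any double through text.
txt    = sprintf('%.17g', 0.1);    % decimal text produced from the stored bits
back   = str2double(txt);          % text parsed back into the binary format
isSame = (back == 0.1);            % the round trip recovers the same double
```

Printing with fewer digits (the default display shows about 5) hides the difference, which is why 0.1 normally looks exact on screen.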

Guillaume on 12 Jul 2019
A quality implementation of floating point binary to text conversion is probably a lot of lines of code (you've got to consider NaN, Inf, denormalised numbers, etc.) and really only of academic interest. You can trust your standard C library (or whatever runtime your language uses) to do the right thing.
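To give a feel for the principle only, here is a deliberately naive sketch of how decimal digits can be produced from binary fraction bits using nothing but integer arithmetic. It is not how fprintf or MATLAB is actually implemented, and it ignores rounding, exponents and special values. It takes the binary number 1.111111 from the question, whose fraction is the integer 63 (binary 111111) over 2^6:

```matlab
num = uint64(63);      % fraction bits read as an integer (binary 111111)
den = uint64(64);      % 2^6, one factor of 2 per fraction bit
digits = '';
while num > 0
    num = num * 10;                           % shift one decimal place to the left
    d   = idivide(num, den);                  % integer part is the next decimal digit
    digits(end+1) = char('0' + double(d));    % store that digit as a character
    num = mod(num, den);                      % keep only the remainder
end
result = ['1.' digits];                       % '1.984375'
```

The loop always terminates because the denominator is a power of 2, and the exact result 1.984375 is what gets rounded to the 1.9844 quoted in the question.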
James Tursa on 12 Sep 2019
Well, it turns out you can't trust MATLAB's dec2bin( ) function for 64-bit integer inputs on my machine (WIN64). What version are you running where you got the correct conversion?
Guillaume on 12 Sep 2019
Hum, did I make it up? (!)
Indeed, dec2bin doesn't properly handle integers above flintmax (because internally it converts to double). The doc does warn you, but the function does not. It's something I've complained about to MathWorks. It's not even consistent with dec2hex (which throws a warning) or dec2base (which throws an error).
Another way to get the correct binary representation of a number (on a little endian computer), which is probably what I did originally:
strjoin(flipud(cellstr(dec2bin(typecast(n, 'uint8'), 8))), '')

James Tursa on 12 Sep 2019
You can't use dec2bin( ) reliably for this conversion in all versions of MATLAB because it is limited by flintmax (see the note at the bottom of the doc). I.e., even though it lists int64 and uint64 as acceptable input data types, it really can't handle all of the values properly. E.g., using a simple example where we expect all of the mantissa bits to be 1's:
R2016a & R2019a WIN64:
>> num2hex(realmax)
ans =
7fefffffffffffff <-- We should expect lots of trailing 1's
>> dec2bin(typecast(realmax,'uint64'),64)
ans =
0111111111110000000000000000000000000000000000000000000000000000 <-- TOTALLY goofed up!
>> reshape(dec2bin(sscanf(num2hex(realmax),'%1x'),4)',1,[]) % do it one hex digit at a time
ans =
0111111111101111111111111111111111111111111111111111111111111111 <-- This is what we were expecting
Neither R2016a nor R2019a dec2bin( ) can handle the large uint64 value properly.
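One more possible way around the flintmax limit is to read the bits directly with bitget, so the uint64 value is never converted to double as a whole. This is a sketch rather than a tested recommendation for every release; the variable names are only for illustration:

```matlab
u    = typecast(realmax, 'uint64');               % reinterpret the bits, no value conversion
bits = char(double(bitget(u, 64:-1:1)) + '0');    % bit 64 (most significant) first
```

For realmax this should give a 0 sign bit, the exponent 11111111110 and 52 trailing 1's, matching the num2hex result above.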

R2013b
