On floating-point arithmetic - Moler book

I am referring to the NCM book by Moler (Numerical Computing with MATLAB), which on page 37 says: "The entire fractional part of a floating-point number is not f, but 1+f, which has 53 bits. However, the leading 1 doesn’t need to be stored. In effect, the IEEE format packs 65 bits of information into a 64-bit word." I did not understand what this means. How can you pack 65 bits of information into a 64-bit word?

Answers (1)

Roger Stafford on 14 Dec 2014
Edited: Roger Stafford on 14 Dec 2014
Yes, Cleve Moler is correct that the significand (mantissa) of an IEEE 754 double-precision normalized number represents 53 bits, of which only 52 are stored explicitly. The highest bit is not actually stored because it is always 1 for such numbers: a normalized number's significand always has a 1 bit to the left of the binary point, with a 52-bit fractional part to the right. Whether the number is to be regarded as normalized is information contained in its exponent field, which occupies 11 bits. If that field holds any value from 1 to 2046 (these values are offset by 1023 and actually represent powers of 2 from -1022 to +1023), then the number is normalized. If the exponent field is 0, then the number is regarded as "denormalized" and its highest bit is understood to be 0, with the 52 stored bits containing all of the actual significand value. (The one remaining field value, 2047, is reserved for Inf and NaN.) In other words, the information about this mysterious "hidden" bit is actually present implicitly in the exponent, so the format doesn't really pack 65 bits of information into 64 bits.
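As a rough illustration of how the exponent field carries this information, here is a minimal sketch that pulls the 11-bit field out of a few doubles. The helper name expField and the choice of sample values are mine, not from Moler's book; typecast, bitshift, and bitand are standard MATLAB functions.

```matlab
% Extract the 11-bit biased exponent field from each double in a vector.
% expField is a hypothetical helper name chosen for this sketch.
expField = @(x) double(bitand(bitshift(typecast(x, 'uint64'), -52), uint64(2047)));

% A normalized number, the smallest normalized number, a denormalized
% number, zero, and Inf.
vals = [1, realmin, realmin/2, 0, Inf];
e = expField(vals);

% The hidden bit is 1 exactly when the field lies in 1..2046 (normalized).
% Field 0 means the hidden bit is 0; for field 2047 (Inf/NaN) the
% significand is not interpreted as a number, so the 0 there is only a
% placeholder.
hidden = double(e >= 1 & e <= 2046);

disp([e; hidden])
```

The point of the sketch is that the hidden bit is computed from the exponent field alone, so no 65th bit is needed.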
  2 Comments
Seetha Rama Raju Sanapala on 14 Dec 2014
Edited: Seetha Rama Raju Sanapala on 14 Dec 2014
Does this mean that, in this scheme, the mantissa can be positive only when the exponent is 0, and that otherwise it has to be negative (because the implicit 53rd bit is 1)? How do we represent numbers that have a positive mantissa and a non-zero exponent?
Thanks for your explanation. Now I feel that I am on the way to cracking this IEEE representation.
Roger Stafford on 14 Dec 2014
No, there is a separate bit that represents the number's sign. It has nothing to do with the 'hidden' bit of the significand.
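A quick way to convince yourself of this, sketched here with MATLAB's num2hex and typecast (the variable names are only illustrative): 42 and -42 should differ only in the leading sign bit, with the exponent and significand bits untouched.

```matlab
% Compare the stored bit patterns of 42 and -42.
hPos = num2hex(42);     % 16 hex digits covering the 64 stored bits
hNeg = num2hex(-42);

% Flip only the top bit of +42 and reinterpret the result as a double.
signMask = bitshift(uint64(1), 63);
flipped = typecast(bitxor(typecast(42, 'uint64'), signMask), 'double');

disp(hPos)
disp(hNeg)
disp(flipped)
```

Only the first hex digit of the two strings differs, and flipping that single top bit turns 42 into -42.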
You can use MATLAB's "format hex" to display a hexadecimal representation of the actual bits that are stored for a 'double' number. For example, if we display 42 (made famous by Douglas Adams), we get
format hex
x = 42
> 4045000000000000
Translating the hexadecimal to binary and separating the sign, exponent, and significand parts gives:
0 10000000100 0101000000000000000000000000000000000000000000000000
The first 0 means it is positive. The next 11 bits are 1028 in decimal, which, after subtracting the standard offset of 1023, gives a power-of-two exponent of 5. Since the exponent field was not all 0's, the hidden bit is 1, which gives a significand of 1.0101000000... in binary. Thus our number is 2^5*1.0101 (in binary) = 101010, which is binary for 42.
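The same decoding can be sketched in code. The variable names below are my own, and the script assumes a normalized input (exponent field between 1 and 2046), which 42 is.

```matlab
x = 42;
bits = typecast(x, 'uint64');                           % the 64 stored bits

s = double(bitshift(bits, -63));                        % sign bit
e = double(bitand(bitshift(bits, -52), uint64(2047)));  % 11-bit biased exponent field
f = double(bitand(bits, uint64(2^52 - 1))) / 2^52;      % stored 52-bit fraction as a value in [0, 1)

% Normalized case only: prepend the hidden 1 and remove the 1023 offset.
value = (-1)^s * (1 + f) * 2^(e - 1023);

fprintf('sign = %d, exponent field = %d, fraction = %g, value = %g\n', s, e, f, value);
```

For x = 42 this gives an exponent field of 1028, a fraction of 0.3125 (binary 0.0101), and a reconstructed value of 42.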
There is a Wikipedia article on the internet at
http://en.wikipedia.org/wiki/IEEE_754-1985
that gives a brief description of the double precision floating point format which you might be interested in reading.
