Why is the smallest positive number in 8-byte double precision 2^-1022 and not 2^-1024?

Background:
When a floating point number is stored in an 8-byte word in IEEE double precision format, 1 bit is dedicated to the sign of the number, 11 bits to the signed exponent (i.e. 1 bit for the sign of the exponent, and 10 bits for the unsigned exponent numeric value), and 52 bits to the mantissa. The 11 exponent bits translate into a range of allowable exponent values from -1022 to 1023. This leads to the number +1.0000 ... 0000 × 2^−1022 (52 zeros after the +1.) as the smallest positive number that can be represented in double precision. This number is stored in Matlab in the variable realmin.
Source: Applied Numerical Methods with Matlab, Steven Chapra, pg.100. McGraw-Hill Education, Jan. 2011.
My Question:
Why is the range for the exponent from -1022 to 1023 and not from -1024 to 1023? This would make the smallest positive number that can be represented in double precision +1.0000 ... 0000 × 2^−1024 (52 zeros after the +1.) instead of +1.0000 ... 0000 × 2^−1022 (52 zeros after the +1.). My motive for asking is that the exponent is stored in its 11 bits in the same way in which integers are stored. Therefore, I was expecting a range from -2^(11-1) to [2^(11-1) - 1], i.e. from -1024 to 1023. The additional value on the negative side comes from the fact that we would otherwise have one word (binary string) representing +0 and another representing -0, so the unnecessary duplicate is used to represent one more number on the negative side (-1024). So, why 2^-1022 and not 2^-1024?
Also, an additional question: why is eps 2^-52 and not (2^-52)*(2^-1022), assuming that 2^-1022 is indeed the smallest positive number representable in double precision, and not 2^-1024 as I am suggesting it should be?
I would really appreciate your help in clarifying this matter for me.
Thanks in advance!

Accepted Answer

Jan on 19 Oct 2014
Your question does not have a direct connection to Matlab.
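The IEEE 754 standards are explained in many places on the net, e.g. http://en.wikipedia.org/wiki/IEEE_754-1985. Besides the exponent and the mantissa, the standard considers signaling and non-signaling NaNs, Inf and denormalized numbers. The 11-bit exponent field is stored as an unsigned value with a bias of 1023, not as a sign bit plus 10 magnitude bits, and the stored values 0 and 2047 are reserved for zero and denormalized numbers and for Inf and NaN. That leaves the stored values 1 to 2046 for normalized numbers, i.e. exponents from -1022 to 1023. Therefore some bit patterns cannot be used in the way you expect.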
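To see the exponent encoding directly, here is a minimal sketch in Python rather than Matlab (the doubles are the same IEEE 754 values in both). It assumes the usual layout of 1 sign bit, 11 exponent bits stored with a bias of 1023, and 52 mantissa bits. The helper name exponent_field is made up for this illustration, and sys.float_info.min has the same value as realmin.

```python
import struct
import sys

BIAS = 1023  # IEEE 754 double precision exponent bias


def exponent_field(x):
    """Return the raw 11-bit exponent field of a double as an unsigned integer.

    Hypothetical helper for illustration only.
    """
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]  # reinterpret the 8 bytes
    return (bits >> 52) & 0x7FF                          # drop mantissa, mask sign


smallest_normal = sys.float_info.min  # 2**-1022, same value as realmin
largest_normal = sys.float_info.max   # same value as realmax

print(exponent_field(smallest_normal) - BIAS)  # -1022 (stored field is 1)
print(exponent_field(largest_normal) - BIAS)   # 1023 (stored field is 2046)
print(exponent_field(0.0))                     # 0: reserved for zero and denormals
print(exponent_field(5e-324))                  # 0: the smallest denormal
print(exponent_field(float("inf")))            # 2047: reserved for Inf and NaN
```

The stored fields 0 and 2047 never appear for normalized numbers, which is why the usable exponent range stops at -1022 instead of reaching -1024.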
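EPS is the distance from 1 to the next larger double, so 1 + eps is the smallest double greater than 1. It describes the relative precision near 1, not the smallest representable number. When a much smaller number is added to 1, the different scales of the two numbers cannot both be kept in the result: the larger number dominates and the smaller one is rounded away.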
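A short check of the eps behaviour, again sketched in Python under the assumption of IEEE 754 doubles with round-to-nearest. sys.float_info.epsilon and sys.float_info.min have the same values as eps and realmin in Matlab; the variable names below are just for illustration.

```python
import sys

eps = sys.float_info.epsilon  # 2**-52, same value as Matlab's eps
realmin = sys.float_info.min  # 2**-1022, same value as Matlab's realmin

print(eps == 2.0 ** -52)         # True
print(1.0 + eps > 1.0)           # True: 1 + eps is the next double after 1
print(1.0 + eps / 2 == 1.0)      # True: the exact tie rounds back to 1
print(1.0 + realmin == 1.0)      # True: realmin is far below the spacing near 1
print(eps * realmin == 5e-324)   # True: 2**-1074, the smallest denormal
```

The last line suggests where (2^-52)*(2^-1022) does show up: it is the spacing of doubles near realmin (and the smallest denormalized number), not the spacing near 1, which is what eps measures.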
