Why is the smallest positive number in 8-byte 2^-1022 and not 2^-1024?

Question

MatlabFan on 19 Oct 2014


Answered: Jan on 19 Oct 2014

Background:

When a floating point number is stored in an 8-byte word in IEEE double precision format, 1 bit is dedicated to the sign of the number, 11 bits to the signed exponent (i.e. 1 bit for the sign of the exponent and 10 bits for its unsigned numeric value), and 52 bits to the mantissa. The 11 exponent bits translate into a range from -1022 to 1023 of permissible exponent values. This leads to +1.0000 ... 0000 × 2^−1022 (52 zeros after the +1.) as the smallest positive number that can be represented in double precision. Matlab stores this number in the variable realmin.

Source: Applied Numerical Methods with Matlab, Steven Chapra, pg.100. McGraw-Hill Education, Jan. 2011.

My Question:

Why is the range for the exponent from -1022 to 1023 and not from -1024 to 1023? The latter would make the smallest positive number representable in double precision +1.0000 ... 0000 × 2^−1024 (52 zeros after the +1.) instead of +1.0000 ... 0000 × 2^−1022 (52 zeros after the +1.). I ask because the exponent is stored in its 11 bits in the same way in which integers are stored. I was therefore expecting a range from -2^(11-1) to 2^(11-1) - 1, i.e. from -1024 to 1023. The additional value on the negative side arises because one bit pattern would otherwise represent +0 and another -0, so the unnecessary duplicate is used to represent one more number on the negative side (-1024). So, why 2^-1022 and not 2^-1024?

I also have an additional question: why is eps 2^-52 and not (2^-52)*(2^-1022), assuming that 2^-1022 is indeed the smallest positive number representable in double precision, and not 2^-1024 as I am suggesting it should be?

I would really appreciate your help in clarifying this matter.

Thanks in advance!


Answer 1

Jan on 19 Oct 2014


Your question does not have a direct connection to Matlab.

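The IEEE 754 standard is explained in many places on the net, e.g. http://en.wikipedia.org/wiki/IEEE_754-1985. Besides the exponent and the mantissa, the standard also covers signaling and non-signaling NaNs, Inf and denormalized numbers. Therefore some bit patterns cannot be used in the way you expect: the 11-bit exponent is not stored as a signed integer but as an unsigned value with a bias of 1023, and the stored values 0 (zero and denormalized numbers) and 2047 (Inf and NaN) are reserved. That leaves the stored values 1 to 2046, i.e. exponents from -1022 to 1023.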
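To see the reserved exponent patterns for yourself, here is a small sketch that reads the 11-bit exponent field directly from the bit pattern of a few doubles. The helper name expField is made up for this example; typecast, bitshift and bitand are standard Matlab functions.

```matlab
% Hypothetical helper: extract the stored (biased) 11-bit exponent field.
% Shift the 52 mantissa bits out, then mask off the sign bit.
expField = @(x) double(bitand(bitshift(typecast(x, 'uint64'), -52), uint64(2047)));

vals = [realmin, 1, realmax, realmin/2, Inf];
for x = vals
    % Stored fields 1..2046 belong to normalized numbers;
    % 0 marks zero/denormalized numbers, 2047 marks Inf/NaN.
    fprintf('%-14g stored exponent field = %4d\n', x, expField(x));
end

% Subtracting the bias of 1023 gives the actual exponent of a normalized number.
fprintf('realmin exponent: %d\n', expField(realmin) - 1023);   % -1022
fprintf('realmax exponent: %d\n', expField(realmax) - 1023);   %  1023
```

The stored field is 1 for realmin and 2046 for realmax, while realmin/2 (a denormalized number) and Inf land on the two reserved values 0 and 2047.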

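EPS is defined as the distance from 1 to the next larger double, so it is the smallest power of 2 for which 1 + eps > 1, namely 2^-52. It describes the relative precision given by the 52 mantissa bits, not the smallest representable magnitude, so realmin does not enter into it. For smaller numbers the difference in scale exceeds what the mantissa can hold, such that the larger number dominates and the smaller one is rounded away.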
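A short sketch of this, using only built-in functions (the variable names are chosen just for this example):

```matlab
% Spacing of doubles near 1 and near realmin.
gapAtOne     = eps;            % same as eps(1)
gapAtRealmin = eps(realmin);   % spacing just above realmin

fprintf('eps == 2^-52          : %d\n', gapAtOne == 2^-52);
fprintf('1 + eps   > 1         : %d\n', (1 + gapAtOne) > 1);
% Half of eps is a tie between 1 and 1 + eps; it rounds back to 1.
fprintf('1 + eps/2 == 1        : %d\n', (1 + gapAtOne/2) == 1);
% The product from the question is the spacing at realmin, not eps.
fprintf('eps(realmin) == realmin*eps : %d\n', gapAtRealmin == realmin * eps);
```

All four comparisons print 1 (true). Note that (2^-52)*(2^-1022) = 2^-1074 does appear in the format: it is the spacing between doubles at realmin, and also the smallest positive denormalized number.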
