Huge differences in single vs double precision math

I am calculating a sum of squares in 32-bit FP precision (for comparison with a GPU algorithm, which isn't relevant here).
Here is the code:
Y=single((0:499).^2);
sum(Y)
ans =
41541684
sum(double(Y))
ans =
41541750
The single result differs from the (correct) double answer by 66! The largest value, 499^2 = 249001, is nowhere near any FP limits.
This is R2013a on OS X 10.9.

Answers (1)

John D'Errico on 7 Aug 2014
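What you don't understand is that single precision has a 23-bit stored mantissa. While there are 32 total bits stored in a single, don't forget that one of those bits is a sign bit, which leaves 8 bits to store an exponent in biased form. Counting the implicit leading bit, that gives 24 bits of precision, so every INTEGER up to 2^24 can be stored in a single without error, but beyond that only some can: the spacing between adjacent singles is 2 from 2^24 to 2^25, and 4 from 2^25 to 2^26.
The sum you formed was larger than that limit, so you should expect rounding error: once the partial sums exceed 2^24, additions get rounded to the nearest representable single.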
log2(41541750)
ans =
25.308
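The limit is easy to check at the command line. The following is only a minimal sketch; the variable names are made up for illustration, and it assumes a release that provides flintmax (R2013a or later, as far as I know).

```matlab
% Largest integer below which every integer is representable in single (2^24).
fmax = flintmax('single');            % 16777216

% Adding 1 at this magnitude is lost, because 2^24 + 1 is not representable.
lostOne = (fmax + 1 == fmax);         % true

% Spacing between adjacent singles at the limit and near the final sum.
gapAtLimit = eps(fmax);               % 2
gapAtSum   = eps(single(41541750));   % 4
```

With a spacing of 4 near the final value, the exact sum 41541750 is not itself a representable single, so some error is unavoidable however the additions are ordered.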
It is time for you to start reading about floating point arithmetic.
Computers are not all powerful, except for those in the movies/tv.
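If the goal is just an accurate reference value on the CPU, one option is to keep the data in single but do the summation in double. This is a sketch only: it assumes that the 'double' output-type option of sum behaves as documented for single inputs, and the variable names are made up for illustration.

```matlab
Y = single((0:499).^2);        % every element is below 2^24, so stored exactly

sNative = sum(Y);              % result in single, subject to rounding
sDouble = sum(Y, 'double');    % ask sum for a double result
sCast   = sum(double(Y));      % explicit cast, as in the question: 41541750
```

sDouble should agree with sCast. Neither helps if the comparison really has to be done in single, since the exact sum cannot be represented there.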
