sum() fails while operating on a large array of single floats

5 views (last 30 days)
For arrays of single floats with more than 2^28 elements, sum(D) fails.
D=rand(2^27,1,'single');
sum(D)
ans = 67106504
D=rand(2^28,1,'single');
sum(D)
ans = 134208512
D=rand(2^29,1,'single');
sum(D)
ans = 134217728
D=rand(2^29,1);
sum(D)
ans = 2.684394078227534e+08
The problem affects other functions, like mean(), but doesn't affect others like min() or max().
MATLAB 8.6.0.267246 (R2015b)
  9 Comments
Eric Tittley
Eric Tittley on 6 Apr 2016
James, replace the rand() with ones(). Same problem. Also note my earlier comment where I describe initially finding this problem in an array read from an HDF file. I used rand() to generate similar numbers. Thanks for looking and verifying. I'll be submitting a bug report.
Tania Kleynhans
Tania Kleynhans on 30 Jun 2016
Edited: Tania Kleynhans on 30 Jun 2016
I am experiencing a similar problem. I have a binary image cube (1024x1024x360), single precision.
sum(image(:))
I get a different answer than when I do
sum(sum(sum(image)))
When changing to int16 or double it works

Sign in to comment.

Accepted Answer

James Tursa
James Tursa on 5 Apr 2016
Edited: James Tursa on 5 Apr 2016
My guess is this is caused by using single precision accumulators for the sums at the low level code. If you were to write a simple loop to sum the elements up, after some point the accumulated sum would become large enough to swamp the rand values that are between 0 and 1. I.e., adding a value between 0 and 1 to the current accumulated sum would have no effect since these values are below eps(current_sum). For a single precision accumulator, there are 24 equivalent mantissa bits in IEEE. Since rand numbers average 0.5, one would expect the accumulated sum to reach the insensitive value at maybe 2^25 elements or so if using one accumulator. For multiple-core machines the summing might be multi-threaded so the sum could be chunked out to the cores. On a dual-core maybe the problem starts at 2^26, on a quad-core maybe the problem starts at 2^27, etc. This is close to what people are seeing, so maybe that is what is going on. E.g., on my R2051a 64-bit Windows 7 I get this:
>> D=rand(2^28,1,'single');
>> sum(D)
ans =
67108864
>> 2^27
ans =
134217728 % <-- expected sum
>> R = reshape(D,2^14,2^14);
>> sum(sum(R)) % <-- Force MATLAB to do the sum in many more chunks
ans =
134216360 % <-- close to expectation
>> mysum = 0; for k=1:numel(D); mysum = mysum + double(D(k)); end % brute force double accumulator
>> single(mysum)
ans =
134216672 % <-- again close to expectation
>> mysum = single(0); for k=1:numel(D); mysum = mysum + D(k); end % brute force single accumulator
>> mysum
mysum =
16777216 % <-- another bad answer
>> 4*mysum
ans =
67108864 % The original bad answer from above
% The number above seems to indicate that sum(D) was chunked out to 4 cores
>> sum(D,'double') % <-- Force MATLAB to use double precision accumulator
ans =
1.342166726798951e+08
>> single(ans)
ans =
134216672 % <-- Now we get expected answer
The bad answer one gets will be dependent on exactly how the summing gets chunked up at the lower level routine.
IMO, these types of operations should at the very least use double precision accumulators at the low level routines, which MATLAB doesn't do by default according to the doc. For user-written code I guess one can simple be aware of this pitfall and use custom code for large problems. But for MATLAB routines that rely on summing operations (e.g., mean) I would have thought TMW would force the accumulators to be double in the background, which apparently they didn't do. Frankly, I would have made the opposite choice ... make the default be double accumulators (best answer), and force the user to manually select single accumulators if that is how he/she wants the low level summing to be done.
The BLAS has such a routine for dot products (e.g., DSDOT does a dot product of single precision vectors using double precision accumulator), but I can find no such BLAS equivalent for a simple sum. So maybe sum is just low level C++ code written by TMW, multi-threaded but using single precision accumulators.
You should submit a bug report on this.
UPDATE:
Just checked on a different machine running R2015b 64-bit Win7 and there is no problem. So the low level accumulators apparently are double precision for this version. So there has been an apparent change in the underlying code, but I can find no mention of it in the Release Notes for R2015b.
  1 Comment
Eric Tittley
Eric Tittley on 6 Apr 2016
Thanks James. I believe this answer best and I'm thunking myself for not having thought of this. The answers vary from machine to machine depending on the number of threads. On my 8-core machine:
sum(D)*eps('single')
ans = 16
So it seems to be chunking the summation into twice the number of cores.

Sign in to comment.

More Answers (2)

Image Analyst
Image Analyst on 5 Apr 2016
The Mathworks knows about this. It's just a precision issue where small numbers don't get added to the accumulating sum if that sum gets too big. For example, look at this code:
% Take the mean of random numbers. It should be 0.500000
meanSingleSorted = mean(sort(single(rand(500000000,1))))
meanSingleUnsorted = mean(single(rand(500000000,1)))
meanDoubleSorted = mean(sort(rand(500000000,1)))
You'll see the initially unexpected results:
meanSingleSorted =
0.09227469
meanSingleUnsorted =
0.1342177
meanDoubleSorted =
0.499997472454575

Walter Roberson
Walter Roberson on 30 Jun 2016
sum() now has a "type" option for these cases. You can pass one of the following as the third argument to sum()
'double' - S has class double for any input X
'native' - S has the same class as X
'default' - If X is floating point, that is double or single,
S has the same class as X. If X is not floating point,
S has class double.
  1 Comment
James Tursa
James Tursa on 1 Jul 2016
This was alluded to in my Answer. But I still think TMW should have opted for the default to be double accumulator in the background.

Sign in to comment.

Products

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!