Inquiry Regarding Minor Variations in MATLAB GPU Computation
I am running an algorithm in MATLAB utilizing my system's GPU. For the same input, the results are generally identical. However, in some cases, I notice minor variations in the decimal values of the output. Can anyone please help me understand why this is happening?
Answers (1)
Mike Croucher
on 29 Jan 2025
It is difficult to comment without seeing the code, but the most general thing I can think of saying goes as follows:
- A calculation running on a GPU is usually a parallel calculation. If it isn't, don't use a GPU!
- In any parallel calculation, you cannot usually guarantee the order in which the calculations happen.
- Thanks to how floating-point arithmetic works, the order of calculations matters to the final result, even in instances where you might not imagine it should.
As an example to illustrate the final point, imagine the case where you have to add up a lot of numbers. To show how quickly 'interesting' things can happen, let's consider a case where we only want to add up three numbers: 0.1, 0.2 and 0.3.
We have two possible ways of proceeding:
x = 0.1 + (0.2 + 0.3); % Do 0.2 + 0.3 first
y = (0.1 + 0.2) + 0.3; % Do 0.1 + 0.2 first
% are they equal?
x==y
They are not equal and yet any mathematician will tell you that the order should not matter. What's going on?
The issue is that all floating-point numbers are represented in binary, and you cannot represent 0.1 exactly in binary. You end up getting small round-off errors that accumulate. In fact, the 64-bit binary value that's closest to 0.1 is actually a little above 0.1. Another example that shows this:
% You might expect this to be zero
res = 1 - (0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1) % That's 10 0.1's added up
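You can also see the stored value directly by printing 0.1 to more decimal places than MATLAB displays by default. A minimal sketch using only standard fprintf (not part of the original answer):
fprintf('%.20f\n', 0.1)   % Prints 0.10000000000000000555..., slightly above 0.1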
To be clear, this is not a MATLAB thing; it's a 'floating-point arithmetic' thing.
So, even adding up three numbers is sensitive to the order in which you do it. In your GPU algorithm you are probably doing millions or even billions of computations in parallel. The order of operations can change from one run to another, and so you'll get tiny differences in the output.
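You can mimic this on the CPU without any GPU at all. Here is a minimal sketch (hypothetical data, standard MATLAB functions only) showing that summing the same values in two different orders gives two slightly different answers:
rng(0);                        % Fix the random seed so the example is repeatable
v = randn(1e6, 1);             % A million random numbers
s1 = sum(v);                   % Sum in the original order
s2 = sum(v(randperm(1e6)));    % Sum the same values in a shuffled order
s1 - s2                        % Tiny, but usually not exactly zero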
Sometimes, the differences will not be tiny!
When round-off errors 'blow up'
In pathological cases these tiny differences can 'blow up'. As a trivial example, say you do a complex computation and (ill-advisedly) base the next step of that calculation on how the answer compares to exactly 0.6.
function out = mikeIsCrazy(input)
if input <= 0.6
fprintf("Missile has launched\n")
out = 1000000;
else
fprintf("No threat detected\n")
out = 0;
end
end
We'll use our addition of three numbers as a proxy for our complex calculation:
mikeIsCrazy(0.1 + (0.2 + 0.3))
mikeIsCrazy((0.1 + 0.2) + 0.3)
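The two calls print different messages even though, mathematically, they compute the same sum. A common remedy (standard floating-point practice rather than something from the answer above) is to compare against a tolerance instead of an exact threshold:
tol = 1e-12;              % Tolerance chosen purely for illustration
x = (0.1 + 0.2) + 0.3;
abs(x - 0.6) < tol        % true: x equals 0.6 up to round-off, even though x == 0.6 is false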
Fun fact: Logic like this formed the basis for my first-ever scientific computing troubleshooting session when I started working in academic computing support.
4 Comments
Mike Croucher
on 31 Jan 2025
What we are talking about here is a feature of floating-point arithmetic, not hardware. So, yes.
How large are the differences? Can you post the code?