Inconsistent matrix multiplication results

8 views (last 30 days)
Sergey
Sergey on 2 Feb 2012
I noticed that some code I was running returned different results between runs. I seed all random number generators with a constant, so that shouldn't be the issue. I finally tracked down the problem (or at least one case of it) to a single matrix operation, of the form y1/(y'*y), where y1 is a scalar and y is a column vector. For two seemingly identical values of y, y'*y produced answers that differed in the last decimal place. If I reassigned y - i.e. d = y, d'*d, the answer was the same as y'*y. If I print out y with full precision (20 decimal places) and assign it to d, then d'*d was different from y'*y, even though y == d was all 1.
If I saved the workspace and loaded it, y'*y was different (not equal to the old y'*y), but y itself was still the same.
I am using r2011b, and this issue is somewhat annoying, because these discrepancies add up and I get inconsistent results between runs.
Has anyone seen this before, and do you have any advice? Could this be a bug in r2011b?
EDIT: y and d arrays are bitwise identical (using hex printout), but y'y and d'd differ in the last bit; saving & loading the workspace makes y'y identical to d'd, and this error appears to be nondeterministic. In successive runs, the error in the last bit may or may not occur, and appears somehow "tied" to the array y - i.e. if it occurs, successive calls to y'y produce the same (incorrect) answer, but loading the workspace with bitwise identical y produces the correct answer.
I suspect that Matlab keeps some internal flags that describe the order in which the sum in y'*y should be computed (resulting in different truncation error), and these flags are somehow associated with the array. Reloading the workspace must reset these flags. However, for whatever reason, between successive runs of my function, this is not consistent.
EDIT 2: I can confirm that this does not happen in R2010b or R2009b. So this appears to be a new bug introduced in R2011b.
  1 Comment
Walter Roberson
Walter Roberson on 2 Feb 2012
http://matlab.wikia.com/wiki/FAQ#Why_is_0.3_-_0.2_-_0.1_.28or_similar.29_not_equal_to_zero.3F

Sign in to comment.

Answers (2)

John D'Errico
John D'Errico on 2 Feb 2012
Sigh. No. This is NOT a bug in r2011b. This is NOT a bug. It is merely your not recognizing that this is floating point arithmetic. And the fact is, even if you print out 20 decimal digits of y, you have not encapsulated the true exact value of y!!!!!!!
For example, consider this number:
N = 1.23456789012346;
The decimal representation of that number was simple. Yet matlab converts it to a binary form, because all numbers are stored in that form internally. The value that matlab actually uses is:
hpf(N)
ans =
1.2345678901234560242983206990174949169158935546875
As you can see, they differ.
Welcome to the world of floating point arithmetic.
  9 Comments
Sergey
Sergey on 2 Feb 2012
all(y(:)==d(:)) is true, and the variables are identical to the last bit.
It's interesting that the expressions can get evaluated differently, but I don't think any of the examples you gave are happening here. It really looks like the result is somehow "tied" to the original array. I can break in with the debugger after I get the "bad" answer (40b5c49c7eb4bc52), and just enter y'*y in the console and get 40b5c49c7eb4bc52 every time, then I either enter the identical hex values to get d, or do d = y, save d to a .mat file and load it again, and then d'*d will evaluate to 40b5c49c7eb4bc53 every time from the console, while y'*y still evaluates to 40b5c49c7eb4bc52.
But what's causing this is definitely something that was introduced in either R2011a or R2011b, because R2010b does not have this issue.
James Tursa
James Tursa on 2 Feb 2012
Maybe it is not doing y'*y at all ... the optimization may recognize that y hasn't changed and it is just reporting the result of an earlier evaluation of this product (that did it in a different order). Try tic;y'*y;toc vs tic;d'*d;toc to see if there is a difference in timing that might yield a very small value for the first one. Also try y(1) = y(1) (forces optimization code to think that the variable has changed) and then do y'*y again.

Sign in to comment.


Jan
Jan on 2 Feb 2012
It would be helpful, if you post a small piece of code, which reproduces the problem. Then we could investigate what's going on, e.g. by checking the mxArray headers or calling the BLAS functions directly from a Mex function.
If you cannot reproduce the problem with manually created data, one of the following problem can be the source of the different results:
  1. An external program modifies the floating point environment, e.g. the rounding direction, perhaps temporarily.
  2. The floating point unit of your processor is near to melting. This happened to me one time - but after some hours the processor died completely...

Categories

Find more on Performance and Memory in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!