15 views (last 30 days)

Why am I getting unexpected results while working with functions such as 'mean' for array containing large numerical values?

Input these commands in the MATLAB 2019a

>> test = [2^1000, 2^1000];

>> testMean = mean(test);

Now, if testMean is actually the mean of test, then subtracting it from test and taking the mean a subsequent time, should result in 0.

>> zeroCenteredTest = test-testMean;

>> zeroCenteredTestMean = mean(zeroCenteredTest);

You will find zeroCenterTestMean is in fact 0. This case works correctly. However, let me demonstrate a case that does not work correctly.

Let me generate a large 1D matrix with a large amount of large values

>> test2 = [];

>> for i=1:1000

test2(i) = 2^i;

end

>> test2Mean = mean(test2);

If this mean was calculated correctly, the result should be 0.

>> zeroCenteredTest2 = test2-test2Mean;

>> zeroCenteredTest2Mean = mean(zeroCenteredTest2);

You will find in this case, the zeroCenteredTest2Mean is -1.66545893749512e+283 which is definitely not or even close to 0.

MathWorks Support Team
on 9 Aug 2019

This behavior is because of floating point rounding errors which is inherent to the nature of floating point number representation and is not a bug. MATLAB uses IEEE 754 standard to represent floating point numbers and since limited number of bits are available to represent a number, larger the value gets, more will be the rounding errors in representing it.

MATLAB has a function called 'eps' which takes in a number and gives the difference between the given number and the next re-presentable number. To get more insight on how 'eps' is related to rounding errors, please refer to this link - https://www.mathworks.com/help/matlab/matlab_prog/floating-point-numbers.html#f2-98690

In the example you provided, the 'eps' value of 2^1000 is 2.3792e+285 which indicates that the rounding errors will be of similar magnitude as well and hence you are getting unexpected results using mean. (Additional explanation can be found at the end of post)

In case you want to work with accurate representation of numbers, you can use the Symbolic Math Toolbox. For more information, refer to https://www.mathworks.com/help/symbolic/create-symbolic-numbers-variables-and-expressions.html#buyfu27

The symbolic number is represented in exact rational form, while the floating-point number is a decimal approximation.

In your example, you can convert 'test2' to symbolic representation to get expected results.

>> test2 = [];

>> for i =1:1000

test2(i) = 2^i;

end

>> test2Symbolic = sym(test2);

>> test2MeanSymbolic = mean(test2Symbolic);

>> zeroCenteredTest2 = test2Symbolic-test2MeanSymbolic;

>> zeroCenteredTest2Mean = mean(zeroCenteredTest2);

>> zeroCenteredTest2Mean

zeroCenteredTest2Mean =

0

For additional explanation in this example, if we break down steps to calculate the mean by first calculating the sum and then dividing the value by 1000, you will see that the first step is calculated correctly i.e 'sum(test2)' is correct value since there is no fractional part and hence no rounding error. Next, when you divide this value by 1000, the result will include fractional part as well and since its true value falls between some large number and 'eps' of that number, it will get rounded to nearest representation introducing rounding error of similar magnitude which explains the unexpected results.

Opportunities for recent engineering grads.

Apply Today
## 0 Comments

Sign in to comment.