These two sections of codes for getting the mean should give the same answer, but why don't they?

Question

Esob Elbat on 15 Jun 2021


Edited: Stephen23 on 16 Jun 2021

Apologies for the trivial and quite possibly boring question; I'm very new to MATLAB. This comes from a task in a MATLAB Fundamentals section.

Here, the 315x4 matrix usage represents electricity use across four sectors, with each column representing one sector. The task asks the student to find the mean of each column, ignoring all NaN elements.

The given solution to the task is very straightforward:

emean = mean(usage,"omitnan")

The answer is a row vector of four elements which represent the means of the four columns of usage, as expected.
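For illustration, here is a minimal sketch of how "omitnan" behaves column by column. The matrix M is made up for this example and is not the course data; the variable names are likewise only illustrative.

```matlab
% Made-up 3x2 matrix; each column has one missing value.
M = [1   10;
     NaN 20;
     3   NaN];

plainMean = mean(M)              % NaN in both columns, since NaN propagates
cleanMean = mean(M, "omitnan")   % [2 15]: NaN entries are skipped per column
```

Each column is averaged over its non-NaN entries only, so the result is a row vector with one element per column.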

Having forgotten that "omitnan" exists, I used a much lengthier and relatively inefficient section of code:

RES = usage(:,1)
COM = usage(:,2)
IND = usage(:,3)
TOT = usage(:,4)
A = mean(RES(~ismissing(RES)))
B = mean(COM(~ismissing(COM)))
C = mean(IND(~ismissing(IND)))
D = mean(TOT(~ismissing(TOT)))
emean = [A B C D]

As far as I can tell, these two sections of code should give the same output for emean, but for some reason they don't. The first two elements differ by a very small amount.

What am I missing?

(For reference, I encountered this problem in Task 1 on the seventh page of Section 12.3 of the MATLAB Fundamentals course on mathworks.com.)

5 Comments

Gatech AE on 16 Jun 2021

Edited: Gatech AE on 16 Jun 2021

@the cyclist, I looked at the data and the difference is about 1E-8 relative to a magnitude of 3E6, so it's in the 15th significant digit of a double. Seems like round-off error at the limit of double precision (roughly 16 significant decimal digits) to me.
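The estimate in this comment can be checked with eps, which returns the spacing between adjacent doubles at a given value. The sketch below uses only the approximate figures quoted in the comment (a difference of about 1E-8 at a magnitude of about 3E6), not the actual task data, and the variable names are made up.

```matlab
% Approximate figures quoted in the comment (not the actual task data).
magnitude  = 3e6;
difference = 1e-8;

spacing  = eps(magnitude);          % gap between adjacent doubles near 3e6 (2^-31)
relDiff  = difference / magnitude;  % relative size of the discrepancy
ulpCount = difference / spacing;    % discrepancy measured in units of that gap

fprintf('spacing near 3e6 : %.3g\n', spacing);
fprintf('relative diff    : %.3g\n', relDiff);
fprintf('diff in spacings : %.1f\n', ulpCount);
```

With these figures the discrepancy is a few tens of spacings, a relative difference on the order of 1e-15. That is the scale one would expect from round-off accumulated over a few hundred additions.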

Stephen23 on 16 Jun 2021

Edited: Stephen23 on 16 Jun 2021

"should give the same answer..."

No, they should not: in general, there is no reason why two different algorithms operating on binary floating point numbers should provide exactly the same output values.
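A small sketch of that point: the same three numbers added in two different orders. This is only an illustration of round-off depending on the order of operations; it says nothing about how mean is implemented internally.

```matlab
% Same three numbers, added in two different orders.
a = (0.1 + 0.2) + 0.3;
b = 0.1 + (0.2 + 0.3);

fprintf('a = %.17g\n', a);
fprintf('b = %.17g\n', b);

exactlyEqual = (a == b)                         % false in IEEE 754 double arithmetic
withinSpacing = abs(a - b) <= eps(max(a, b))    % true: they differ by one spacing
```

Mathematically both expressions are 0.6, but each intermediate sum is rounded to the nearest double, and the rounding falls differently in the two orders.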

"...but why don't they?"

Different operations lead to different accumulated error. This is expected.
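In practice, results like these are compared with a tolerance rather than with ==. The sketch below is a minimal example under stated assumptions: fakeUsage is random stand-in data (the course's usage matrix is not reproduced here), and the tolerance tol is an arbitrary choice for this illustration.

```matlab
rng(0);                                 % fixed seed so the run is repeatable
fakeUsage = 3e6 * rand(315, 4);         % stand-in for the 315x4 usage matrix
fakeUsage(randperm(numel(fakeUsage), 20)) = NaN;   % scatter some missing values

% Approach 1: let mean skip the NaN values column by column.
m1 = mean(fakeUsage, "omitnan");

% Approach 2: drop the missing entries from each column first, then average.
m2 = zeros(1, 4);
for k = 1:4
    col = fakeUsage(:, k);
    m2(k) = mean(col(~ismissing(col)));
end

relDiff = abs(m1 - m2) ./ abs(m1);      % relative discrepancy per column
tol = 1e-12;                            % assumed tolerance, well above round-off here
disp(relDiff)
disp(all(relDiff <= tol))
```

Whether relDiff comes out exactly zero can depend on the data and the MATLAB release. The tolerance check gives the same verdict either way.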

Learn about binary floating point numbers and how they behave:

https://www.mathworks.com/matlabcentral/answers/57444-faq-why-is-0-3-0-2-0-1-not-equal-to-zero

https://www.mathworks.com/matlabcentral/answers/316889-why-two-equal-numbers-are-not-equal

http://www.mathworks.com/matlabcentral/answers/102419-how-do-i-determine-if-the-error-in-my-answer-is-the-result-of-round-off-error-or-a-bug

http://www.mathworks.com/matlabcentral/answers/140656-about-numerical-difficulty-of-equality-comparison-operation-between-two-double-real-numbers-in-matlab

This is worth reading as well:

https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html

http://faculty.tarleton.edu/agapie/documents/cs_343_arch/papers/1991_Goldberg_FloatingPoint.pdf
