From: <HIDDEN>
Newsgroups: comp.soft-sys.matlab
Subject: Re: sum array display a wrong result
Date: Tue, 26 Jul 2011 15:38:09 +0000 (UTC)
Organization: The MathWorks, Inc.
Lines: 23
Message-ID: <j0mn11$98t$>
References: <j08b27$i4u$> <j08m2p$ec1$> <j0ibtn$k6g$> <j0ik9u$8as$> <j0ios5$hu2$>
Reply-To: <HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Trace: 1311694689 9501 (26 Jul 2011 15:38:09 GMT)
NNTP-Posting-Date: Tue, 26 Jul 2011 15:38:09 +0000 (UTC)
X-Newsreader: MATLAB Central Newsreader 1187260
Xref: comp.soft-sys.matlab:737651

"John Wong" wrote in message <j0ios5$hu2$>...
> >>but round-off errors in the regression computation have given you a sum equal to 2.2737e-013 rather than an ideal zero.
> I don't get this part. How can I avoid this issue when MATLAB has already done the internal round-off? 
- - - - - - - - - - -
  I'm not sure what it is that puzzles you, John.

  The formula for linear regression is designed so that the differences between your data points and the line of regression add up to zero.  In the ideal world of precise mathematics this sum would indeed be exactly zero.  However, a computer with only a finite number of bits must necessarily suffer round-off errors while performing the arithmetic that produces this line of regression, so the sum will no longer be precisely zero.  In your case the errors accumulated over all the steps leading up to that final summation apparently came to 2.2737e-013.  I would say that this is rather good considering the number of additions, multiplications, and divisions that were necessary to achieve your results.  After all, your computer is using only 53-bit precision in its calculations, which is about one part in 10^16.
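  As a rough illustration of the scale involved, here is a minimal sketch. The data are made up for the purpose (they are not the data from the original question), and polyfit stands in for whatever regression code was actually used. It fits a line, sums the residuals, and prints that sum next to eps, the spacing of double precision numbers near 1.

```matlab
% Hypothetical data, chosen only for illustration (not the poster's data)
x = (1:100)';
y = 1000 + 0.3*x + sin(x);

% Least-squares line; polyfit is used here as a stand-in for the
% hand-written regression formulas
p = polyfit(x, y, 1);
r = y - polyval(p, x);      % residuals: data minus fitted line

residual_sum = sum(r);      % exactly zero in exact arithmetic
rel_size = abs(residual_sum) / sum(abs(y));

fprintf('sum of residuals      = %g\n', residual_sum);
fprintf('relative to sum(|y|)  = %g\n', rel_size);
fprintf('eps (2^-52)           = %g\n', eps);
```

The sum of residuals should come out tiny but usually not exactly zero, and its size relative to the data should be within a few orders of magnitude of eps.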

  However, it may be possible to improve matters a little if you use a more robust method of calculation than the one shown in the website you referenced.  In that code you did this, in effect:

 Sxy = sum(x.*y) - sum(x)*sum(y)/n;
 Sxx = sum(x.^2) - sum(x)^2/n;

It is better to do things this mathematically equivalent way:

 Sxy = sum((x-sum(x)/n).*(y-sum(y)/n));
 Sxx = sum((x-sum(x)/n).^2);

That is, subtract mean values before you take the products or squares.  From a numerical analysis point of view the errors generated tend to be smaller this way, particularly if the values of x and y are strongly biased away from zero, as in your case.
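
  To see the difference on a deliberately extreme case, here is a small sketch. The data are invented for illustration (x is offset by 1e8 and y = 3 + 2*x exactly); they are not the data from the original question. For these values the exact answers are Sxx = 82.5 and Sxy = 165, so the true slope is 2.

```matlab
% Hypothetical data, strongly biased away from zero
x = 1e8 + (1:10)';
y = 3 + 2*x;
n = numel(x);

% One-pass formulas: subtract two huge, nearly equal numbers
Sxy_naive = sum(x.*y) - sum(x)*sum(y)/n;
Sxx_naive = sum(x.^2) - sum(x)^2/n;

% Centered formulas: subtract the means first, then multiply
xc = x - sum(x)/n;
yc = y - sum(y)/n;
Sxy_centered = sum(xc.*yc);
Sxx_centered = sum(xc.^2);

fprintf('naive:    Sxx = %g, Sxy = %g\n', Sxx_naive, Sxy_naive);
fprintf('centered: Sxx = %g, Sxy = %g, slope = %g\n', ...
        Sxx_centered, Sxy_centered, Sxy_centered/Sxx_centered);
```

In the one-pass version the squares are near 1e16, which is beyond the range where doubles can represent every integer, so the digits that matter are lost before the subtraction takes place. In the centered version every intermediate value in this particular example happens to be exactly representable.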

  Aside from such techniques, there is nothing you can do about errors like these except to perform the arithmetic with higher precision.  That is possible with MATLAB's Symbolic Math Toolbox, but only at the cost of very much slower computation.
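
  A minimal sketch of that route, assuming the Symbolic Math Toolbox is installed. The data are again made up for illustration. Converting the values with sym lets the one-pass formula be evaluated in exact rational arithmetic, which can then be compared with the same formula in ordinary doubles.

```matlab
% Requires the Symbolic Math Toolbox. Hypothetical data for illustration.
x = 1e8 + (1:10)';
n = numel(x);

% sym converts these integer-valued doubles exactly, so everything
% below is done in exact rational arithmetic
xs = sym(x);
Sxx_exact = sum(xs.^2) - sum(xs)^2/n;

% Same one-pass formula in ordinary double precision
Sxx_double = sum(x.^2) - sum(x)^2/n;

disp(double(Sxx_exact));    % exact value converted back to double
disp(Sxx_double);           % affected by round-off
```

For arrays of any real size the symbolic version will be far slower, so it is better suited to checking a result than to routine use.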

Roger Stafford