XSum - SUM with error compensation

The accuracy of the sum of floating point numbers is limited by the truncation error. E.g. SUM([1e16, 1, -1e16]) replies 0 instead of 1 and the error of SUM(RANDN(N, 1)) is about EPS*(N / 10).

Kahan, Knuth, Dekker, Ogita and Rump (and others) have derived some methods to reduce the influence of rounding errors, which are implemented here as fast C-Mex: XSum(RANDN(N, 1), 'Knuth') is exact to all 15 digits.

Y = XSum(X, N, Method)

INPUT:

X: Double array of any size.

N: Dimension to operate on.

Method: String: 'Double', 'Long', 'Kahan', 'Knuth', 'KnuthLong', 'Knuth2'.

OUTPUT:

Y: Double array, equivalent to SUM, but with compensated error depending

on the Method. The high-precision result is rounded to double precision.

METHODS: (speed and accuracy compared to SUM)

- Double: A single threaded implementation of Matlab's SUM. At least in Matlab 2008a to 2009b the results of the multi-threaded SUM can differ slightly from call to call. Equivalent accuracy. 1.1 to 2 times slower than SUM.

- Long: Accumulated in a 80 bit long double, if the compiler support this (e.g. LCC v3.8). 3.5 more valid digits, 2 times slower.

- Kahan: The local error is subtracted from the next element. 1 to 3 more valid digits, 2 to 9 times slower.

- Knuth: As if the sum is accumulated in a 128 bit float: about 15 more valid digits. 1.4 to 4 times slower. This is suitable for the most real world problems.

- Knuth2: 30 more valid digits as if it is accumulated in a 196 bit float. 2 to 8 times slower.

- KnuthLong: As Knuth, but using long doubles to get about 21 more valid digits, if supported by the compiler. 2.5 times slower.

COMPILATION: mex -O XSum.c

Tested: Matlab 6.5, 7.7, 7.8, WinXP, BCC5.5, LCC2.4/3.8, Open Watcom 1.8, MSVC++ 2008

TestXSum checks the validity, speed and accuracy after compiling (see screen shot).

Pre-compiled MEX files: http://www.n-simon.de/mex

References: Takeshi Ogita and Siegfried M. Rump and Shin'ichi Oishi: "Accurate Sum and Dot Product with Applications"

See also: INTLAB, S. M. Rump: http://www.ti3.tu-harburg.de/rump/intlab

Jan (2021). XSum (https://www.mathworks.com/matlabcentral/fileexchange/26800-xsum), MATLAB Central File Exchange. Retrieved .

Created with
R2011a

Compatible with any release

**Inspired by:**
A forward stable linear solver, Sum benchmark

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!Create scripts with code, output, and formatted text in a single executable document.

Andre SouzaThis file was my first contact with C and MATLAB C API. I have learned A LOT by studying it. Thank you for making it available.

Harish VJohannes UhlProblem solved.

Solution: Download pre-compiled MEX-file from website namend above

Johannes UhlI also get the error:

Error using InstallMex>error_L (line 249)

*** InstallMex: Unknown string or missing file: uTest_XSum

Error in InstallMex (line 93)

error_L('MissFile', 'Unknown string or missing file: %s', Arg);

Error in XSum (line 103)

Ok = InstallMex('XSum.c', 'uTest_XSum');

What am I doing wrong?

Thanks for any help!

VinceJan@Brian: Do you have any problems with this code?

BrianMagloire LOUDEGUI DJIMDOUProblem solved.

Magloire LOUDEGUI DJIMDOUWhile trying on Unix 64

>> XSum(1)

I get the errors:

Error using InstallMex>error_L (line 249)

*** InstallMex: Unknown string or missing file: uTest_XSum

Error in InstallMex (line 93)

error_L('MissFile', 'Unknown string or missing file: %s', Arg);

Error in XSum (line 103)

Ok = InstallMex('XSum.c', 'uTest_XSum');

bmv's solution (on 16 Jun 2014) is helpless.

Any help appreciated.

bmv@Jan Simon:

Release at first the simplest version of XCumSum for single and double data!

Jan@bmv: I've struggled a lot with a string comparison command, which does not depend on the platform.

This tool was developped for experiments. Obviously Knuth and Knuth2 are worth to be applied in productive code, so I think about a multi-threaded version for a fair speed.

I started to write an XCumSum also, but stuck on the decision, if it should offer saturated arithmetics for integer types also.

@Moshe: You find my email address in the help section of the code. The uppercase characters defend spam bots and must be replaced obviously.

bmvIs there fast Xcumsum version?

bmvSolved by:

// Case-insensitive string comparison depends on the compiler: -----------------

// strncmpi, strnicmp, _strnicmp, ...

#define STRCMP_NI _strnicmp

to

#define STRCMP_NI strncasecmp

bmvFor Linux: x86_64

XSum

Error using InstallMex>error_L (line 249)

*** InstallMex: Unknown string or missing file: uTest_XSum

Error in InstallMex (line 93)

error_L('MissFile', 'Unknown string or missing file: %s', Arg);

Error in XSum (line 103)

Ok = InstallMex('XSum.c', 'uTest_XSum');

For:

Ok = InstallMex('XSum.c');

*** Compilation failed:

Error using mex

/tmp/mex_2309150000725708_22138/XSum.o: In function `mexFunction':

XSum.c:(.text+0xd80): undefined reference to `_strnicmp'

XSum.c:(.text+0xda2): undefined reference to `_strnicmp'

XSum.c:(.text+0xdc4): undefined reference to `_strnicmp'

XSum.c:(.text+0xde6): undefined reference to `_strnicmp'

XSum.c:(.text+0xe08): undefined reference to `_strnicmp'

/tmp/mex_2309150000725708_22138/XSum.o:XSum.c:(.text+0xe2a): more undefined references to `_strnicmp' follow

bmvMosheMosheI should add that without all these string comparisons, which evidently works slightly differently in the mac c compiler, this runs beautifully (if I just decide to stick with the default Knuth method and comment out the part determining the method).

MosheThanks for the response, and for putting together a useful utility in the first place. I don't have your email address so let me try one more comment here. When I compile the new version, I get a single error message this time (I have the standard C compiler that comes with X-code, I can also install the Intel compiler if needed).

warning: implicit declaration of function '_strnicmp' is invalid in C99 [-Wimplicit-function-declaration]

if (STRCMP_NI(Method_In, "Knuth", Method_LEN) == 0)

Thanks!

Moshe

Jan@Moshe: The problems you observe depend on the compiler of the Mac. Please try the new version when it appears in the next days. In case of further problems, send me an email.

MosheThis looks potentially extremely useful, but does not compile for me (Matlab 2014a on a mac). The errors are "implicit declaration of function 'strncmpi' is invalid", "implicit declaration of function '_control87' is invalid", and the use of use of undeclared identifiers, such as 'MCW_PC', 'PC_24' etc. Any help would be appreciated, though it is understandable if this works only for older versions of Matlab. Apologies also if these are simple issues, my C is rusty (non-existent, really).

Umberto PicchiniThis is just awesome. I cannot comment in detail as I am not an expert of the underlying methodology. However for me it just worked fine in returning finite results for sums of huge numbers. Thanks very much!

Jan@Ahmed: Perhaps strncmpi will work instead of strnicmp. As far as I understood from a diskussion about another C-mex file, Linux drives the processor with 80-bit precision ever. In addition some Windows compilers do not benefit from the higher precision, e.g. MSVC. So I suggest to disable the SetPrecision() subfunction completely under Linux.

I'm going to inlcude this is a new submission soon.

Ahmed FasihHints on compiling in Linux would be appreciated, from Jan or others. After redefining strnicmp with strcmp, gcc 4.4.1 fails because of undeclared symbols, PC_24, PC_64, PC_53, _control87, and MCW_PC. Thanks.

JanDear Bruno! Thanks for the comment!

You've observed, that your 32 bit system has a higher accuracy than your 64 bit system for "Double" and "Long". (For the MSVC++ 2008 compiler long doubles are doubles, so both methods reply the same result.)

It seems, that your 32 bit system accumulates the sum (e.g. in DoubleDim1()) in a 80 bit register, while your 64 bit system applies standard 53 bit rounding. Please check your mexopts.bat files for the compiler flags /arch:SSE2 and /fp:[precise | fast | strict]. Perhaps a "#pragma fenv_access (off)" in a header?!

You hit two important points:

1. The results of the trivial "for (i = 0; i < N; Sum += X[i++])" depends critically on the compiler directives. I've tested XSum with LCC2.4, LCC3.8, BCC5.5, OpenWatcom 1.8 and MSVC 2008 SP1 and found 5 different behaviours for "Double", "Long" and "KnuthLong". The accuracy of "Knuth" and "Knuth2" was not significantly affected by the compilers in my tests.

2. If the compiler has such an influence, it is hard to assess the quality of the results automatically. I did not found a way to let the test function draw user-friendly conclusions about the accuracy.

Summation is numerically instable and even the trivial algorithms of XSum(Double) and XSum(Long) are fragile if different compilers and floating point environments are applied (I have to stress this more explicite in the help. Thanks!). Therefore I use XSum(Knuth) and XSum(Knuth2) to improve the stability.

Bruno LuongHello Jan,

I'm not sure if I interpret the TestXSum correctly, but many methods ran under 2010A 64-bit Matlab seems to get worse accuracy than 2009B 32-bit Matlab (see below). Any idea why?

------------------------------------------------------------------------

32-bit 2009B, MSVC 2008 SP1

RANDN: Absolute results, sum estimated by Knuth2, average over 20 runs:

Smaller results are better!

0 must be replied for Knuth2, but even Knuth should be exact.

X = RANDN(1, 1E4):

SUM : 2.0996538E-013

SUM(SORT): 1.1599255E-011

Double : 0

Long : 0

Kahan : 2.3092639E-015

Knuth : 0

Knuth2 : 0

X = RANDN(1, 1E5):

SUM : 2.5210056E-012

SUM(SORT): 4.2349626E-010

Double : 0

Long : 0

Kahan : 1.2478907E-014

Knuth : 0

Knuth2 : 0

X = RANDN(1, 1E6):

SUM : 1.551328E-011

SUM(SORT): 6.7532937E-009

Double : 1.0658141E-015

Long : 1.0658141E-015

Kahan : 1.3500312E-014

Knuth : 0

Knuth2 : 0

------------------------------------------------------------------------

64-bit 2010A, MSVC 2008 SP1

RANDN: Absolute results, sum estimated by Knuth2, average over 20 runs:

Smaller results are better!

0 must be replied for Knuth2, but even Knuth should be exact.

X = RANDN(1, 1E4):

SUM : 2.0996538E-013

SUM(SORT): 1.1599255E-011

Double : 2.0996538E-013

Long : 2.0996538E-013

Kahan : 2.3092639E-015

Knuth : 0

Knuth2 : 0

X = RANDN(1, 1E5):

SUM : 2.5210056E-012

SUM(SORT): 4.2349626E-010

Double : 3.2762681E-012

Long : 3.2762681E-012

Kahan : 1.2478907E-014

Knuth : 0

Knuth2 : 0

X = RANDN(1, 1E6):

SUM : 1.551328E-011

SUM(SORT): 6.7532937E-009

Double : 2.3623414E-011

Long : 2.3623414E-011

Kahan : 1.3500312E-014

Knuth : 0

Knuth2 : 0