Kahan summation algorithm

In numerical analysis, the Kahan summation algorithm, also known as compensated summation,[1] significantly reduces the numerical error in the total obtained by adding a sequence of finite-precision floating-point numbers, compared to the naive approach.

This is done by keeping a separate running compensation (a variable that accumulates the small errors). With such a compensation variable, the worst-case error bound is effectively independent of the number of summands, so a large number of values can be summed with an error that only depends on the floating-point precision of the result.

The algorithm is attributed to William Kahan.[4] Similar, earlier techniques include, for example, Bresenham's line algorithm, which keeps track of the accumulated error in integer operations (although it was first documented around the same time[5]), and delta-sigma modulation.
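To make the discussion of the variables sum and c below concrete, here is a minimal C sketch of the compensated loop (the function name and layout are illustrative, not taken from any particular library):

    #include <stddef.h>

    /* Compensated (Kahan) summation of n doubles. */
    double kahan_sum(const double *x, size_t n)
    {
        double sum = 0.0;   /* running total                                  */
        double c   = 0.0;   /* running compensation for lost low-order digits */
        for (size_t i = 0; i < n; i++) {
            double y = x[i] - c;  /* apply the correction recovered so far    */
            double t = sum + y;   /* low-order digits of y are lost here      */
            c = (t - sum) - y;    /* algebraically zero, but in floating point
                                     it captures (minus) the lost part of y   */
            sum = t;
        }
        return sum;
    }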

Computers typically use binary arithmetic, but to make the example easier to read, it will be given in decimal.

With a plain summation, each incoming value would be aligned with sum, and many low-order digits would be lost (by truncation or rounding).

Thus the summation proceeds with "guard digits" in c, which is better than not having any, but is not as good as performing the calculations with double the precision of the input.
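As an illustration of the effect (a made-up case in binary double precision, not the decimal example above): summing 1.0 followed by 100,000 copies of 1.0e-16, naive summation rounds every tiny addend away, while the compensated loop preserves them in c:

    #include <stdio.h>

    int main(void)
    {
        double naive = 1.0;         /* both totals start with the large term 1.0 */
        double sum = 1.0, c = 0.0;  /* compensated accumulator                   */
        for (int i = 0; i < 100000; i++) {
            double x = 1.0e-16;     /* below half an ulp of 1.0                  */
            naive += x;             /* rounds back to 1.0 every time             */
            double y = x - c;       /* compensated update, as sketched above     */
            double t = sum + y;
            c = (t - sum) - y;
            sum = t;
        }
        /* Exact sum is 1.00000000001; expect roughly:
           naive = 1.000000000000   compensated = 1.000000000010 */
        printf("naive = %.12f   compensated = %.12f\n", naive, sum);
        return 0;
    }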

A careful analysis of the errors in compensated summation is needed to appreciate its accuracy characteristics.

While it is more accurate than naive summation, it can still give large relative errors for ill-conditioned sums.

Essentially, the condition number of the summation problem, defined as the ratio Σ |x_i| / |Σ x_i|, represents its intrinsic sensitivity to errors, regardless of how the sum is computed.[8]

The relative error bound of every (backwards stable) summation method computed by a fixed algorithm in fixed precision (i.e. not those that use arbitrary-precision arithmetic, nor algorithms whose memory and time requirements change based on the data) is proportional to this condition number.

On the other hand, for random inputs with nonzero mean the condition number asymptotes to a finite constant as n → ∞.

Given a condition number, the relative error of compensated summation is effectively independent of n. In principle there is a term of order nε² (where ε is the machine precision) that grows linearly with n, but in practice this term is effectively zero: since the final result is rounded to a precision ε, the nε² term rounds to zero unless n is roughly 1/ε or larger.

In comparison, the relative error bound for naive summation (simply adding the numbers in sequence, rounding at each step) grows as O(εn) multiplied by the condition number.

The O(εn) figure above is a worst-case bound that occurs only if all the rounding errors have the same sign (and are of maximal possible magnitude); in practice the errors usually have random signs and partially cancel.

So, even for asymptotically ill-conditioned sums, the relative error for compensated summation can often be much smaller than a worst-case analysis might suggest.
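For reference, the bounds discussed above can be written out explicitly. In the standard model of floating-point arithmetic with machine precision ε, writing E_n for the error of the computed value of the exact sum S_n = Σ x_i, the naive and compensated bounds are roughly (a restatement, not a quotation of the cited sources):

\[
\frac{\lvert E_n^{\mathrm{naive}} \rvert}{\lvert S_n \rvert} \;\lesssim\; (n-1)\,\varepsilon \cdot \frac{\sum_i \lvert x_i \rvert}{\bigl\lvert \sum_i x_i \bigr\rvert},
\qquad
\frac{\lvert E_n^{\mathrm{comp}} \rvert}{\lvert S_n \rvert} \;\le\; \bigl[\,2\varepsilon + O(n\varepsilon^2)\,\bigr] \cdot \frac{\sum_i \lvert x_i \rvert}{\bigl\lvert \sum_i x_i \bigr\rvert},
\]

where the last factor in each bound is the condition number defined earlier.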

Neumaier[10] introduced an improved version of the Kahan algorithm, which he calls an "improved Kahan–Babuška algorithm", that also covers the case when the next term to be added is larger in absolute value than the running sum, effectively swapping the roles of what is large and what is small.
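A minimal C sketch of this variant (names again illustrative); the correction is accumulated in c and applied only once, at the end:

    #include <stddef.h>
    #include <math.h>   /* fabs */

    /* Neumaier's improved Kahan-Babuska summation. */
    double neumaier_sum(const double *x, size_t n)
    {
        double sum = 0.0;
        double c   = 0.0;                   /* accumulated correction             */
        for (size_t i = 0; i < n; i++) {
            double t = sum + x[i];
            if (fabs(sum) >= fabs(x[i]))
                c += (sum - t) + x[i];      /* low-order digits of x[i] were lost */
            else
                c += (x[i] - t) + sum;      /* low-order digits of sum were lost  */
            sum = t;
        }
        return sum + c;                     /* apply the correction once          */
    }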

For many sequences of numbers, both algorithms agree, but a simple example due to Peters[11] shows how they can differ: summing [1.0, +10^100, 1.0, −10^100] in double precision, Kahan's algorithm yields 0.0, whereas Neumaier's algorithm yields the correct value 2.0.
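With the two sketches above, the example looks like this (the comments state the values reported in the statement above):

    double data[] = { 1.0, 1.0e100, 1.0, -1.0e100 };
    /* kahan_sum(data, 4)    yields 0.0: the two 1.0 terms are lost and never restored. */
    /* neumaier_sum(data, 4) yields 2.0: c accumulates the two lost 1.0 terms.          */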

Another alternative is pairwise summation: one recursively divides the set of numbers into two halves, sums each half, and then adds the two sums, so that the roundoff error grows only as O(ε log n).[2] This has the advantage of requiring the same number of arithmetic operations as naive summation (unlike Kahan's algorithm, which requires four times the arithmetic and has a latency of four times a simple summation) and can be calculated in parallel.

The equivalent of pairwise summation is used in many fast Fourier transform (FFT) algorithms and is responsible for the logarithmic growth of roundoff errors in those FFTs.[9]
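A minimal recursive C sketch of pairwise summation (the base-case block size of 8 is an arbitrary choice):

    #include <stddef.h>

    /* Pairwise (cascade) summation: split in half, sum the halves, add the results. */
    double pairwise_sum(const double *x, size_t n)
    {
        if (n <= 8) {                       /* sum short blocks naively */
            double s = 0.0;
            for (size_t i = 0; i < n; i++)
                s += x[i];
            return s;
        }
        size_t half = n / 2;
        return pairwise_sum(x, half) + pairwise_sum(x + half, n - half);
    }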

Another alternative is to use arbitrary-precision arithmetic, which in principle needs no rounding at all, at the cost of much greater computational effort.

A way of performing correctly rounded sums using arbitrary precision is to extend the precision adaptively, representing the running sum as multiple floating-point components.

This will minimize computational cost in common cases where high precision is not needed.[14][11]
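Such multi-component representations are typically built from an error-free transformation such as the one sketched below (Knuth's two-sum; adaptive schemes like Shewchuk's chain it so that partial sums are represented exactly until a final rounding):

    /* Error-free transformation: on return, s + err equals a + b exactly. */
    void two_sum(double a, double b, double *s, double *err)
    {
        *s = a + b;
        double b_virtual = *s - a;          /* the part of b that was actually added */
        double a_virtual = *s - b_virtual;  /* the part of a that was actually kept  */
        *err = (a - a_virtual) + (b - b_virtual);
    }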

Another method that uses only integer arithmetic, but a large accumulator, was described by Kirchner and Kulisch;[15] a hardware implementation was described by Müller, Rüb and Rülling.[16]

In principle, a sufficiently aggressive optimizing compiler could destroy the effectiveness of Kahan summation: for example, if the compiler simplified expressions according to the associativity rules of real arithmetic, it might "simplify" the second step in the sequence t = sum + y; c = (t - sum) - y; to c = ((sum + y) - sum) - y; and then to c = 0; thus eliminating the error compensation.

A portable way to inhibit such optimizations locally is to break one of the lines in the original formulation into two statements and to make two of the intermediate products volatile.
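A hedged C sketch of that technique (the extra temporary z and the function name are illustrative); marking t and z volatile prevents the compiler from algebraically collapsing the compensation step:

    #include <stddef.h>

    /* Kahan summation with the compensation step split into two statements
       and the intermediates kept volatile, so it survives optimization.  */
    double kahan_sum_guarded(const double *x, size_t n)
    {
        double sum = 0.0, c = 0.0;
        for (size_t i = 0; i < n; i++) {
            double y = x[i] - c;
            volatile double t = sum + y;
            volatile double z = t - sum;   /* forced to remain a real subtraction */
            c = z - y;
            sum = t;
        }
        return sum;
    }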

In general, built-in "sum" functions in computer languages typically provide no guarantees that a particular summation algorithm will be employed, much less Kahan summation.[citation needed] The BLAS standard for linear algebra subroutines explicitly avoids mandating any particular computational order of operations for performance reasons,[24] and BLAS implementations typically do not use Kahan summation.

The standard library of the Python computer language specifies an fsum function for accurate summation.[25]

In the Julia language, the default implementation of the sum function does pairwise summation for high accuracy with good performance,[26] but an external library provides an implementation of Neumaier's variant, named sum_kbn, for cases when higher accuracy is needed.[27]

In the C# language, the HPCsharp NuGet package implements the Neumaier variant and pairwise summation, both as scalar code, as data-parallel code using SIMD processor instructions, and as parallel multi-core code.