Division algorithm

The simplest division algorithm, historically incorporated into a greatest common divisor algorithm presented in Euclid's Elements, Book VII, Proposition 1, finds the remainder given two positive integers using only subtractions and comparisons. The proof that the quotient and remainder exist and are unique (described at Euclidean division) gives rise to a complete division algorithm, applicable to both negative and positive numbers, using additions, subtractions, and comparisons; this procedure always produces a remainder R ≥ 0.
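One possible C sketch of both procedures follows; the helper names unsigned_divide and signed_divide are ours, not from the source, and the INT64_MIN edge case is ignored for brevity:

    #include <stdint.h>

    /* Subtract-and-compare loop from Euclid's procedure; assumes d != 0. */
    static void unsigned_divide(uint64_t n, uint64_t d, uint64_t *q, uint64_t *r) {
        *q = 0;
        *r = n;
        while (*r >= d) {          /* only subtraction and comparison */
            *r -= d;
            (*q)++;
        }
    }

    /* Signed wrapper: the fix-up keeps the remainder in 0 <= R < |D|,
       matching the uniqueness statement of Euclidean division. */
    static void signed_divide(int64_t n, int64_t d, int64_t *q, int64_t *r) {
        uint64_t uq, ur;
        uint64_t un = n < 0 ? (uint64_t)-n : (uint64_t)n;
        uint64_t ud = d < 0 ? (uint64_t)-d : (uint64_t)d;
        unsigned_divide(un, ud, &uq, &ur);
        if (n < 0 && ur != 0) {    /* shift into the unique R >= 0 form */
            uq += 1;
            ur = ud - ur;
        }
        *q = ((n < 0) != (d < 0)) ? -(int64_t)uq : (int64_t)uq;
        *r = (int64_t)ur;
    }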

When used with a binary radix, this method forms the basis for the (unsigned) integer division with remainder algorithm below.

By allowing one to subtract more multiples than the partial remainder allows at a given stage, more flexible variants of long division can be developed as well.

The following algorithm, the binary version of the famous long division, will divide N by D, placing the quotient in Q and the remainder in R. In the sketch below, all values are treated as unsigned integers.
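A C rendering of this shift-and-subtract loop for 32-bit operands might look as follows; udiv32 is our own name, and the 64-bit partial remainder simply avoids overflow when D is close to 2^32 (assumes d != 0):

    #include <stdint.h>

    static void udiv32(uint32_t n, uint32_t d, uint32_t *q, uint32_t *r) {
        uint32_t quot = 0;
        uint64_t rem  = 0;
        for (int i = 31; i >= 0; i--) {
            rem = (rem << 1) | ((n >> i) & 1);  /* bring down bit i of N */
            if (rem >= d) {                     /* R >= D: subtract, set Q(i) */
                rem  -= d;
                quot |= 1u << i;
            }
        }
        *q = quot;
        *r = (uint32_t)rem;
    }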

Slow division methods are all based on a standard recurrence equation[2]

R_(j+1) = B × R_j − q_(n−(j+1)) × D,

where R_j is the j-th partial remainder of the division, B is the radix, q_(n−(j+1)) is the quotient digit in position n−(j+1) (with digit positions numbered from least significant 0 to most significant n−1), n is the number of digits in the quotient, and D is the divisor. Restoring division operates on fixed-point fractional numbers and depends on the assumption 0 < D < N.[citation needed] The quotient digits q are formed from the digit set {0,1}.
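A minimal C sketch of restoring division over 32-bit unsigned integers (radix B = 2 in the recurrence above); the signed wide register makes the restore explicit: subtract first, and undo the subtraction if the result went negative. The name restoring_div32 is ours; assumes d != 0:

    #include <stdint.h>

    static uint32_t restoring_div32(uint32_t n, uint32_t d, uint32_t *rem) {
        int64_t  r = 0;
        uint32_t q = 0;
        for (int i = 31; i >= 0; i--) {
            r = (r << 1) | ((n >> i) & 1);
            r -= d;                 /* trial subtraction */
            if (r < 0) {
                r += d;             /* restoring step: quotient digit is 0 */
            } else {
                q |= 1u << i;       /* keep the subtraction: digit is 1 */
            }
        }
        *rem = (uint32_t)r;
        return q;
    }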

Non-restoring division uses the digit set {−1, 1} for the quotient digits instead of {0, 1}. The algorithm is more complex, but has the advantage when implemented in hardware that there is only one decision and addition/subtraction per quotient bit; there is no restoring step after the subtraction,[3] which potentially cuts down the number of operations by up to half and lets it be executed faster.

Finally, quotients computed by this algorithm are always odd, and the remainder in R is in the range −D ≤ R < D. For example, 5 / 2 = 3 R −1.

To convert to a positive remainder, do a single restoring step after Q is converted from non-standard form to standard form: if R < 0, subtract 1 from Q and add D back into R. The actual remainder is R >> n. (As with restoring division, the low-order bits of R are used up at the same rate as bits of the quotient Q are produced, and it is common to use a single shift register for both.)
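A C sketch of binary non-restoring division for 32-bit operands, including the conversion from the {−1, +1} digit set to standard form and the final restoring step. The name nonrestoring_div32 is ours; it assumes d != 0 and d < 2^30 so the 64-bit intermediates cannot overflow:

    #include <stdint.h>

    static void nonrestoring_div32(uint32_t n, uint32_t d, uint32_t *q, uint32_t *r) {
        int64_t rem = n;                 /* partial remainder (signed) */
        int64_t dd  = (int64_t)d << 32;  /* divisor aligned with remainder */
        uint32_t p  = 0;                 /* digit +1 -> bit 1, digit -1 -> bit 0 */

        for (int i = 31; i >= 0; i--) {
            if (rem >= 0) { p |= 1u << i; rem = 2 * rem - dd; }
            else          {               rem = 2 * rem + dd; }
        }
        uint32_t quot = p - ~p;          /* standard form: Q = P - bnot(P); odd here */
        if (rem < 0) {                   /* the single restoring step */
            quot -= 1;
            rem  += dd;
        }
        *q = quot;
        *r = (uint32_t)(rem >> 32);      /* the actual remainder is R >> n */
    }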

SRT division is a popular method for division in many microprocessor implementations.[5][6] The algorithm is named after D. W. Sweeney of IBM, James E. Robertson of University of Illinois, and K. D. Tocher of Imperial College London.

They all developed the algorithm independently at approximately the same time (published in February 1957, September 1958, and January 1958 respectively).

SRT division is similar to non-restoring division, but it uses a lookup table based on the dividend and the divisor to determine each quotient digit. The most significant difference is that a redundant representation is used for the quotient: for example, when implementing radix-4 SRT division, each quotient digit is chosen from five possibilities: { −2, −1, 0, +1, +2 }.

Because the representation is redundant, the choice of a quotient digit need not be perfect; later quotient digits can correct for slight errors. This tolerance allows quotient digits to be selected using only a few most-significant bits of the dividend and divisor, rather than requiring a full-width subtraction.
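To illustrate the idea with less machinery than the radix-4 case in the text, here is a radix-2 SRT sketch in C with digit set {−1, 0, +1}, operating on Q60 fixed-point fractions with 0 ≤ n < d and 1/2 ≤ d < 1. The exact comparisons against ±1/2 stand in for the hardware's few-bit lookup; srt2 is our own name. It returns q with n × 2^bits = q × d + r and |r| ≤ d, so q may need a final ±1 correction if an exact floor is wanted:

    #include <stdint.h>

    int64_t srt2(int64_t n, int64_t d, int bits) {
        const int64_t HALF = (int64_t)1 << 59;   /* 0.5 in Q60 */
        int64_t r = n, q = 0;
        for (int i = 0; i < bits; i++) {
            r <<= 1;                 /* form 2R, as in the recurrence */
            q <<= 1;
            /* digit selection needs only the top bits of r */
            if      (r >=  HALF) { q += 1; r -= d; }
            else if (r <= -HALF) { q -= 1; r += d; }
            /* otherwise digit 0: no subtraction at all this step */
        }
        return q;
    }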

The Intel Pentium processor's infamous floating-point division bug was caused by an incorrectly coded lookup table.

The steps of Newton–Raphson division are:

1. Calculate an estimate X_0 for the reciprocal 1/D of the divisor D.
2. Compute successively more accurate estimates X_1, X_2, ..., X_S of the reciprocal.
3. Compute the quotient by multiplying the dividend by the reciprocal of the divisor: Q = N × X_S.

In order to apply Newton's method to find the reciprocal of D, it is necessary to find a function f(X) that has a zero at X = 1/D. The obvious such function is f(X) = DX − 1, but the Newton–Raphson iteration for this is unhelpful, since it cannot be computed without already knowing the reciprocal of D (moreover it attempts to compute the exact reciprocal in one step, rather than allowing for iterative improvements). A function that does work is f(X) = 1/X − D, for which the Newton–Raphson iteration gives

X_(i+1) = X_i − f(X_i)/f′(X_i) = X_i + X_i(1 − D × X_i) = X_i(2 − D × X_i),

which can be calculated from X_i using only multiplication and subtraction, or using two fused multiply–adds.

To obtain a result with a precision of 2n bits while making use of the second expression, X_i(2 − D × X_i), one must compute the product between X_i and (2 − D × X_i) with double the given precision of X_i (n bits). In contrast, the product between X_i and (1 − D × X_i) in the first expression need only be computed with a precision of n bits, because the leading n bits of (1 − D × X_i) are zeros. If the error is defined as ε_i = 1 − D × X_i, then:

ε_(i+1) = 1 − D × X_(i+1) = 1 − D × X_i(2 − D × X_i) = (1 − D × X_i)² = ε_i².

This squaring of the error at each iteration step – the so-called quadratic convergence of Newton–Raphson's method – has the effect that the number of correct digits in the result roughly doubles for every iteration, a property that becomes extremely valuable when the numbers involved have many digits (e.g. in the large integer domain).
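A toy C program (ours, not from the source) makes the error squaring visible for D = 0.75 and a deliberately rough start X_0 = 1:

    #include <stdio.h>

    int main(void) {
        double D = 0.75, X = 1.0;
        for (int i = 0; i < 5; i++) {
            printf("X_%d = %.17g   error = %.3g\n", i, X, 1.0 - D * X);
            X = X * (2.0 - D * X);   /* X_{i+1} = X_i (2 - D X_i) */
        }
        return 0;
    }

Each printed error is the square of the previous one: 0.25, 0.0625, about 0.0039, and so on, as X converges to 4/3.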

Once D has been scaled into a bounded range such as [1/2, 1], a simple polynomial approximation can be used to find an initial estimate.

For a linear initial estimate of the form X_0 = T_1 + T_2 × D on the scaled interval [1/2, 1], the minimum of the maximum absolute value of the error is determined by the Chebyshev equioscillation theorem applied to F(D) = 1 − D(T_1 + T_2 × D); the optimal coefficients yield X_0 = 48/17 − 32/17 × D.

It is possible to generate a polynomial fit of degree larger than 2, computing the coefficients using the Remez algorithm.

The trade-off is that the initial guess requires more computational cycles in exchange for fewer iterations of Newton–Raphson.

The following computes the quotient of N and D with a precision of P binary places (see the sketch below). For example, for a double-precision floating-point division, this method uses 10 multiplies, 9 adds, and 2 shifts.
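A C sketch of this routine for doubles rather than a general precision P; nr_divide is our own name, and D is assumed positive and finite. Four iterations suffice here, since the estimate's roughly log2(17) ≈ 4.1 good bits double each time, exceeding the 53 bits of a double:

    #include <math.h>

    double nr_divide(double N, double D) {
        int e;
        double Dp = frexp(D, &e);      /* D' in [0.5, 1), with D = D' * 2^e */
        double Np = ldexp(N, -e);      /* scale N by the same power of two */
        double X  = 48.0 / 17.0 - 32.0 / 17.0 * Dp;   /* linear estimate */
        for (int i = 0; i < 4; i++)
            X = X + X * (1.0 - Dp * X);   /* Newton-Raphson step */
        return Np * X;                    /* Q = N' * (1/D') */
    }

Since N/D = N′/D′ after the common scaling, the returned product N′ × X approximates the quotient.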

A quadratic initial estimate plus two cubic iterations provides ample precision for an IEEE double-precision result.
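The cubic iteration mentioned here can be written, in doubles, as a step that cubes the error: with Y = 1 − D × X, one step returns X(1 + Y + Y²), so the new error is Y³ (our derivation; cubic_step is a hypothetical helper name):

    double cubic_step(double X, double D) {
        double Y = 1.0 - D * X;              /* current error */
        return X * (1.0 + Y * (1.0 + Y));    /* error cubes each step */
    }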

Goldschmidt division (after Robert Elliott Goldschmidt) uses an iterative process of repeatedly multiplying both the dividend and divisor by a common factor F_i, chosen such that the divisor converges to 1. This causes the dividend to converge to the sought quotient Q:

Q = N/D × F_1/F_1 × F_2/F_2 × ...

The steps for Goldschmidt division are:

1. Generate an estimate for the multiplication factor F_i.
2. Multiply the dividend and divisor by F_i.
3. If the divisor is sufficiently close to 1, return the dividend; otherwise, loop to step 1.

Assuming N/D has been scaled so that 0 < D < 1, each F_i is based on D:

F_(i+1) = 2 − D_i.

Multiplying the dividend and divisor by the factor yields:

N_(i+1)/D_(i+1) = (N_i × F_(i+1))/(D_i × F_(i+1)).

After a sufficient number k of iterations, Q = N_k.
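A Goldschmidt sketch in doubles, assuming N/D has already been scaled so that 0 < D < 1; goldschmidt is our own name. The two products in the loop body are independent of each other, which is the parallelism noted below:

    double goldschmidt(double N, double D, int k) {
        for (int i = 0; i < k; i++) {
            double F = 2.0 - D;   /* F_{i+1} = 2 - D_i */
            N *= F;               /* dividend and divisor can be */
            D *= F;               /* multiplied in parallel      */
        }
        return N;                 /* D -> 1, so N -> Q */
    }

Since D_(i+1) = D_i(2 − D_i) = 1 − (1 − D_i)², the divisor converges to 1 quadratically, just as in Newton–Raphson.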

It is also known as the Anderson Earle Goldschmidt Powers (AEGP) algorithm and is implemented by various IBM processors.[16][17]

Although it converges at the same rate as a Newton–Raphson implementation, one advantage of the Goldschmidt method is that the multiplications in the numerator and in the denominator can be done in parallel.[18][19]

Methods designed for hardware implementation generally do not scale to integers with thousands or millions of decimal digits; these frequently occur, for example, in modular reductions in cryptography.

As a concrete fixed-point arithmetic example, for 32-bit unsigned integers, division by 3 can be replaced with a multiplication by 2863311531/2^33: that is, a multiplication by 2863311531 (hexadecimal 0xAAAAAAAB) followed by a right shift of 33 bits.
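In C, the identity reads as follows, and the loop verifies it exhaustively over all 32-bit inputs (div3 is our own name):

    #include <assert.h>
    #include <stdint.h>

    /* floor(n / 3) == (n * 0xAAAAAAAB) >> 33 for every 32-bit unsigned n */
    static inline uint32_t div3(uint32_t n) {
        return (uint32_t)(((uint64_t)n * 0xAAAAAAABu) >> 33);
    }

    int main(void) {
        for (uint64_t n = 0; n <= UINT32_MAX; n++)
            assert(div3((uint32_t)n) == (uint32_t)n / 3);
        return 0;
    }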

Round-off errors can be introduced by division operations due to limited precision. Such errors are particularly pronounced in iterative processes and when subtracting nearly equal values, a problem known as loss of significance.

To mitigate these errors, techniques such as the use of guard digits or higher precision arithmetic are employed.