Montgomery modular multiplication

The algorithm uses the Montgomery forms of a and b to efficiently compute the Montgomery form of ab mod N. The efficiency comes from avoiding expensive division operations.

Classical modular multiplication reduces the double-width product ab using division by N and keeping only the remainder.

However, when performing many multiplications in a row, as in modular exponentiation, intermediate results can be left in Montgomery form.

Many important cryptosystems such as RSA and Diffie–Hellman key exchange are based on arithmetic operations modulo a large odd number, and for these cryptosystems, computations using Montgomery multiplication with R a power of two are faster than the available alternatives.

The quotient ring Z/NZ consists of residue classes modulo N, that is, its elements are sets of the form where a ranges across the integers.

Each residue class is a set of integers such that the difference of any two integers in the set is divisible by N (and the residue class is maximal with respect to that property; integers aren't left out of the residue class unless they would violate the divisibility condition).

The output of the integer operation determines a residue class, and the output of the modular operation is determined by computing the residue class's representative.

Mathematically, the integer between 0 and N − 1 that is congruent to ab can be expressed by applying the Euclidean division theorem: where q is the quotient

Montgomery form is a different way of expressing the elements of the ring in which modular products can be computed without expensive divisions.

For computational purposes it is also necessary that division and reduction modulo R are inexpensive, and the modulus is not useful for modular multiplication unless R > N. The Montgomery form of the residue class a with respect to R is aR mod N, that is, it is the representative of the residue class aR.

Addition and subtraction in Montgomery form are the same as ordinary modular addition and subtraction because of the distributive law: Note that doing the operation in Montgomery form does not lose information compared to doing it in the quotient ring Z/NZ.

This is a consequence of the fact that, because gcd(R, N) = 1, multiplication by R is an isomorphism on the additive group Z/NZ.

Since 12 is not divisible by 100, additional effort is required to remove the extra factor of R. Removing the extra factor of R can be done by multiplying by an integer R′ such that RR′ ≡ 1 (mod N), that is, by an R′ whose residue class is the modular inverse of R mod N. Then, working modulo N, The integer R′ exists because of the assumption that R and N are coprime.

The extended Euclidean algorithm efficiently determines integers R′ and N′ that satisfy Bézout's identity: 0 < R′ < N, 0 < N′ < R, and: This shows that it is possible to do multiplication in Montgomery form.

While the above algorithm is correct, it is slower than multiplication in the standard representation because of the need to multiply by R′ and divide by N. Montgomery reduction, also known as REDC, is an algorithm that simultaneously computes the product by R′ and reduces modulo N more quickly than the naïve method.

Unlike conventional modular reduction, which focuses on making the number smaller than N, Montgomery reduction focuses on making the number more divisible by R. It does this by adding a small multiple of N which is sophisticatedly chosen to cancel the residue modulo R. Dividing the result by R yields a much smaller number.

To use REDC to compute the product of 7 and 15 modulo 17, first convert to Montgomery form and multiply as integers to get 12 as above.

t is set to 11, which is less than 17, so the final result is 11, which agrees with the computation of the previous section.

Many operations of interest modulo N can be expressed equally well in Montgomery form.

Addition, subtraction, negation, comparison for equality, multiplication by an integer not in Montgomery form, and greatest common divisors with N may all be done with the standard algorithms.

This assumption implies that the product of two representatives mod N is less than RN, the exact hypothesis necessary for REDC to generate correct output.

Conversion out of Montgomery form is done by computing REDC(aR mod N).

Modular exponentiation can be done using exponentiation by squaring by initializing the initial product to the Montgomery representation of 1, that is, to R mod N, and by replacing the multiply and square steps by Montgomery multiplies.

When standalone REDC is needed, it can be computed as REDC of a product with 1 mod N. The only place where a direct reduction modulo N is necessary is in the precomputation of R2 mod N. Most cryptographic applications require numbers that are hundreds or even thousands of bits long.

However, when R is a power of B, there is a variant of REDC which requires products only of machine word sized integers.

The algorithm begins with a multiprecision integer T and reduces it one word at a time.

Then mNBi is added to T. Because this quantity is zero mod N, adding it does not affect the value of T mod N. If mi denotes the value of m computed in the ith iteration of the loop, then the algorithm sets S to T + (∑ mi Bi)N. Because MultiPrecisionREDC and REDC produce the same output, this sum is the same as the choice of m that the REDC algorithm would make.

Furthermore, because each step of MultiPrecisionREDC requires knowing only the lowest bit, Montgomery multiplication can be easily combined with a carry-save adder.

Because Montgomery reduction avoids the correction steps required in conventional division when quotient digit estimates are inaccurate, it is mostly free of the conditional branches which are the primary targets of timing and power side-channel attacks; the sequence of instructions executed is independent of the input operand values.

[4] It is of course necessary to ensure that the exponentiation algorithm built around the multiplication primitive is also resistant.