Quadratic sieve

The quadratic sieve algorithm (QS) is an integer factorization algorithm and, in practice, the second-fastest method known (after the general number field sieve). It is still the fastest for integers under 100 decimal digits or so, and is considerably simpler than the number field sieve.

It was invented by Carl Pomerance in 1981 as an improvement to Schroeppel's linear sieve.[1]

The algorithm attempts to set up a congruence of squares modulo n (the integer to be factorized), which often leads to a factorization of n. The algorithm works in two phases: the data collection phase, where it collects information that may lead to a congruence of squares; and the data processing phase, where it puts all the data it has collected into a matrix and solves it to obtain a congruence of squares.

The data collection phase can be easily parallelized to many processors, but the data processing phase requires large amounts of memory, and is difficult to parallelize efficiently over many nodes or if the processing nodes do not each have enough memory to store the whole matrix.

The block Wiedemann algorithm can be used in the case of a few systems each capable of holding the matrix.

The general running time required for the quadratic sieve (to factor an integer n) is conjectured to be L_n[1/2, 1] in the L-notation.
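
Written out, this standard L-notation bound is (stated here for reference, not derived):

```latex
L_n\!\left[\tfrac{1}{2},\,1\right] \;=\; e^{(1+o(1))\sqrt{\ln n \,\ln \ln n}}
```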

To factorize the integer n, Fermat's method entails a search for a single number a, with √n < a < n − 1, such that the remainder of a² divided by n is a square.
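
As a point of comparison, here is a minimal Python sketch of that search (purely illustrative and slow in general; the name fermat_style_search is ours):

```python
from math import gcd, isqrt

def fermat_style_search(n):
    """Look for a with sqrt(n) < a < n - 1 such that a^2 mod n is a perfect square b^2.
    Then a^2 - b^2 is a multiple of n, and gcd(a - b, n) may be a nontrivial factor."""
    for a in range(isqrt(n) + 1, n - 1):
        r = (a * a) % n
        b = isqrt(r)
        if b * b == r:          # remainder of a^2 divided by n is a square
            return a, b
    return None

# Example: for n = 15, a = 4 gives 4^2 ≡ 1 (mod 15), and gcd(4 - 1, 15) = 3.
```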

The quadratic sieve consists of computing the remainder of a² divided by n for several a, then finding a subset of these whose product is a square.

This yields a congruence of squares x² ≡ y² (mod n); we can then try to factor n by using the Euclidean algorithm to calculate the greatest common divisor gcd(x − y, n).
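
A hedged sketch of that final step, assuming x and y with x² ≡ y² (mod n) have already been found (the function name is ours):

```python
from math import gcd

def factor_from_congruence(x, y, n):
    """Given x^2 ≡ y^2 (mod n), return gcd(x - y, n), a factor of n.
    The factor is nontrivial unless x ≡ ±y (mod n)."""
    assert (x * x - y * y) % n == 0
    return gcd(x - y, n)

# Example: 10^2 ≡ 4^2 (mod 21), since 100 - 16 = 84 = 4 * 21; gcd(10 - 4, 21) = 3.
print(factor_from_congruence(10, 4, 21))  # 3
```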

So the problem has now been reduced to: given a set of integers, find a subset whose product is a square.

By the fundamental theorem of arithmetic, any positive integer can be written uniquely as a product of prime powers.

The integers modulo 2 can be regarded as the Galois field of order 2; that is, we can divide by all non-zero numbers (there is only one, namely 1) when calculating modulo 2.
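
To make the reduction concrete, here is a small Python sketch under the obvious representation: each candidate integer becomes its vector of prime exponents reduced modulo 2, and a subset whose vectors sum to zero mod 2 has a square product. The helper names and the toy Gaussian elimination are ours, not an optimized solver such as block Wiedemann.

```python
def exponent_vector_mod2(m, factor_base):
    """Exponent vector of m over the factor base, reduced mod 2 (None if m is not smooth)."""
    vec = []
    for p in factor_base:
        e = 0
        while m % p == 0:
            m //= p
            e += 1
        vec.append(e % 2)
    return vec if m == 1 else None

def find_dependency(vectors):
    """Gaussian elimination over GF(2): return indices of a subset of vectors summing to zero."""
    n_cols = len(vectors[0])
    pivots = {}                                   # column -> (reduced vector, contributing indices)
    for i, v in enumerate(vectors):
        vec, members = v[:], {i}
        for col in range(n_cols):
            if vec[col]:
                if col in pivots:
                    pvec, pmembers = pivots[col]
                    for c in range(n_cols):
                        vec[c] ^= pvec[c]
                    members ^= pmembers           # symmetric difference tracks the combination mod 2
                else:
                    pivots[col] = (vec, members)
                    break
        else:
            return sorted(members)                # vector reduced to zero: dependency found
    return None

# Example: 10 = 2*5, 24 = 2^3*3, 15 = 3*5, and 10*24*15 = 3600 = 60^2.
fb = [2, 3, 5]
vecs = [exponent_vector_mod2(m, fb) for m in (10, 24, 15)]
print(find_dependency(vecs))  # [0, 1, 2]
```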

Smooth numbers (numbers whose prime factors all lie in the chosen factor base) are harder to find, but using only smooth numbers keeps the vectors and matrices smaller and more tractable.

The quadratic sieve attempts to find pairs of integers x and y(x) (where y(x) is a function of x) satisfying a much weaker condition than x² ≡ y² (mod n).

The quadratic sieve speeds up the process of finding relations by taking x close to the square root of n. This ensures that y(x) will be smaller, and thus have a greater chance of being smooth.
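
A hedged sketch of relation collection in that spirit, without sieving (the name collect_relations is ours; a real implementation sieves rather than trial-dividing every candidate y(x) = x² − n):

```python
from math import isqrt

def collect_relations(n, factor_base, count):
    """Scan x = ceil(sqrt(n)), ceil(sqrt(n)) + 1, ... and keep x whose y(x) = x^2 - n is
    smooth over the factor base.  Assumes enough smooth values exist in the range scanned."""
    relations = []
    x = isqrt(n) + 1
    while len(relations) < count:
        m, exponents = x * x - n, []
        for p in factor_base:
            e = 0
            while m % p == 0:
                m //= p
                e += 1
            exponents.append(e)
        if m == 1:                      # y(x) is smooth: x^2 ≡ y(x) (mod n) is a relation
            relations.append((x, exponents))
        x += 1
    return relations

# For instance, collect_relations(15347, [2, 17, 23, 29], 1) finds x = 124 with y = 29.
```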

However, to guarantee the existence of a linear dependency, it is necessary to find at least one smooth relation more than the number of primes in the factor base: any collection of vectors that outnumbers the dimension of the vector space over GF(2) must be linearly dependent.

Occasionally, forming a cycle from two partial relations leads directly to a congruence of squares, but this is rare.

The most obvious way to check the y(x) values for smoothness is by trial division, although this increases the running time for the data collection phase.

The sieve starts by setting every entry in a large array A[] of bytes to zero.

For each p, solve the quadratic equation y(x) ≡ 0 (mod p) to get two roots α and β, and then add an approximation to log(p) to every entry for which y(x) ≡ 0 (mod p), that is, A[kp + α] and A[kp + β].
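
A minimal sketch of that loop, assuming sympy's sqrt_mod is available to solve x² ≡ n (mod p); the array is indexed by x − start, and the name sieve_interval is ours:

```python
from math import log
from sympy.ntheory import sqrt_mod

def sieve_interval(n, factor_base, start, length):
    """Accumulate approximations to log(p) at every x in [start, start + length)
    with x^2 ≡ n (mod p), i.e. positions where p divides y(x) = x^2 - n."""
    A = [0.0] * length
    for p in factor_base:
        roots = sqrt_mod(n, p, all_roots=True)    # the roots alpha, beta of x^2 ≡ n (mod p)
        if not roots:
            continue                              # n is a non-residue mod p: p never divides y(x)
        for r in roots:
            offset = (r - start) % p              # first array index hit by this root
            for i in range(offset, length, p):
                A[i] += log(p)
    return A

# Entries of A close to log(x^2 - n) mark x values worth trial-dividing in full.
```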

This example will demonstrate the standard quadratic sieve without logarithm optimizations or prime powers.

For each odd prime p in the factor base, there will be 2 resulting linear equations, due to there being 2 modular square roots of n modulo p.

Once the smooth relations have been collected, the remainder of the algorithm follows equivalently to any other variation of Dixon's factorization method.

Of course, for an example this small, trial division or Pollard's rho could have found a factor with much less computation.

In effect, such a large prime cofactor can be added to the factor base, by sorting the list of relations into order by cofactor.

This works by reducing the threshold of entries in the sieving array above which a full factorization is performed.
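
A hedged sketch of how such partial relations pair up, assuming each is stored with its leftover cofactor (the name combine_partials and the data layout are ours):

```python
from collections import defaultdict

def combine_partials(partials):
    """partials: list of (x, exponent_vector, cofactor), where the cofactor is the single
    large prime left over after removing all factor-base primes from y(x).
    Two partials sharing a cofactor multiply into a relation whose extra factor is a square,
    so it vanishes in the mod-2 exponent vector."""
    by_cofactor = defaultdict(list)
    for x, vec, q in partials:
        by_cofactor[q].append((x, vec))
    combined = []
    for group in by_cofactor.values():
        for (x1, v1), (x2, v2) in zip(group, group[1:]):   # chain consecutive matches
            merged = [(a + b) % 2 for a, b in zip(v1, v2)]
            combined.append(((x1, x2), merged))
    return combined
```

Grouping (or sorting) the relations by cofactor, as described above, is what makes the matching pairs cheap to find.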

To illustrate typical parameter choices for a realistic example on a real implementation, including the multiple polynomial and large prime optimizations, the tool msieve was run on a 267-bit semiprime, producing the following parameters:

Until the discovery of the number field sieve (NFS), QS was the asymptotically fastest known general-purpose factoring algorithm.

Now, Lenstra elliptic curve factorization has the same asymptotic running time as QS (in the case where n has exactly two prime factors of equal size), but in practice, QS is faster since it uses single-precision operations instead of the multi-precision operations used by the elliptic curve method.

The 1994 factorization of the 129-digit challenge number RSA-129 was carried out with the quadratic sieve: the data collection phase took 5000 MIPS-years, done in distributed fashion over the Internet.

The data processing phase took 45 hours on Bellcore's (now Telcordia Technologies) MasPar (massively parallel) supercomputer.