Rader's FFT algorithm

Rader's algorithm (1968),[1] named for Charles M. Rader of MIT Lincoln Laboratory, is a fast Fourier transform (FFT) algorithm that computes the discrete Fourier transform (DFT) of prime sizes by re-expressing the DFT as a cyclic convolution (the other algorithm for FFTs of prime sizes, Bluestein's algorithm, also works by rewriting the DFT as a convolution).

The algorithm can be modified to gain a factor of two savings for the case of DFTs of real data, using a slightly modified re-indexing/permutation to obtain two half-size cyclic convolutions of real data;[2] an alternative adaptation for DFTs of real data uses the discrete Hartley transform.

[3] Winograd extended Rader's algorithm to include prime-power DFT sizes

However, for composite sizes such as prime powers, the Cooley–Tukey FFT algorithm is much simpler and more practical to implement, so Rader's algorithm is typically only used for large-prime base cases of Cooley–Tukey's recursive decomposition of the DFT.

[3] Begin with the definition of the discrete Fourier transform: If N is a prime number, then the set of non-zero indices

That means that we can rewrite the DFT using these new indices p and q as: (Recall that xn and Xk are implicitly periodic in N, and also that

The final summation, above, is precisely a cyclic convolution of the two sequences aq and bq (of length N–1, because

However, that may not be efficient if N–1 itself has large prime factors, requiring recursive use of Rader's algorithm.

Instead, one can compute a length-(N–1) cyclic convolution exactly by zero-padding it to a length of at least 2(N–1)–1, say to a power of two, which can then be evaluated in O(N log N) time without the recursive application of Rader's algorithm.

This algorithm, then, requires O(N) additions plus O(N log N) time for the convolution.

In practice, the O(N) additions can often be performed by absorbing the additions into the convolution: if the convolution is performed by a pair of FFTs, then the sum of xn is given by the DC (0th) output of the FFT of aq plus x0, and x0 can be added to all the outputs by adding it to the DC term of the convolution prior to the inverse FFT.

Still, this algorithm requires intrinsically more operations than FFTs of nearby composite sizes, and typically takes 3–10 times as long in practice.

If Rader's algorithm is performed by using FFTs of size N–1 to compute the convolution, rather than by zero padding as mentioned above, the efficiency depends strongly upon N and the number of times that Rader's algorithm must be applied recursively.