Divide-and-conquer algorithm

A divide-and-conquer algorithm recursively breaks down a problem into two or more sub-problems of the same or related type, until these become simple enough to be solved directly.

The divide-and-conquer technique is the basis of efficient algorithms for many problems, such as sorting (e.g., quicksort, merge sort), multiplying large numbers (e.g., the Karatsuba algorithm), finding the closest pair of points, syntactic analysis (e.g., top-down parsers), and computing the discrete Fourier transform (FFT).

As in mathematical induction, it is often necessary to generalize the problem to make it amenable to a recursive solution.

The correctness of a divide-and-conquer algorithm is usually proved by mathematical induction, and its computational cost is often determined by solving recurrence relations.

Early examples of these algorithms are primarily decrease and conquer: the original problem is successively reduced to a single smaller subproblem at each step, and so can also be solved iteratively.

Binary search, a decrease-and-conquer algorithm where the subproblems are of roughly half the original size, has a long history.
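
As an illustration, here is a minimal sketch of binary search in Python (the function name and interface are ours, not taken from any particular source); at each step the search interval is halved, which is the single shrinking subproblem characteristic of decrease and conquer:

    def binary_search(items, target):
        """Return an index of target in the sorted list items, or -1 if absent."""
        lo, hi = 0, len(items) - 1
        while lo <= hi:
            mid = (lo + hi) // 2          # split point
            if items[mid] == target:
                return mid
            elif items[mid] < target:
                lo = mid + 1              # discard the lower half
            else:
                hi = mid - 1              # discard the upper half
        return -1

    assert binary_search([3, 9, 10, 27, 38, 43, 82], 27) == 3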

While a clear description of the algorithm on computers appeared in 1946 in an article by John Mauchly, the idea of using a sorted list of items to facilitate searching dates back at least as far as Babylonia in 200 BC.[5]

Another ancient decrease-and-conquer algorithm is the Euclidean algorithm to compute the greatest common divisor of two numbers by reducing the numbers to smaller and smaller equivalent subproblems, which dates to several centuries BC.
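
A sketch of the Euclidean algorithm in Python (this is the standard modern remainder-based form; Euclid's original formulation used repeated subtraction):

    def gcd(a, b):
        # Each step replaces (a, b) with the smaller equivalent
        # subproblem (b, a mod b) until the remainder vanishes.
        while b != 0:
            a, b = b, a % b
        return a

    assert gcd(1071, 462) == 21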

An early example of a divide-and-conquer algorithm with multiple subproblems is Gauss's 1805 description of what is now called the Cooley–Tukey fast Fourier transform (FFT) algorithm,[6] although he did not analyze its operation count quantitatively, and FFTs did not become widespread until they were rediscovered over a century later.[7]

Another notable example is the algorithm invented by Anatolii A. Karatsuba in 1960[8] that could multiply two n-digit numbers in O(n^(log2 3)) ≈ O(n^1.585) operations, asymptotically faster than the quadratic schoolbook method.
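
A compact Python sketch of Karatsuba's recursion (illustrative only; practical implementations work on digit arrays and switch to schoolbook multiplication below a tuned threshold). The key step replaces four half-size multiplications with three, which is where the log2(3) exponent comes from:

    def karatsuba(x, y):
        # Base case: single-digit factors are multiplied directly.
        if x < 10 or y < 10:
            return x * y
        m = max(len(str(x)), len(str(y))) // 2
        p = 10 ** m
        a, b = divmod(x, p)               # x = a*p + b
        c, d = divmod(y, p)               # y = c*p + d
        ac = karatsuba(a, c)
        bd = karatsuba(b, d)
        # One extra product recovers both cross terms:
        # (a + b)(c + d) - ac - bd == ad + bc
        cross = karatsuba(a + b, c + d) - ac - bd
        return ac * p * p + cross * p + bd

    assert karatsuba(1234, 5678) == 1234 * 5678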

As another example of a divide-and-conquer algorithm that did not originally involve computers, Donald Knuth gives the method a post office typically uses to route mail: letters are sorted into separate bags for different geographical areas, each of these bags is itself sorted into batches for smaller sub-regions, and so on until they are delivered.

Similarly, decrease and conquer only requires reducing the problem to a single smaller problem, such as the classic Tower of Hanoi puzzle, which reduces moving a tower of height n to moving a tower of height n − 1.
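
A sketch in Python (the peg labels are arbitrary): moving n disks from source to target reduces to moving n − 1 disks onto the spare peg, moving the largest disk, and moving the n − 1 disks back on top:

    def hanoi(n, source, target, spare):
        # Decrease and conquer: the height-(n - 1) subproblem, used twice.
        if n == 0:
            return
        hanoi(n - 1, source, spare, target)       # clear the way
        print(f"move disk {n}: {source} -> {target}")
        hanoi(n - 1, spare, target, source)       # restack on top

    hanoi(3, "A", "C", "B")                       # prints 2**3 - 1 = 7 moves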

For example, when (a) the work of splitting the problem and combining the partial solutions takes O(n) time and (b) each problem of size n is divided into two sub-problems of size n/2, the running time is O(n log n), the solution of the recurrence T(n) = 2T(n/2) + O(n). If, instead, the work of splitting the problem and combining the partial solutions takes O(1) time, the recurrence becomes T(n) = 2T(n/2) + O(1) and the running time is O(n).[9]
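
A brief unrolling of the first recurrence shows where the O(n log n) bound comes from (a sketch, with c standing for the constant hidden in the O(n) term):

    T(n) = 2T(n/2) + cn
         = 4T(n/4) + 2cn
         = 8T(n/8) + 3cn
         ...
         = 2^k T(n/2^k) + k*cn        with k = log2(n)
         = n*T(1) + cn*log2(n)  =  O(n log n)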

Divide-and-conquer algorithms are naturally adapted for execution on multi-processor machines, especially shared-memory systems where the communication of data between processors does not need to be planned in advance, because distinct sub-problems can be executed on different processors.
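
A minimal sketch of this in Python (all names are ours; a realistic version would keep splitting in parallel down to some grain size instead of splitting only once at the top):

    from concurrent.futures import ProcessPoolExecutor

    def total(xs):
        # Sequential divide-and-conquer sum of a list.
        if len(xs) <= 1:
            return xs[0] if xs else 0
        mid = len(xs) // 2
        return total(xs[:mid]) + total(xs[mid:])

    def parallel_total(xs):
        # The two independent sub-problems run on different processors.
        mid = len(xs) // 2
        with ProcessPoolExecutor(max_workers=2) as pool:
            left = pool.submit(total, xs[:mid])
            right = pool.submit(total, xs[mid:])
            return left.result() + right.result()

    if __name__ == "__main__":
        print(parallel_total(list(range(1000))))  # 499500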

Divide-and-conquer algorithms naturally tend to make efficient use of memory caches.

In computations with rounded arithmetic, e.g. with floating-point numbers, a divide-and-conquer algorithm may yield more accurate results than a superficially equivalent iterative method. For example, one can add N numbers either by a simple loop that adds each datum to a single variable, or by a divide-and-conquer algorithm called pairwise summation that breaks the data set into two halves, recursively computes the sum of each half, and then adds the two sums. While the second method performs the same number of additions as the first and pays the overhead of the recursive calls, it is usually more accurate.
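
A minimal pairwise-summation sketch in Python (the names are ours; production code would fall back to a simple loop below some block size rather than recursing down to single elements):

    def pairwise_sum(xs):
        # Summing each half recursively keeps intermediate sums of similar
        # magnitude, so rounding error grows roughly like O(log n) instead
        # of the O(n) worst case of a left-to-right loop.
        if len(xs) == 1:
            return xs[0]
        mid = len(xs) // 2
        return pairwise_sum(xs[:mid]) + pairwise_sum(xs[mid:])

    data = [0.1] * 1000
    print(pairwise_sum(data), sum(data))   # both close to 100.0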

Divide-and-conquer algorithms are naturally implemented as recursive procedures. In that case, the partial sub-problems leading to the one currently being solved are automatically stored in the procedure call stack.

Alternatively, one can employ a non-recursive program that stores the partial sub-problems in some explicit data structure, such as a stack, queue, or priority queue. This approach allows more freedom in the choice of the sub-problem that is to be solved next, a feature that is important in some applications, e.g. in breadth-first recursion and the branch-and-bound method for function optimization.

This approach is also the standard solution in programming languages that do not provide support for recursive procedures.
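
As a sketch of this style in Python (our illustration, not a canonical implementation), here is quicksort driven by an explicit worklist of pending index ranges instead of recursive calls; popping from a list gives depth-first order, and substituting a queue or priority queue changes which sub-problem is solved next:

    def quicksort(a):
        # Pending sub-problems live in an explicit stack of (lo, hi) ranges.
        stack = [(0, len(a) - 1)]
        while stack:
            lo, hi = stack.pop()
            if lo >= hi:
                continue                  # ranges of size 0 or 1 are sorted
            pivot, i = a[hi], lo          # Lomuto partition around a[hi]
            for j in range(lo, hi):
                if a[j] < pivot:
                    a[i], a[j] = a[j], a[i]
                    i += 1
            a[i], a[hi] = a[hi], a[i]
            stack.append((lo, i - 1))
            stack.append((i + 1, hi))

    xs = [38, 27, 43, 3, 9, 82, 10]
    quicksort(xs)
    assert xs == [3, 9, 10, 27, 38, 43, 82]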

In recursive implementations, one must also make sure that enough memory is allocated for the recursion stack, otherwise execution may fail because of stack overflow. Compilers may save more information in the recursion stack than is strictly necessary, such as the return address, unchanging parameters, and the internal variables of the procedure.

In any recursive algorithm, there is considerable freedom in the choice of the base cases, the small subproblems that are solved directly in order to terminate the recursion. For example, a Fast Fourier Transform algorithm could stop the recursion when the input is a single sample, and the quicksort list-sorting algorithm could stop when the input is the empty list; in both examples, there is only one base case to consider, and it requires no processing.

On the other hand, efficiency often improves if the recursion is stopped at relatively large base cases, and these are solved non-recursively, resulting in a hybrid algorithm.

For example, in a tree, rather than recursing to a child node and then checking whether it is null, checking for null before recursing avoids half the function calls in some algorithms on binary trees.
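
A sketch of the two styles in Python (the tree type and function names are ours): testing children before recursing trades a slightly longer function body for roughly half as many calls:

    class Node:
        def __init__(self, value, left=None, right=None):
            self.value, self.left, self.right = value, left, right

    def size_simple(node):
        # Recurse first, test for null inside the call: every null
        # child still costs a function call.
        if node is None:
            return 0
        return 1 + size_simple(node.left) + size_simple(node.right)

    def size_short_circuit(node):
        # Test for null before recursing: calls are made only for real nodes.
        if node is None:                  # reachable only for an empty tree
            return 0
        total = 1
        if node.left is not None:
            total += size_short_circuit(node.left)
        if node.right is not None:
            total += size_short_circuit(node.right)
        return total

    tree = Node(1, Node(2), Node(3, Node(4)))
    assert size_simple(tree) == size_short_circuit(tree) == 4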

Note that these considerations do not depend on whether recursion is implemented by the compiler or by an explicit stack.

If the empty list were the only base case, sorting a list with n entries would entail up to n quicksort calls that do nothing but return immediately. Increasing the base cases to lists of size 2 or less will eliminate most of those do-nothing calls, and more generally a base case larger than 2 is typically used to reduce the fraction of time spent in function-call overhead or stack manipulation.[12]
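
A sketch of this hybrid strategy in Python (the cutoff of 16 is an arbitrary illustrative value; tuned thresholds vary by platform and data):

    CUTOFF = 16   # illustrative, not a tuned constant

    def insertion_sort(a, lo, hi):
        # Sorts a[lo..hi] in place; simple and fast for short ranges.
        for i in range(lo + 1, hi + 1):
            x, j = a[i], i - 1
            while j >= lo and a[j] > x:
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = x

    def hybrid_quicksort(a, lo=0, hi=None):
        if hi is None:
            hi = len(a) - 1
        if hi - lo + 1 <= CUTOFF:
            insertion_sort(a, lo, hi)     # enlarged base case, solved directly
            return
        pivot, i = a[hi], lo              # Lomuto partition
        for j in range(lo, hi):
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]
        hybrid_quicksort(a, lo, i - 1)
        hybrid_quicksort(a, i + 1, hi)

    xs = [38, 27, 43, 3, 9, 82, 10] * 10
    hybrid_quicksort(xs)
    assert xs == sorted(xs)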

Source-code generation methods may be used to produce the large number of separate base cases desirable to implement this strategy efficiently.[12]

The generalized version of this idea is known as recursion "unrolling" or "coarsening", and various techniques have been proposed for automating the procedure of enlarging the base case.

For some problems, the branched recursion may end up evaluating the same sub-problem many times over. In such cases it may be worth identifying and saving the solutions to these overlapping subproblems, a technique which is commonly known as memoization.
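
The classic illustration is the Fibonacci recurrence, whose two branches revisit the same subproblems exponentially often; a small Python sketch using the standard library's memoization decorator:

    from functools import lru_cache

    @lru_cache(maxsize=None)      # memoization: each n is computed only once
    def fib(n):
        if n < 2:
            return n
        # Without the cache, the two branches would recompute overlapping
        # subproblems, blowing the call tree up to exponential size.
        return fib(n - 1) + fib(n - 2)

    assert fib(40) == 102334155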

Figure: divide-and-conquer (merge sort) sorting the list (38, 27, 43, 3, 9, 82, 10) in increasing order. Upper half: splitting into sublists; middle: a one-element list is trivially sorted; lower half: merging the sorted sublists.