In computer science and operations research, approximation algorithms are efficient algorithms that find approximate solutions to optimization problems (in particular NP-hard problems) with provable guarantees on the distance of the returned solution to the optimal one.[1]
Approximation algorithms naturally arise in the field of theoretical computer science as a consequence of the widely believed P ≠ NP conjecture.
Under this conjecture, a wide class of optimization problems cannot be solved exactly in polynomial time.
In the overwhelming majority of cases, the guarantee of such algorithms is a multiplicative one, expressed as an approximation ratio or approximation factor; i.e., the optimal solution is always guaranteed to be within a (predetermined) multiplicative factor of the returned solution.
However, there are also many approximation algorithms that provide an additive guarantee on the quality of the returned solution.
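Schematically, for a minimization problem with optimal value OPT and returned solution value f(x), the two kinds of guarantee can be written as follows (a brief illustrative restatement, with ρ ≥ 1 the approximation ratio and c ≥ 0 the additive error):

```latex
% Multiplicative (relative) guarantee with approximation ratio rho >= 1:
\[ \mathrm{OPT} \;\le\; f(x) \;\le\; \rho \cdot \mathrm{OPT} \]
% Additive guarantee with additive error c >= 0:
\[ \mathrm{OPT} \;\le\; f(x) \;\le\; \mathrm{OPT} + c \]
```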
The design and analysis of approximation algorithms crucially involves a mathematical proof certifying the quality of the returned solutions in the worst case.[1]
This distinguishes them from heuristics such as simulated annealing or genetic algorithms, which find reasonably good solutions on some inputs but provide no clear indication at the outset of when they may succeed or fail.
There is widespread interest in theoretical computer science in better understanding the limits to which certain famous optimization problems can be approximated.
For example, one of the long-standing open questions in computer science is to determine whether there is an algorithm that outperforms the 2-approximation for the Steiner Forest problem by Agrawal et al.[3] The desire to understand hard optimization problems from the perspective of approximability is motivated by the discovery of surprising mathematical connections and of broadly applicable techniques for designing algorithms for such problems.
One well-known example of the former is the Goemans–Williamson algorithm for maximum cut, which solves a graph-theoretic problem using high-dimensional geometry.[4]
A simple example of an approximation algorithm is one for the minimum vertex cover problem, where the goal is to choose the smallest set of vertices such that every edge in the input graph contains at least one chosen vertex.
One way to find a vertex cover is to repeat the following process: find an uncovered edge, add both its endpoints to the cover, and remove all edges incident to either vertex from the graph. Since any vertex cover must contain at least one endpoint of each edge chosen in this way (the chosen edges form a matching), the cover produced is at most twice as large as an optimal one; in other words, this is a constant-factor approximation algorithm with a factor of 2.
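A minimal sketch of this greedy procedure, written here in Python purely for illustration (it is not part of the original presentation), might look as follows:

```python
# Greedy 2-approximation for minimum vertex cover (illustrative sketch).
def greedy_vertex_cover(edges):
    """Return a vertex cover at most twice the size of an optimal one.

    `edges` is an iterable of 2-tuples of hashable vertex labels.
    Skipping edges that are already covered is equivalent to repeatedly
    picking an uncovered edge and adding both of its endpoints.
    """
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:  # edge still uncovered
            cover.add(u)
            cover.add(v)
    return cover

# Example: a path on four vertices; an optimal cover {2, 3} has size 2,
# so the greedy cover (here {1, 2, 3, 4}) is within a factor of 2.
print(greedy_vertex_cover([(1, 2), (2, 3), (3, 4)]))
```

A single pass over the edge list suffices because any edge encountered later that shares an endpoint with a chosen edge is already covered.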
NP-hard problems vary greatly in their approximability. Some, such as the knapsack problem, can be approximated within a factor of 1 + ε for any fixed ε > 0 (such a family of algorithms is called a polynomial-time approximation scheme, or PTAS). Others are impossible to approximate within any constant, or even polynomial, factor unless P = NP, as in the case of the maximum clique problem.
Therefore, an important benefit of studying approximation algorithms is a fine-grained classification of the difficulty of various NP-hard problems beyond the one afforded by the theory of NP-completeness.
In other words, although NP-complete problems may be equivalent (under polynomial-time reductions) to each other from the perspective of exact solutions, the corresponding optimization problems behave very differently from the perspective of approximate solutions.
Besides the worst-case (a priori) guarantee, some approximation algorithms also provide an instance-specific (a posteriori) guarantee that is often much better; this is frequently the case for algorithms that work by solving a convex relaxation of the optimization problem on the given input. For example, there is a different approximation algorithm for minimum vertex cover that solves a linear programming relaxation to find a vertex cover that is at most twice the value of the relaxation. Since the value of the relaxation is never larger than the size of the optimal vertex cover, this yields another 2-approximation algorithm.
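As a sketch of how such an a posteriori guarantee can arise (an illustration assuming SciPy's linprog is available; the helper lp_vertex_cover is hypothetical, not from any standard library), one can solve the linear programming relaxation of vertex cover and keep every vertex whose fractional value is at least 1/2:

```python
# LP-relaxation-based 2-approximation for minimum vertex cover (sketch).
# The LP minimizes sum_v x_v subject to x_u + x_v >= 1 for every edge {u, v}
# and 0 <= x_v <= 1; rounding up every x_v >= 1/2 yields a valid cover of
# size at most twice the LP value, which is itself at most the optimum.
import numpy as np
from scipy.optimize import linprog

def lp_vertex_cover(n, edges):
    """n: number of vertices labelled 0..n-1; edges: list of vertex pairs."""
    c = np.ones(n)                      # objective: minimize sum of x_v
    A_ub = np.zeros((len(edges), n))    # encode x_u + x_v >= 1 as -x_u - x_v <= -1
    for i, (u, v) in enumerate(edges):
        A_ub[i, u] = A_ub[i, v] = -1.0
    b_ub = -np.ones(len(edges))
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * n, method="highs")
    cover = {v for v in range(n) if res.x[v] >= 0.5}
    return cover, res.fun               # a posteriori: len(cover) <= 2 * res.fun <= 2 * OPT

cover, lp_value = lp_vertex_cover(4, [(0, 1), (1, 2), (2, 3)])
print(cover, lp_value)
```

The rounded set is a cover because every constraint x_u + x_v ≥ 1 forces at least one endpoint to a value of at least 1/2, and its size is at most twice the LP value computed for the specific input at hand, which on many instances is a much stronger bound than the worst-case factor of 2.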
As a research area, approximation algorithms are closely related to and informed by inapproximability theory, where the non-existence of efficient algorithms with certain approximation ratios is proved (conditioned on widely believed hypotheses such as the P ≠ NP conjecture) by means of reductions.
In the case of the metric traveling salesman problem, the best known inapproximability result rules out algorithms with an approximation ratio less than 123/122 ≈ 1.008196 unless P = NP (Karpinski, Lampis, and Schmied).
It was only with the 1990 result of Feige, Goldwasser, Lovász, Safra and Szegedy on the inapproximability of Independent Set[7] and the famous PCP theorem[8] that modern tools for proving inapproximability results were uncovered.
The PCP theorem, for example, shows that Johnson's 1974 approximation algorithms for Max SAT, set cover, independent set and coloring all achieve the optimal approximation ratio, assuming P ≠ NP.
Implementation and running time issues aside, the guarantees provided by approximation algorithms may themselves not be strong enough to justify their consideration in practice. Even when they cannot be used "out of the box", the ideas and insights behind their design can often be incorporated into practical algorithms in other ways; in this way, the study of even very expensive algorithms is not a completely theoretical pursuit, as they can yield valuable insights.
In other cases, even if the initial results are of purely theoretical interest, over time, with an improved understanding, the algorithms may be refined to become more practical.
One such example is the initial PTAS for Euclidean TSP by Sanjeev Arora (and independently by Joseph Mitchell), which had a prohibitive running time of n^{O(1/ε)} for a (1 + ε)-approximation; these ideas were later refined into near-linear-time algorithms for any constant ε > 0.
Proving an approximation guarantee requires an argument that holds for every input; showing that such an analysis is tight, by contrast, only requires constructing a single input instance designed to force the algorithm into a worst-case scenario.
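For instance, reusing the hypothetical greedy_vertex_cover sketch from above, a graph that is a perfect matching of k disjoint edges forces that algorithm into its worst case, showing that its factor of 2 cannot be improved by a sharper analysis:

```python
# Worst-case family for the greedy vertex-cover sketch: k disjoint edges.
# An optimal cover takes one endpoint per edge (k vertices), while the
# greedy algorithm takes both endpoints of every edge (2k vertices).
def worst_case_matching(k):
    return [(2 * i, 2 * i + 1) for i in range(k)]

edges = worst_case_matching(5)
print(len(greedy_vertex_cover(edges)))  # 10 vertices versus an optimum of 5: ratio exactly 2
```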
Formally, a ρ-approximation algorithm A is defined to be an algorithm for which it has been proven that the value/cost, f(x), of the approximate solution A(x) to an instance x will not be more (or less, depending on the situation) than a factor ρ times the value, OPT, of an optimum solution:
OPT ≤ f(x) ≤ ρ·OPT, if ρ > 1;  ρ·OPT ≤ f(x) ≤ OPT, if ρ < 1.
The absolute performance guarantee P_A of an approximation algorithm A, where x ranges over instances of the problem and R_A(x) is the performance guarantee of A on x (i.e., the ratio max(OPT/f(A(x)), f(A(x))/OPT)), is
P_A = inf { r ≥ 1 : R_A(x) ≤ r for all x }.
That is to say, P_A is the worst (largest) approximation ratio that A exhibits over all possible instances of the problem.
Likewise, the asymptotic performance ratio R_A^∞ is
R_A^∞ = inf { r ≥ 1 : ∃ n ∈ Z⁺ such that R_A(x) ≤ r for all x with |x| ≥ n }.
That is to say, it is the same as the absolute performance ratio, but with a lower bound n on the size of problem instances.
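As a worked illustration of these definitions (added here for concreteness; not part of the original text), consider the greedy vertex-cover algorithm A sketched earlier on an instance x consisting of k disjoint edges:

```latex
% On a matching of k disjoint edges the greedy algorithm returns 2k vertices,
% while an optimal cover has only k vertices, so
\[
  f(A(x)) = 2k, \qquad \mathrm{OPT} = k, \qquad
  R_A(x) = \max\!\left(\frac{\mathrm{OPT}}{f(A(x))}, \frac{f(A(x))}{\mathrm{OPT}}\right) = 2 .
\]
% Since R_A(x) <= 2 on every instance and equals 2 on this family,
% the absolute performance guarantee of the greedy algorithm is
\[
  P_A = \inf\{\, r \ge 1 \mid R_A(x) \le r \ \text{for all } x \,\} = 2 .
\]
```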
In the literature, an approximation ratio stated as c ± ϵ means that the result holds for every fixed ϵ > 0 but not (or not provably) for ϵ = 0. An example of this is the optimal inapproximability ratio of 7/8 + ϵ for satisfiable MAX-3SAT instances due to Johan Håstad: achieving a ratio of 7/8 + ϵ is NP-hard for every fixed ϵ > 0, even though a ratio of 7/8 is attainable in polynomial time.