Kernelization

In computer science, a kernelization is a technique for designing efficient algorithms that achieve their efficiency by a preprocessing stage in which inputs to the algorithm are replaced by a smaller input, called a "kernel".

The result of solving the problem on the kernel should either be the same as on the original input, or it should be easy to transform the output on the kernel to the desired output for the original problem.

Kernelization is often achieved by applying a set of reduction rules that cut away parts of the instance that are easy to handle.

In parameterized complexity theory, it is often possible to prove that a kernel with guaranteed bounds on its size (as a function of some parameter associated with the problem) can be found in polynomial time. When this is possible, it results in a fixed-parameter tractable algorithm whose running time is the sum of the (polynomial-time) kernelization step and the (non-polynomial, but bounded by the parameter) time to solve the kernel.

A standard example of a kernelization algorithm is the kernelization of the vertex cover problem due to S. Buss. In this problem, the input is an undirected graph G together with a number k, and the output is yes if and only if G has a vertex cover of at most k vertices; the problem is NP-hard. However, the following reduction rules may be used to kernelize it:

1. If k > 0 and v is a vertex of degree greater than k, remove v from the graph and decrease k by one. Every vertex cover of size k must contain v, since otherwise too many of v's neighbors would have to be picked to cover its incident edges. An optimal vertex cover for the original graph may be formed from a cover of the reduced problem by adding v back.

2. If v is an isolated vertex, remove it. An isolated vertex cannot cover any edges, so it cannot be part of any minimal cover.

3. If more than k² edges remain in the graph and neither of the previous two rules can be applied, then the graph cannot contain a vertex cover of size k: after eliminating all vertices of degree greater than k, each remaining vertex covers at most k edges, so any k vertices cover at most k² edges. In this case the instance may be replaced by a trivial no-instance.

An algorithm that applies these rules repeatedly until no more reductions can be made necessarily terminates with a kernel that has at most k² edges and (since every remaining vertex has at least one incident edge and every edge has two endpoints) at most 2k² vertices, as sketched below.
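The following is a minimal sketch of these reduction rules in Python. The dict-of-sets graph representation, the function name buss_kernel, and the convention of returning (None, 0) for a trivial no-instance are illustrative assumptions, not part of the original presentation.

```python
def buss_kernel(graph, k):
    """Apply Buss's reduction rules until none applies.

    `graph` maps each vertex to the set of its neighbours.  The reduced
    instance has a vertex cover of size <= k iff the original one does.
    (None, 0) stands for a trivial no-instance.
    """
    graph = {v: set(nbrs) for v, nbrs in graph.items()}  # defensive copy
    changed = True
    while changed:
        changed = False
        for v in list(graph):
            if len(graph[v]) > k:       # rule 1: degree greater than k
                if k == 0:              # an uncovered edge remains: no-instance
                    return None, 0
                for u in graph[v]:
                    graph[u].discard(v)
                del graph[v]            # v must be in every size-k cover
                k -= 1
                changed = True
            elif not graph[v]:          # rule 2: isolated vertex
                del graph[v]
                changed = True
    m = sum(len(nbrs) for nbrs in graph.values()) // 2
    if m > k * k:                       # rule 3: too many edges remain
        return None, 0
    return graph, k                     # kernel: <= k² edges, <= 2k² vertices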

Once the kernel has been constructed, the vertex cover problem may be solved by a brute-force search that tests whether each subset of the kernel's vertices covers all of the kernel's edges. Thus, the vertex cover problem can be solved in time O(2^(2k²) + n + m) for a graph with n vertices and m edges, allowing it to be solved efficiently when k = O(√(log n)).
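A brute-force check of this kind could look as follows; it assumes the same dict-of-sets graph convention as the sketch above, and the function name is illustrative.

```python
from itertools import combinations

def has_vertex_cover(graph, k):
    """Test whether the (kernelized) graph has a vertex cover of size <= k
    by trying every subset of at most k vertices."""
    edges = {frozenset((u, v)) for u in graph for v in graph[u]}
    for size in range(k + 1):
        for subset in combinations(graph, size):
            chosen = set(subset)
            if all(e & chosen for e in edges):  # every edge touches the subset
                return True
    return False

# Example: one edge plus an isolated vertex; either endpoint is a cover.
print(has_vertex_cover({0: {1}, 1: {0}, 2: set()}, 1))  # True
```

On a kernel with at most 2k² vertices this search takes roughly 2^(2k²) subset checks, which is where the time bound above comes from.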

Although this bound is fixed-parameter tractable, its dependence on the parameter is higher than might be desired. More complex kernelization procedures can improve this bound by finding smaller kernels, at the expense of greater running time in the kernelization step. In the vertex cover example, kernelization algorithms are known that produce kernels with at most 2k vertices.

One algorithm that achieves this improved bound exploits the half-integrality of the linear program relaxation of vertex cover due to Nemhauser and Trotter.[2]
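The following is a minimal sketch of the half-integrality idea, assuming SciPy's linprog is available (the function name nt_kernel and the tuple return convention are illustrative). Every basic optimal solution of the vertex cover LP relaxation assigns each vertex the value 0, 1/2, or 1; vertices at 1 can be forced into the cover, vertices at 0 discarded, and the half-valued vertices form a kernel with at most 2k vertices.

```python
from scipy.optimize import linprog

def nt_kernel(vertices, edges):
    """LP-based kernel sketch: returns (forced_cover, kernel_vertices,
    kernel_edges) from a half-integral optimum of the LP relaxation."""
    if not edges:
        return set(), set(), []
    idx = {v: i for i, v in enumerate(vertices)}
    # minimize sum_v x_v  subject to  x_u + x_v >= 1 for every edge {u, w},
    # 0 <= x_v <= 1.  linprog expects A_ub @ x <= b_ub, so rows are negated.
    A_ub = []
    for u, w in edges:
        row = [0.0] * len(vertices)
        row[idx[u]] = row[idx[w]] = -1.0
        A_ub.append(row)
    res = linprog(c=[1.0] * len(vertices), A_ub=A_ub,
                  b_ub=[-1.0] * len(edges), bounds=(0.0, 1.0),
                  method="highs-ds")  # simplex, so the optimum is basic
    # A basic optimum of this LP is half-integral: each x_v is 0, 1/2 or 1.
    forced = {v for v in vertices if res.x[idx[v]] > 0.75}          # x_v = 1
    kernel = {v for v in vertices if abs(res.x[idx[v]] - 0.5) < 0.25}  # x_v = 1/2
    kernel_edges = [(u, w) for u, w in edges if u in kernel and w in kernel]
    return forced, kernel, kernel_edges
```

Since the half-valued vertices each contribute 1/2 to the LP objective, a yes-instance can have at most 2k of them, which is the source of the 2k-vertex bound.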

Another kernelization algorithm achieving that bound is based on what is known as the crown reduction rule and uses alternating path arguments.[3]

The currently best known kernelization algorithm in terms of the number of vertices is due to Lampis (2011) and achieves 2k − c·log k vertices, for any fixed constant c.

It is not possible, in this problem, to find a kernel of size O(log k), unless P = NP, for such a kernel would lead to a polynomial-time algorithm for the NP-hard vertex cover problem.

However, much stronger bounds on the kernel size can be proven in this case: unless coNP ⊆ NP/poly (believed unlikely by complexity theorists), for every ε > 0 it is impossible in polynomial time to find kernels with O(k^(2−ε)) edges.

In the literature, there is no clear consensus on how kernelization should be formally defined and there are subtle differences in the uses of that expression.

In the notation of Downey & Fellows (1999), a parameterized problem is a subset L ⊆ Σ* × ℕ describing a decision problem. A kernelization for a parameterized problem L is an algorithm that takes an instance (x, k) and maps it, in time polynomial in |x| + k, to an instance (x′, k′) such that (x, k) is in L if and only if (x′, k′) is in L, the size of x′ is bounded by a computable function f of k, and k′ is bounded by a function of k. The output (x′, k′) of the kernelization is called a kernel.
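Written out in LaTeX notation, the three conditions of this definition read as follows (a rendering of the text above, not an additional requirement):

```latex
% A kernelization for L \subseteq \Sigma^* \times \mathbb{N} maps (x, k)
% to (x', k') in time polynomial in |x| + k, such that:
\begin{align*}
  (x, k) \in L &\iff (x', k') \in L && \text{(the answer is preserved)} \\
  |x'| &\le f(k)                    && \text{(for some computable function $f$)} \\
  k' &\le g(k)                      && \text{(for some function $g$)}
\end{align*}
```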

A problem is fixed-parameter tractable if and only if it is kernelizable and decidable.

That a kernelizable and decidable problem is fixed-parameter tractable can be seen from the definition above: first, the kernelization algorithm, which runs in time polynomial in |x| + k, is invoked to generate a kernel of size at most f(k).

The kernel is then solved by the algorithm that proves that the problem is decidable. If g(n) denotes the running time of that algorithm, the total running time is g(f(k)) plus a polynomial in |x| + k; since the first term depends only on the parameter k, this is a fixed-parameter tractable running time.

The other direction, that a fixed-parameter tractable problem is kernelizable and decidable, is a bit more involved.

Assume that the question is non-trivial, meaning that there is at least one instance that is in the language, called I_yes, and at least one instance that is not in the language, called I_no; otherwise, replacing any instance by the empty string is a valid kernelization.

Assume also that the problem is fixed-parameter tractable, i.e., it has an algorithm that runs in at most f(k)·|x|^c steps on instances (x, k), for some computable function f and constant c. To kernelize an input, run this algorithm on the given input for at most |x|^(c+1) steps. If it terminates with an answer, use that answer to select either I_yes or I_no as the kernel. If, instead, it exceeds the |x|^(c+1) bound on the number of steps without terminating, then return (x, k) itself as the kernel. Because (x, k) is returned as a kernel only for inputs with f(k)·|x|^c > |x|^(c+1), and hence f(k) > |x|, the size of the kernel in this case is bounded by f(k).

This size bound is computable, by the assumption from fixed-parameter tractability that f is computable.
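The construction can be sketched in Python as follows, under the illustrative assumption that the fixed-parameter tractable algorithm is given as a generator fpt_solver(x, k) that yields once per computation step and finally returns True or False; I_YES and I_NO are fixed yes- and no-instances, and c is the exponent from the f(k)·|x|^c bound. All names are hypothetical.

```python
def kernelize_via_fpt(x, k, fpt_solver, c, I_YES, I_NO):
    """Sketch of the proof's construction: simulate the FPT algorithm
    for at most |x|**(c+1) steps and kernelize based on the outcome."""
    budget = len(x) ** (c + 1)       # allowed number of simulated steps
    run = fpt_solver(x, k)
    answer = None
    try:
        for _ in range(budget):
            next(run)                # charge one step of the FPT algorithm
    except StopIteration as done:
        answer = done.value          # solver finished within the budget
    if answer is None:
        # Timeout: f(k) * |x|**c > |x|**(c+1) forces |x| < f(k),
        # so the input itself is already a small enough kernel.
        return (x, k)
    return I_YES if answer else I_NO  # constant-size kernel
```

The step budget |x|^(c+1) is the crux of the argument: it is polynomial in the input size, yet large enough that a timeout certifies |x| < f(k).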

Although the parameter k in the examples above is chosen as the size of the desired solution, this is not necessary. It is also possible to choose a structural complexity measure of the input as the parameter value, leading to so-called structural parameterizations.

This approach is fruitful for instances whose solution size is large, but for which some other complexity measure is bounded.

For example, the feedback vertex number of an undirected graph G is defined as the minimum cardinality of a set of vertices whose removal makes G acyclic.
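As a concrete (if exponential-time) illustration of this definition, the following sketch computes the feedback vertex number by brute force; the edge-list representation and the function names are illustrative assumptions, and the graph is assumed simple.

```python
from itertools import combinations

def is_forest(vertices, edges):
    """A graph is acyclic iff adding its edges to a union-find structure
    never joins two vertices that are already connected."""
    parent = {v: v for v in vertices}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v
    for u, w in edges:
        ru, rw = find(u), find(w)
        if ru == rw:
            return False                   # this edge closes a cycle
        parent[ru] = rw
    return True

def feedback_vertex_number(vertices, edges):
    """Smallest number of vertices whose removal leaves an acyclic graph."""
    for size in range(len(vertices) + 1):
        for removed in combinations(vertices, size):
            s = set(removed)
            kept = [(u, w) for u, w in edges if u not in s and w not in s]
            if is_forest([v for v in vertices if v not in s], kept):
                return size

# A triangle needs one vertex removed to become acyclic.
print(feedback_vertex_number([0, 1, 2], [(0, 1), (1, 2), (2, 0)]))  # 1
```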

The vertex cover problem parameterized by the feedback vertex number of the input graph has a polynomial kernelization:[9] there is a polynomial-time algorithm that, given a graph G whose feedback vertex number is k, outputs a graph G′ on O(k³) vertices such that a minimum vertex cover in G′ can be transformed into a minimum vertex cover for G in polynomial time.

The kernelization algorithm therefore guarantees that instances with a small feedback vertex number k are reduced to instances of size polynomial in k.