Set cover problem

The set cover problem is a classical question in combinatorics, computer science, operations research, and complexity theory.

Given a set of elements {1, 2, …, n} (henceforth referred to as the universe, specifying all possible elements under consideration) and a collection, referred to as S, of a given m subsets whose union equals the universe, the set cover problem is to identify a smallest sub-collection of S whose union equals the universe.

Therefore, the solution to the set cover problem for this U and S has size 2.

The decision version of set covering is NP-complete.

[1] It is a problem "whose study has led to the development of fundamental techniques for the entire field" of approximation algorithms.

, such that for each element x in the universe, the sum of fractions of sets that contain x is at least 1.

The set cover problem can be formulated as the following integer linear program (ILP).

[3] For a more compact representation of the covering constraint, one can define an incidence matrix

Weighted set cover is described by a program identical to the one given above, except that the objective function to minimize is

Fractional set cover is described by a program identical to the one given above, except that

This linear program belongs to the more general class of LPs for covering problems, as all the coefficients in the objective function and both sides of the constraints are non-negative.

approximation algorithm for the minimum set cover problem.

[4] See randomized rounding#setcover for a detailed explanation.

That is seen by observing that an instance of set covering can be viewed as an arbitrary bipartite graph, with the universe represented by vertices on the left, the sets represented by vertices on the right, and edges representing the membership of elements to sets.

The task is then to find a minimum cardinality subset of left-vertices that has a non-trivial intersection with each of the right-vertices, which is precisely the hitting set problem.

This greedy algorithm actually achieves an approximation ratio of

[8] There is a standard example on which the greedy algorithm achieves an approximation ratio of

On this input, the greedy algorithm takes the sets

Inapproximability results show that the greedy algorithm is essentially the best-possible polynomial time approximation algorithm for set cover up to lower order terms (see Inapproximability results below), under plausible complexity assumptions.

A tighter analysis for the greedy algorithm shows that the approximation ratio is exactly

[9] If each element occurs in at most f sets, then a solution can be found in polynomial time that approximates the optimum to within a factor of f using LP relaxation.

refers to the size of the universe, Lund & Yannakakis (1994) showed that set covering cannot be approximated in polynomial time to within a factor of

under the same assumptions, which essentially matches the approximation ratio achieved by the greedy algorithm.

Raz & Safra (1997) established a lower bound of

was recently proved by Alon, Moshkovitz & Safra (2006).

Dinur & Steurer (2013) showed optimal inapproximability by proving that it cannot be approximated to

In low-frequency systems, Dinur et al. (2003) proved it is NP-hard to approximate set cover to better than

If the Unique games conjecture is true, this can be improved to

of the greedy algorithm essentially tight in this case.

Relaxing the integer linear program for weighted set cover stated above, one may use randomized rounding to get an

Example of an instance of set cover problem.

Tight example for the greedy algorithm with k=3