In graph theory, this means finding a set of k vertices for which the largest distance of any point to its closest vertex in the k-set is minimum.
The vertices must be in a metric space, providing a complete graph that satisfies the triangle inequality.
[4] The k-Center Clustering problem can also be defined on a complete undirected graph G = (V, E) as follows: Given a complete undirected graph G = (V, E) with distances d(vi, vj) ∈ N satisfying the triangle inequality, find a subset C ⊆ V with |C| = k while minimizing: In a complete undirected graph G = (V, E), if we sort the edges in non-decreasing order of the distances: d(e1) ≤ d(e2) ≤ ... ≤ d(em) and let Gi = (V, Ei), where Ei = {e1, e2, ..., ei}.
The k-center problem is equivalent to finding the smallest index i such that Gi has a dominating set of size at most k. [5] Although Dominating Set is NP-complete, the k-center problem remains NP-hard.
This is clear, since the optimality of a given feasible solution for the k-center problem can be determined through the Dominating Set reduction only if we know in first place the size of the optimal solution (i.e. the smallest index i such that Gi has a dominating set of size at most k), which is precisely the difficult core of the NP-Hard problems.
[4] Another algorithm with the same approximation factor takes advantage of the fact that the k-Center problem is equivalent to finding the smallest index i such that Gi has a dominating set of size at most k and computes a maximal independent set of Gi, looking for the smallest index i that has a maximal independent set with a size of at least k. [7] It is not possible to find an approximation algorithm with an approximation factor of 2 − ε for any ε > 0, unless P = NP.
[8] Furthermore, the distances of all edges in G must satisfy the triangle inequality if the k-center problem is to be approximated within any constant factor, unless P = NP.
[11] When considering the combined parameter given by k and the doubling dimension, k-Center is still W[1]-hard but it is possible to obtain a parameterized approximation scheme.
[12] This is even possible for the variant with vertex capacities, which bound how many vertices can be assigned to an opened center of the solution.
, the vertex k-center problem can not be (optimally) solved in polynomial time.
However, there are some polynomial time approximation algorithms that get near-optimal solutions.
the best possible solution that can be achieved by a polynomial time algorithm is a 2-approximated one.
[15][16] Even though these algorithms are the (polynomial) best possible ones, their performance on most benchmark datasets is very deficient.
Contrary to common sense, one of the most practical (polynomial) heuristics for the vertex k-center problem is based on the CDS algorithm, which is a 3-approximation algorithm[19] Formally characterized by David Shmoys in 1995,[18] the Sh algorithm takes as input a complete undirected graph
The Sh algorithm works as follows: selects the first center
However, by running a binary search over the ordered set of edge costs, its complexity is reduced to
Proposed independently by Teofilo Gonzalez,[15] and by Martin Dyer and Alan Frieze[16] in 1985, the Gon algorithm is basically a more powerful version of the Sh algorithm.
, the Gon algorithm prescinds from such guess by noticing that if any set of vertices at distance greater than
Therefore, instead of computing at each iteration the set of vertices at distance greater than
However, by performing a binary search over the ordered set of edge costs, a more efficiente heuristic named CDSh is proposed.
[21] When considering the combined parameter given by k and the doubling dimension, k-Center is still W[1]-hard but it is possible to obtain a parameterized approximation scheme.
[22] This is even possible for the variant with vertex capacities, which bound how many vertices can be assigned to an opened center of the solution.
[25] Table 1 shows the mean and standard deviation of the experimental approximation factors of the solutions generated by each algorithm over the 40 pmed instances from OR-Lib[19] The greedy pure algorithm (or Gr) follows the core idea of greedy algorithms: to take optimal local decisions.
In the case of the vertex k-center problem, the optimal local decision consists in selecting each center in such a way that the size of the solution (covering radius) is minimum at each iteration.
[26] The empirical performance of the Gr algorithm is poor on most benchmark instances.
The Scoring algorithm (or Scr) was introduced by Jurij Mihelič and Borut Robič in 2005.
The problem is solved by pruning the input graph with every possible value of the optimal solution size and then solving the minimum dominating set problem heuristically.
This heuristic follows the lazy principle, which takes every decision as slow as possible (opossed to the greedy strategy).
The empirical performance of the Scr algorithm is very good on most benchmark instances.
However, its running time rapidly becomes impractical as the input grows.