Smoothed analysis

In theoretical computer science, smoothed analysis is a way of measuring the complexity of an algorithm.

Since its introduction in 2001, smoothed analysis has been used as a basis for considerable research, for problems ranging from mathematical programming and numerical analysis to machine learning and data mining.[1]

It can give a more realistic analysis of the practical performance (e.g., running time, success rate, approximation quality) of an algorithm than analysis that uses worst-case or average-case scenarios.

Smoothed analysis is a hybrid of worst-case and average-case analyses that inherits advantages of both.

It measures the expected performance of algorithms under slight random perturbations of worst-case inputs.

If the smoothed complexity of an algorithm is low, then it is unlikely that the algorithm will take a long time to solve practical instances whose data are subject to slight noise and imprecision.

Smoothed complexity results are strong probabilistic results, roughly stating that, in every large enough neighbourhood of the space of inputs, most inputs are easily solvable.

Thus, a low smoothed complexity means that the hardness of inputs is a "brittle" property.
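The idea of measuring expected performance under small perturbations of a bad input can be illustrated with a toy experiment. The sketch below is only an illustration under stated assumptions: the choice of quicksort with a first-element pivot, the helper names `quicksort_comparisons` and `smoothed_cost`, and the parameter values are all hypothetical and are not taken from the smoothed-analysis literature. It also estimates the expected cost around a single fixed instance, whereas smoothed complexity takes the maximum of this expectation over all instances.

```python
import random

def quicksort_comparisons(values):
    """Count comparisons made by quicksort with a first-element pivot."""
    comparisons = 0
    stack = [list(values)]          # explicit stack avoids deep recursion
    while stack:
        part = stack.pop()
        if len(part) <= 1:
            continue
        pivot, rest = part[0], part[1:]
        comparisons += len(rest)    # every remaining element is compared to the pivot
        stack.append([v for v in rest if v < pivot])
        stack.append([v for v in rest if v >= pivot])
    return comparisons

def smoothed_cost(instance, sigma, trials=50, seed=0):
    """Monte Carlo estimate of the expected cost of instance + Gaussian noise."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        perturbed = [v + rng.gauss(0.0, sigma) for v in instance]
        total += quicksort_comparisons(perturbed)
    return total / trials

n = 200
worst_case_input = [i / n for i in range(n)]   # sorted input: bad for this pivot rule
unperturbed = quicksort_comparisons(worst_case_input)
smoothed = smoothed_cost(worst_case_input, sigma=0.5)
print(unperturbed, smoothed)
```

On the sorted input every partition is maximally unbalanced, so the comparison count is quadratic in `n`; after perturbation the order is close to random and the estimated cost drops sharply, which is the "brittleness" of hard inputs described above.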

Although worst-case complexity has been widely successful in explaining the practical performance of many algorithms, this style of analysis gives misleading results for a number of problems.

For example, the worst-case complexity of solving a linear program using the simplex algorithm is exponential,[2] although the observed number of steps in practice is roughly linear.[3][4]

The simplex algorithm is in fact much faster than the ellipsoid method in practice, although the latter has polynomial-time worst-case complexity.

Average-case analysis was introduced to overcome the limitations of worst-case analysis, but the resulting average-case complexity depends heavily on the probability distribution that is chosen over the input.

Because of this choice of data model, a theoretical average-case result might say little about practical performance of the algorithm.

The ACM and the European Association for Theoretical Computer Science awarded the 2008 Gödel Prize to Daniel Spielman and Shanghua Teng for developing smoothed analysis.[1]

In 2010 Spielman received the Nevanlinna Prize for developing smoothed analysis.

Spielman and Teng's JACM paper "Smoothed analysis of algorithms: Why the simplex algorithm usually takes polynomial time" was also one of the three winners of the 2009 Fulkerson Prize sponsored jointly by the Mathematical Programming Society (MPS) and the American Mathematical Society (AMS).

On practical problems, the number of steps taken by the simplex algorithm is roughly linear,[3][4] yet in the theoretical worst case it takes exponentially many steps for most successfully analyzed pivot rules.

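For the perturbation model, assume that the unperturbed data $\bar{A} \in \mathbb{R}^{n \times d}$, $\bar{b} \in \mathbb{R}^{n}$, $c \in \mathbb{R}^{d}$ is normalized so that $\|(\bar{a}_i, \bar{b}_i)\|_2 \le 1$ for every row $(\bar{a}_i, \bar{b}_i)$ of the matrix $(\bar{A}, \bar{b})$.

The noise $(\hat{A}, \hat{b})$ has independent entries sampled from a Gaussian distribution with mean $0$ and standard deviation $\sigma$.

The smoothed input is the linear program that maximizes $c^{T} x$ subject to $A x \le b$, where $A = \bar{A} + \hat{A}$ and $b = \bar{b} + \hat{b}$.

If the running time of the algorithm on data $A, b, c$ is given by $T(A, b, c)$, then the smoothed complexity of the simplex method is[6]

$$C_s(n, d, \sigma) := \max_{\bar{A}, \bar{b}, c} \; \mathbb{E}_{\hat{A}, \hat{b}}\left[ T(\bar{A} + \hat{A}, \bar{b} + \hat{b}, c) \right] = \mathrm{poly}(d, \log n, \sigma^{-1}).$$

This bound holds for a specific pivot rule called the shadow vertex rule.[8]

A number of local search algorithms have bad worst-case running times but perform well in practice.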

One example is the 2-opt heuristic for the traveling salesman problem.

It can take exponentially many iterations until it finds a locally optimal solution, although in practice the running time is subquadratic in the number of vertices.[10]

The approximation ratio, which is the ratio between the length of the output of the algorithm and the length of the optimal solution, tends to be good in practice but can also be bad in the theoretical worst case.
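For readers unfamiliar with the 2-opt heuristic, the following is a minimal sketch of it rather than a reference implementation. It assumes Euclidean distances in the plane and points drawn uniformly at random from the unit square; the function names `tour_length` and `two_opt`, the tolerance `1e-12`, and the instance size are illustrative choices. Each move removes two tour edges and reconnects the tour so that its length strictly decreases, and the function counts how many such moves are made before no improving move remains.

```python
import math
import random

def tour_length(points, tour):
    """Total Euclidean length of the closed tour."""
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def two_opt(points, tour):
    """Apply improving 2-opt moves until none remains; return (tour, moves)."""
    tour = list(tour)
    n = len(tour)
    moves = 0
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two edges share a vertex
                a, b = points[tour[i]], points[tour[i + 1]]
                c, d = points[tour[j]], points[tour[(j + 1) % n]]
                # Change in length if edges (a, b), (c, d) become (a, c), (b, d).
                delta = (math.dist(a, c) + math.dist(b, d)
                         - math.dist(a, b) - math.dist(c, d))
                if delta < -1e-12:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    moves += 1
                    improved = True
    return tour, moves

rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(60)]  # assumed random instance
start = list(range(len(points)))
final, moves = two_opt(points, start)
print(moves, tour_length(points, start), tour_length(points, final))
```

The returned tour is locally optimal with respect to 2-opt moves, and `moves` is the quantity whose worst-case and smoothed behaviour the text discusses.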

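One class of problem instances is given by $n$ points in the box $[0,1]^d$, where the pairwise distances come from a norm.

Already in two dimensions, the 2-opt heuristic might take exponentially many iterations until finding a local optimum.

In this setting, one can analyze the perturbation model in which the vertices $v_1, \dots, v_n$ are independently sampled according to probability distributions with probability density functions $f_1, \dots, f_n : [0,1]^d \to [0, \theta]$.

For $\theta = 1$, the points are uniformly distributed; when $\theta > 1$ is big, the adversary has more ability to increase the likelihood of hard problem instances.

In this perturbation model, the expected number of iterations of the 2-opt heuristic, as well as the approximation ratios of resulting output, are bounded by polynomial functions of $n$ and $\theta$.[10]

Another local search algorithm for which smoothed analysis was successful is the k-means method.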

Lloyd's algorithm is widely used and very fast in practice, although on $n$ input points it can take $e^{\Omega(n)}$ iterations in the worst case to find a locally optimal solution.

However, assuming that the points have independent Gaussian distributions, each with expectation in $[0,1]^d$ and standard deviation $\sigma$, the expected number of iterations of the algorithm is bounded by a polynomial in $n$, $d$ and $1/\sigma$.
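The perturbation model for Lloyd's algorithm can be tried out with a short experiment. The sketch below is illustrative only: the function names `lloyd` and `perturb`, the choice of three clusters, the initialization from the first three points, and the noise level are assumptions made for the example, and the base points are random rather than adversarially chosen. It runs the standard assignment and update steps on Gaussian-perturbed points and reports how many rounds pass before the clustering stops changing.

```python
import random

def lloyd(points, centers, max_iter=1000):
    """Run Lloyd's algorithm; return (centers, assignment, rounds until stable)."""
    assignment = None
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point goes to its nearest center.
        new_assignment = [
            min(range(len(centers)),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centers[c])))
            for point in points
        ]
        if new_assignment == assignment:
            return centers, assignment, iteration - 1
        assignment = new_assignment
        # Update step: move each center to the mean of its cluster.
        new_centers = []
        for c, old in enumerate(centers):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(old)  # keep the center of an empty cluster in place
        centers = new_centers
    return centers, assignment, max_iter

def perturb(points, sigma, rng):
    """Add independent Gaussian noise with standard deviation sigma to every coordinate."""
    return [tuple(x + rng.gauss(0.0, sigma) for x in p) for p in points]

rng = random.Random(1)
base = [(rng.random(), rng.random()) for _ in range(100)]  # stand-in for chosen means in [0,1]^2
noisy = perturb(base, sigma=0.05, rng=rng)
centers, labels, iterations = lloyd(noisy, centers=noisy[:3])
print(iterations)
```

Repeating the run for several seeds and values of `sigma` gives an empirical view of the expected iteration count that the smoothed bound controls.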