[1] First, although some problems may be intractable in the worst case, the inputs that elicit this behavior may rarely occur in practice, so the average-case complexity may be a more accurate measure of an algorithm's performance.
Second, average-case complexity analysis provides tools and techniques to generate hard instances of problems which can be utilized in areas such as cryptography and derandomization.
[2]: 28 The average-case performance of algorithms has been studied since modern notions of computational efficiency were developed in the 1950s.
[3] In 1973, Donald Knuth[4] published Volume 3 of The Art of Computer Programming, which extensively surveys the average-case performance of algorithms for problems solvable in worst-case polynomial time, such as sorting and median-finding.
An initial attempt might define an efficient average-case algorithm as one that runs in expected polynomial time over all possible inputs. Such a definition has various shortcomings; in particular, it is not robust to changes in the computational model.
[3] To create a more robust definition of average-case efficiency, it makes sense to allow an algorithm A to run longer than polynomial time on some inputs, provided that the fraction of inputs on which A requires larger and larger running times becomes smaller and smaller.
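One common way to make this precise, sketched here in the spirit of Levin's notion of polynomial-on-average running time (normalizations vary between presentations), is to require a constant ε > 0 such that, for inputs of length n drawn from the distribution D,

$$\mathbb{E}_{x \sim D_n}\!\left[\, t_A(x)^{\varepsilon} \,\right] \;=\; O(n),$$

which by Markov's inequality forces the fraction of length-n inputs on which A needs time at least t to fall off as $O(n / t^{\varepsilon})$.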
An "average-case" (distributional) problem consists of a language L together with an associated probability distribution D over inputs, forming the pair (L, D).
[7] A distributional problem (L, D) is in the complexity class AvgP if there is an efficient average-case algorithm for L, as defined above.
[7] In his original paper, Levin gave an example of a distributional tiling problem that is average-case NP-complete.
[10] As mentioned above, much early work relating to average-case complexity focused on problems for which polynomial-time algorithms already existed, such as sorting.
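As a concrete example of the gap between worst-case and average-case behavior (an illustrative sketch, not drawn from the cited sources), the following counts pivot comparisons for a deterministic quicksort that always picks the first element as pivot: roughly 2n ln n comparisons on a uniformly random permutation, versus n(n−1)/2 on an already-sorted input.

```python
import random
import sys

def quicksort_first_pivot(a):
    """Deterministic quicksort with the first element as pivot.

    Returns the sorted list and the number of element-vs-pivot
    comparisons made while partitioning.  On a uniformly random
    permutation this is Theta(n log n) on average; on an
    already-sorted input it degrades to Theta(n^2).
    """
    comparisons = 0

    def sort(xs):
        nonlocal comparisons
        if len(xs) <= 1:
            return xs
        pivot, rest = xs[0], xs[1:]
        comparisons += len(rest)            # each element is compared to the pivot once
        smaller = [x for x in rest if x < pivot]
        larger = [x for x in rest if x >= pivot]
        return sort(smaller) + [pivot] + sort(larger)

    return sort(a), comparisons


if __name__ == "__main__":
    sys.setrecursionlimit(10_000)           # sorted input causes recursion depth ~n
    n = 2_000
    _, avg_cmp = quicksort_first_pivot(random.sample(range(n), n))
    _, bad_cmp = quicksort_first_pivot(list(range(n)))
    print(f"random permutation: {avg_cmp} comparisons (about 2n ln n)")
    print(f"sorted input:       {bad_cmp} comparisons (exactly n(n-1)/2)")
```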
Note that when looking for a hard problem on which to base a cryptographic one-way function, it is not desirable for the candidate problem to be NP-complete, since NP-completeness would only guarantee that there is likely no efficient algorithm for solving the problem in the worst case; what is actually wanted is a guarantee that no efficient algorithm can solve the problem over random inputs (i.e., in the average case).
In fact, both the integer factorization and discrete logarithm problems are in NP ∩ coNP and are therefore not believed to be NP-complete, since an NP-complete problem in NP ∩ coNP would imply NP = coNP.
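As a toy illustration (the parameters and function names below are invented for this sketch and are far too small for real use), the asymmetry behind such candidate one-way functions is that modular exponentiation is cheap, while no algorithm polynomial in the bit length of the modulus is known for inverting it on random inputs.

```python
import random

# Toy parameters for illustration only; real systems use moduli of thousands of bits.
P = 2_147_483_647   # the Mersenne prime 2^31 - 1
G = 7               # a primitive root modulo P

def dexp(x: int) -> int:
    """Easy direction: modular exponentiation, polynomial in the bit length of P."""
    return pow(G, x, P)

def dlog_brute_force(y: int) -> int:
    """Obvious inversion: brute-force discrete log, exponential in the bit length of P."""
    acc = 1
    for x in range(P - 1):
        if acc == y:
            return x
        acc = acc * G % P
    raise ValueError("no logarithm found")

if __name__ == "__main__":
    x = random.randrange(1, P - 1)
    y = dexp(x)                 # fast, even at cryptographic sizes
    # dlog_brute_force(y) would take on the order of P iterations (~2 * 10^9 here),
    # and no algorithm polynomial in log P is known; hardness on *random* x is
    # exactly the kind of average-case hardness cryptography needs.
    print(x, y)
```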
Yao's principle, from a 1977 paper by Andrew Yao, shows that for broad classes of computational problems, the average-case complexity of the best deterministic algorithm adapted to a hard input distribution equals the worst-case expected complexity of the best randomized algorithm.
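In one standard formulation (sketched here for a finite set of inputs X and a finite set of deterministic algorithms; the exact conditions vary between treatments), with c(A, x) the cost of deterministic algorithm A on input x, μ ranging over distributions on inputs, and R over randomized algorithms:

$$
\max_{\mu}\ \min_{A}\ \mathbb{E}_{x \sim \mu}\big[c(A,x)\big]
\;=\;
\min_{R}\ \max_{x \in X}\ \mathbb{E}\big[c(R,x)\big].
$$

The left-hand side is the distributional (average-case) complexity against the hardest input distribution; the right-hand side is the worst-case expected complexity of the best randomized algorithm.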
[13] Applying this theory to natural distributional problems remains an outstanding open question.
Further, they show that this conclusion holds under a weaker assumption: if every language in NP is easy on average for decision algorithms with respect to the uniform distribution, then it is also easy on average for search algorithms with respect to the uniform distribution.
[14] Thus, cryptographic one-way functions can exist only if there are distNP problems over the uniform distribution that are hard on average for decision algorithms.
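Schematically, the chain of implications described above can be summarized as follows (with U denoting the uniform distribution; this rendering is an informal paraphrase of the cited results, not a formal statement):

$$
(\mathrm{NP},\,U)\ \text{easy on average for decision}
\;\Longrightarrow\;
(\mathrm{NP},\,U)\ \text{easy on average for search}
\;\Longrightarrow\;
\text{no cryptographic one-way functions exist}.
$$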