A randomness extractor, often simply called an "extractor", is a function which, when applied to output from a weak entropy source together with a short, uniformly random seed, generates a highly random output that appears independent of the source and uniformly distributed.
[1] Examples of weakly random sources include radioactive decay or thermal noise; the only restriction on possible sources is that there is no way they can be fully controlled, calculated or predicted, and that a lower bound on their entropy rate can be established.
Sometimes the term "bias" is used to denote a weakly random source's departure from uniformity, and in older literature, some extractors are called unbiasing algorithms,[2] as they take the randomness from a so-called "biased" source and output a distribution that appears unbiased.
The input drawn from the weakly random source will always be longer than the extractor's output, but an efficient extractor is one that keeps this ratio of input length to output length as small as possible, while simultaneously keeping the seed length low.
An extractor has some conceptual similarities with a pseudorandom generator (PRG), but the two concepts are not identical.
(When a PRG is based on the existence of hard-core predicates, one can think of the weakly random source as a set of truth tables of such predicates and prove that the output is statistically close to uniform.
[3]) However, the general PRG definition does not specify that a weakly random source must be used, and while the output of an extractor must be statistically close to uniform, the output of a PRG is only required to be computationally indistinguishable from uniform, a somewhat weaker condition.
Using the probabilistic method, it can be shown that there exists a (k, ε)-extractor, i.e. that the construction is possible. However, it is usually not enough merely to show that an extractor exists; an explicit, efficiently computable construction is needed, which is given as follows: Definition (Explicit Extractor): For functions k(n), ε(n), d(n), m(n), a family Ext = {Ext_n} of functions is an explicit (k, ε)-extractor if Ext(x, y) can be computed in polynomial time (in its input length) and, for every n, Ext_n is a (k(n), ε(n))-extractor.
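One explicit construction that is widely used in practice comes from the leftover hash lemma: hashing the weak source with a function selected by the seed from a 2-universal family, for example a random binary Toeplitz matrix, yields a strong extractor whenever the output length is sufficiently smaller than the source's min-entropy. The following Python sketch is only an illustration of that idea and is not a construction taken from this article; the name toeplitz_extract, the bit-list interface, and the lengths in the usage example are assumptions made for the example.

```python
import secrets

def toeplitz_extract(x_bits, seed_bits, m):
    """Hash n weakly random bits down to m nearly uniform bits.

    The seed supplies the n + m - 1 bits that define an m-by-n binary
    Toeplitz matrix; the output is the matrix-vector product over GF(2).
    Toeplitz matrices form a 2-universal hash family, so by the leftover
    hash lemma this behaves as a strong extractor when m is sufficiently
    smaller than the min-entropy of the source.
    """
    n = len(x_bits)
    assert len(seed_bits) == n + m - 1
    out = []
    for i in range(m):
        bit = 0
        for j in range(n):
            # Entry (i, j) of the Toeplitz matrix, read from the seed;
            # it depends only on i - j, as a Toeplitz matrix requires.
            bit ^= seed_bits[i - j + n - 1] & x_bits[j]
        out.append(bit)
    return out

# Illustrative use: 256 raw bits, assumed to carry well over 100 bits of
# min-entropy, are condensed to 100 output bits with a fresh uniform seed.
raw_bits = [secrets.randbits(1) for _ in range(256)]  # stand-in for a weak source
seed = [secrets.randbits(1) for _ in range(256 + 100 - 1)]
output = toeplitz_extract(raw_bits, seed, 100)
```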
By the probabilistic method, it can be shown that there exists a (k, ε)-extractor with seed length d = log(n − k) + 2 log(1/ε) + O(1) and output length m = k + d − 2 log(1/ε) − O(1). A variant of the randomness extractor with weaker properties is the disperser.
[5] It is often necessary to generate secret and random keys from sources that are semi-secret or which may be compromised to some degree.
More specifically, when a strong extractor is used, its output will appear to be uniformly random even to someone who sees part (but not all) of the source.
Exposure-Resilient cryptography takes into account the fact that it is difficult to keep secret the initial exchange of data which often takes place during the initialization of an encryption application; e.g., the sender of encrypted information has to provide the receivers with the information required for decryption.
The following paragraphs define and establish an important relationship between two kinds of ERF (exposure-resilient function), the k-ERF and the k-APRF, which are useful in Exposure-Resilient cryptography.
The goal is to construct an adaptive ERF whose output is highly random and uniformly distributed.
But a stronger condition is often needed, in which every output occurs with almost uniform probability; this stronger notion is captured by the k-APRF (almost-perfect resilient function).
More specifically, any extractor having sufficiently small error and taking as input an oblivious, bit-fixing source is also an APRF and therefore also a k-ERF.
That this extractor fulfills the criteria of the lemma follows immediately, which proves that there exists an explicit k-APRF extractor with the given properties.
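As a toy illustration of extraction from an oblivious, bit-fixing source (this parity example is an assumption made for illustration, not the construction referred to above): each bit of such a source is either fixed in advance or uniformly random and independent of the rest, so as long as at least one bit is free, the parity of the whole string is an unbiased bit.

```python
def parity_extract(bits):
    """Extract one bit as the parity (XOR) of all input bits.

    For an oblivious, bit-fixing source the fixed bits contribute a
    constant, and XOR-ing a constant with at least one uniform,
    independent bit is still uniform, so the output bit is unbiased.
    """
    out = 0
    for b in bits:
        out ^= b
    return out
```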
Perhaps the earliest example is due to John von Neumann.
From the input stream, his extractor took bits two at a time (first and second, then third and fourth, and so on); if the two bits matched, no output was generated, and if they differed, the value of the first bit was output.
More generally, it applies to any exchangeable sequence; it relies only on the fact that for any pair, 01 and 10 are equally likely. For independent trials in which each bit is 1 with fixed probability p, these pairs occur with probabilities (1 − p)p and p(1 − p) respectively.
To put it simply, because the bits are statistically independent and multiplication is commutative, it follows that P(01) = (1 − p)p = p(1 − p) = P(10), so the extractor outputs 0 and 1 with equal probability.
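A minimal Python sketch of this pairwise procedure (the function name and the bit-list interface are illustrative choices, not part of von Neumann's description):

```python
def von_neumann_extract(bits):
    """Von Neumann extractor over non-overlapping pairs of input bits.

    A differing pair outputs its first bit (01 -> 0, 10 -> 1); matching
    pairs (00 and 11) are discarded.  For independent bits with a fixed
    bias p, both surviving pairs occur with probability p(1 - p), so the
    output bits are unbiased.
    """
    out = []
    for i in range(0, len(bits) - 1, 2):
        first, second = bits[i], bits[i + 1]
        if first != second:
            out.append(first)
    return out

# Example: the pairs (1,1), (1,0), (0,1), (0,0) yield the output bits [1, 0].
assert von_neumann_extract([1, 1, 1, 0, 0, 1, 0, 0]) == [1, 0]
```

Note that the procedure discards at least half of the input pairs in expectation, which is the cost of removing the bias.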
[8] Another approach is to use the output of a chaos machine applied to the input stream.
Input bits are pushed to the machine, evolving orbits and trajectories in multiple dynamical systems.
Additionally, this scheme allows for increased complexity, quality, and security of the output stream, controlled by specifying three parameters: time cost, memory required, and secret key.
Note that while true chaotic systems are mathematically sound for 'amplifying' entropy, this is predicated on the availability of real numbers with infinite precision.
When implemented in digital computers with finite-precision number representations, as in chaos machines using IEEE 754 floating point, the periodicity has been shown to fall far short of the full space for a given bit length.
[citation needed] Randomness extractors are used widely in cryptographic applications, whereby a cryptographic hash function is applied to a high-entropy, but non-uniform source, such as disk drive timing information or keyboard delays, to yield a uniformly random result.
[10] Randomness extraction is also used in some branches of computational complexity theory and in the construction of list-decodable error correcting codes.
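A minimal sketch of the hash-based pattern described above, assuming SHA-256 as the conditioning function and treating the raw timing samples as an opaque byte string; the function name, the salt parameter (which plays the role of a seed, as in HKDF-style designs), and the sample values are illustrative assumptions:

```python
import hashlib

def hash_extract(samples: bytes, salt: bytes = b"") -> bytes:
    """Condense high-entropy but non-uniform bytes into a 32-byte value.

    This models the common practice of feeding raw event timings through a
    cryptographic hash; the optional salt plays the role of the seed.
    """
    return hashlib.sha256(salt + samples).digest()

# Example: raw inter-keystroke delays in microseconds (made-up values).
delays = [15321, 9874, 22010, 13456]
raw = b"".join(d.to_bytes(4, "big") for d in delays)
key = hash_extract(raw, salt=b"example-salt")
```

The output is only as unpredictable as the min-entropy actually present in the samples; hashing does not create entropy, it only concentrates it.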