Dynamic Markov compression

DMC has a good compression ratio and moderate speed, similar to PPM, but requires somewhat more memory and is not widely implemented.

Some recent implementations include the experimental compression programs hook by Nania Francesco Antonio, ocamyd by Frank Schwellinger, and as a submodel in paq8l by Matt Mahoney.

The predictor accepts an n-bit input string x = x1x2...xn and assigns it a probability p(x), expressed as a product of a series of predictions, p(x1)p(x2|x1)p(x3|x1x2) ... p(xn| x1x2...xn–1).

The arithmetic coder maintains two high precision binary numbers, plow and phigh, representing the possible range for the total probability that the model would assign to all strings lexicographically less than x, given the bits of x seen so far.

The compressed code for x is px, the shortest bit string representing a number between plow and phigh.

It is always possible to find a number in this range no more than one bit longer than the Shannon limit, log2 1 / p(x).

The arithmetic coder then divides the current range, (plow, phigh) into two parts in proportion to p0 and p1.

In the original DMC implementation, the initial table is the set of all contexts of length 8 to 15 bits that begin on a byte boundary.

The counts are floating point numbers initialized to a small nonzero constant such as 0.2.

The counts are not initialized to zero in order to allow values to be coded even if they have not been seen before in the current context.