When implemented efficiently, it is fast enough that its benefits usually justify including it as an extra step in a data compression algorithm.
We output a 1 to the stream: the b moves to the front of the list, producing (bacdefghijklmnopqrstuvwxyz).
Continuing in this way, we encode the remaining symbols one at a time. It is easy to see that the transform is reversible: the decoder maintains an identical list, replaces each index with the symbol currently at that position, and then applies the same move-to-front update.
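The reversal can be sketched as follows. This is a minimal illustration, not code from the article; the function name and the sample index sequence are chosen for demonstration. The key point is that the decoder starts from the same initial list and performs the same move-to-front updates as the encoder, so the two lists stay in lockstep.

```python
def mtf_decode(indices, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Invert the move-to-front transform (illustrative sketch)."""
    table = list(alphabet)        # same initial list as the encoder
    output = []
    for index in indices:
        symbol = table[index]     # the index tells us which symbol was seen
        output.append(symbol)
        table.pop(index)          # apply the identical move-to-front update
        table.insert(0, symbol)
    return "".join(output)
```

For example, decoding the index sequence [1, 1, 13, 1, 1, 1, 0, 0] over the lowercase alphabet recovers the string "bananaaa".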
For encoding, no clear advantage is gained by using a linked list, so using an array to store the list is acceptable, with worst-case performance O(nk), where n is the length of the data to be encoded and k is the number of values (generally a constant for a given implementation).
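The array-based encoder described above can be sketched as follows. This is an illustrative implementation, not the article's own code; the string "bananaaa" is used here purely as a worked example. The linear scan for each symbol is what gives the O(nk) worst case.

```python
def mtf_encode(data, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Move-to-front encode using a plain array for the list (sketch)."""
    table = list(alphabet)           # working list, mutated in place
    output = []
    for symbol in data:
        index = table.index(symbol)  # O(k) linear scan -> O(nk) overall
        output.append(index)
        table.pop(index)             # move the just-seen symbol
        table.insert(0, symbol)      # to the front of the list
    return output
```

Encoding "bananaaa" over the lowercase alphabet yields [1, 1, 13, 1, 1, 1, 0, 0]: once a symbol has been seen recently, its index is small.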
Typical performance is better than this worst case, because frequently used symbols are more likely to sit near the front of the list, so the linear scan finds them after only a few comparisons.
If one rotates the dictionary to put the more-used characters in earlier places, a better encoding can be obtained. The MTF transform takes advantage of local correlation of frequencies to reduce the entropy of a message.
Indeed, recently used letters stay toward the front of the list; if the use of letters exhibits local correlations, this results in a large number of small numbers such as 0s and 1s in the output.
However, not all data exhibits this type of local correlation, and for some messages, the MTF transform may actually increase the entropy.
The Burrows–Wheeler transform is very good at producing a sequence that exhibits local frequency correlation from text and certain other special classes of data.
Such preprocessing matters because English text does not, in general, exhibit a high level of local frequency correlation on its own.