Lexicographic order

The formal notion starts with a finite set A, often called the alphabet, which is totally ordered.

The lexicographical order on the set of all these finite words orders the words as follows: However, in combinatorics, another convention is frequently used for the second case, whereby a shorter sequence is always smaller than a longer sequence.

An important property of the lexicographical order is that for each n, the set of words of length n is well-ordered by the lexicographical order (provided the alphabet is finite); that is, every decreasing sequence of words of length n is finite (or equivalently, every non-empty subset has a least element).

[1][2] It is not true that the set of all finite words is well-ordered; for example, the infinite set of words {b, ab, aab, aaab, ... } has no lexicographically earliest element.

The lexicographical order is used not only in dictionaries, but also commonly for numbers and dates.

One of the drawbacks of the Roman numeral system is that it is not always immediately obvious which of two numbers is the smaller.

On the other hand, with the positional notation of the Hindu–Arabic numeral system, comparing numbers is easy, because the natural order on natural numbers is the same as the variant shortlex of the lexicographic order.

For real numbers written in decimal notation, a slightly different variant of the lexicographical order is used: the parts on the left of the decimal point are compared as before; if they are equal, the parts at the right of the decimal point are compared with the lexicographical order.

This is not usually a problem for humans, but it may be for computers (testing the sign takes some time).

This is one of the reasons for adopting two's complement representation for representing signed integers in computers.

This formatting scheme has the advantage that the lexicographical order on sequences of characters that represent dates coincides with the chronological order: an earlier CE date is smaller in the lexicographical order than a later date up to year 9999.

This date ordering makes computerized sorting of dates easier by avoiding the need for a separate sorting algorithm.

That is, the elements of the monoid are the finite sequences (words) of elements of A (including the empty sequence, of length 0), and the operation (multiplication) is the concatenation of words.

With this terminology, the above definition of the lexicographical order becomes more concise: Given a partially or totally ordered set A, and two words a and b over A such that b is non-empty, then one has a < b under lexicographical order, if at least one of the following conditions is satisfied: Notice that, due to the prefix condition in this definition,

For instance, if A = {a, b}, the language {anb | n ≥ 0, b > ε} has no least element in the lexicographical order: ... < aab < ab < b.

One can define similarly the lexicographic order on the Cartesian product of an infinite family of ordered sets, if the family is indexed by the natural numbers, or more generally by a well-ordered set.

In combinatorics, one has often to enumerate, and therefore to order the finite subsets of a given set

In this context, one generally prefer to sort first the subsets by cardinality, such as in the shortlex order.

linear forms with real coefficients, define a map from

linear forms with real coefficients, such that the induced map

has the following properties; The colexicographic or colex order is a variant of the lexicographical order that is obtained by reading finite sequences from the right to the left instead of reading them from the left to the right.

However, when considering increasing sequences, typically for coding subsets, the two orders differ significantly.

In other words, the colexicographical order for increasing sequences of a given length induces an order isomorphism with the natural numbers, and allows enumerating these sequences.

When considering polynomials, the order of the terms does not matter in general, as the addition is commutative.

However, some algorithms, such as polynomial long division, require the terms to be in a specific order.

Many of the main algorithms for multivariate polynomials are related with Gröbner bases, concept that requires the choice of a monomial order, that is a total order, which is compatible with the monoid structure of the monomials.

This compatibility implies that the product of a polynomial by a monomial does not change the order of the terms.

As Gröbner bases are defined for polynomials in a fixed number of variables, it is common to identify monomials (for example

Another one consists in comparing first the total degrees, and then resolving the conflicts by using the lexicographical order.

A useful property of the degree reverse lexicographical order is that a homogeneous polynomial is a multiple of the least indeterminate if and only if its leading monomial (its greater monomial) is a multiple of this least indeterminate.

Orderings of the 3- subsets of represented as sets of red squares, increasing sequences (in blue), or by their indicator functions , converted in decimal notation (in grey). The grey numbers are also the rank of the subsets in all subsets of numbered in colexicographical order, and starting from 0. The lexicographical (lex) and colexicographical (colex) orders are on the top and the corresponding reverse orders (rev) on the bottom
One passes from an order to its reverse order, either by reading bottom-up instead of up-bottom, or by exchanging red and white colors.
Orderings of the 24 permutations of {1,...,5} that are 5-cycles (in blue). The inversion vectors (in red) of permutations in colex order are in revcolex order, and vice versa.