Star height problem

[1] The first question was answered in the negative when, in 1963, Eggan gave examples of regular languages of star height n for every n. Here, the star height h(L) of a regular language L is defined as the minimum star height among all regular expressions representing L. The first few languages found by Eggan are described in the following by giving a regular expression for each language:

$e_1 = a_1^*$
$e_2 = (a_1^* a_2^* a_3)^*$
$e_3 = ((a_1^* a_2^* a_3)^* (a_4^* a_5^* a_6)^* a_7)^*$

The construction principle for these expressions is that expression $e_{n+1}$ is obtained by concatenating two copies of $e_n$, appropriately renaming the letters of the second copy using fresh alphabet symbols, concatenating the result with another fresh alphabet symbol, and then surrounding the resulting expression with a Kleene star.

The remaining, more difficult part is to prove that for $e_n$ there is no equivalent regular expression of star height less than n; a proof is given in Eggan (1963).
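The construction principle above (two copies, the second one renamed with fresh letters, one more fresh letter, an enclosing Kleene star) can be sketched in Python. This is only an illustration: the class names `Sym`, `Cat`, `Star` and the helpers `shift`, `eggan`, `star_height` and `show` are hypothetical, and the base expression a1* is assumed as the starting point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sym:
    index: int          # the letter a_index


@dataclass(frozen=True)
class Cat:
    parts: tuple        # concatenation of sub-expressions


@dataclass(frozen=True)
class Star:
    inner: object       # Kleene star of a sub-expression


def star_height(e):
    """Nesting depth of Kleene stars in the expression e."""
    if isinstance(e, Sym):
        return 0
    if isinstance(e, Cat):
        return max(star_height(p) for p in e.parts)
    return 1 + star_height(e.inner)


def shift(e, k):
    """Rename every letter a_i to a_(i+k), giving a copy over fresh letters."""
    if isinstance(e, Sym):
        return Sym(e.index + k)
    if isinstance(e, Cat):
        return Cat(tuple(shift(p, k) for p in e.parts))
    return Star(shift(e.inner, k))


def eggan(n):
    """Build the n-th expression, assuming the base case a1*."""
    if n == 1:
        return Star(Sym(1))
    prev = eggan(n - 1)
    used = 2 ** (n - 1) - 1   # number of letters occurring in prev
    return Star(Cat((prev, shift(prev, used), Sym(2 * used + 1))))


def show(e):
    """Render the expression as text, e.g. (a1* a2* a3)*."""
    if isinstance(e, Sym):
        return f"a{e.index}"
    if isinstance(e, Cat):
        return " ".join(show(p) for p in e.parts)
    inner = show(e.inner)
    return f"{inner}*" if isinstance(e.inner, Sym) else f"({inner})*"


print(show(eggan(3)))         # ((a1* a2* a3)* (a4* a5* a6)* a7)*
print(star_height(eggan(4)))  # 4
```

Note that `star_height` measures the star height of one particular expression, which only gives an upper bound on the star height of the language it denotes. That no expression of smaller star height exists is the part that requires Eggan's proof.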

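[2] Their examples can be described by an inductively defined family of regular expressions over the binary alphabet $\{a, b\}$ as follows:

$e_1 = (ab)^*$
$e_2 = (aa(ab)^*bb)^*$
$e_3 = (aaaa(aa(ab)^*bb)^*bbbb)^*$
$e_{n+1} = (a^{2^n} \cdot e_n \cdot b^{2^n})^*$

Again, a rigorous proof is needed for the fact that $e_n$ does not admit an equivalent regular expression of lower star height.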
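The inductive family over the binary alphabet can be explored with Python's `re` module. The sketch below assumes the family starts with (ab)* and that each step wraps the previous expression in 2^k letters a on the left and 2^k letters b on the right before adding a Kleene star. The function names `binary_family` and `nesting_depth` are hypothetical.

```python
import re


def binary_family(n):
    """Return a Python regex for the assumed n-th expression of the family."""
    pattern = "(?:ab)*"
    for k in range(1, n):
        block = 2 ** k
        # Wrap the previous expression: a^(2^k) . previous . b^(2^k), then star.
        pattern = f"(?:{'a' * block}{pattern}{'b' * block})*"
    return pattern


def nesting_depth(pattern):
    """Maximum parenthesis depth; every group built above is starred,
    so for these patterns this equals the star height of the expression."""
    depth = best = 0
    for ch in pattern:
        if ch == "(":
            depth += 1
            best = max(best, depth)
        elif ch == ")":
            depth -= 1
    return best


print(binary_family(3))                  # (?:aaaa(?:aa(?:ab)*bb)*bbbb)*
print(nesting_depth(binary_family(4)))   # 4
print(bool(re.fullmatch(binary_family(2), "aaababbb")))  # True
print(bool(re.fullmatch(binary_family(2), "ab")))        # False
```

As with any such experiment, this only shows that the written expression has star height n. It says nothing about whether an equivalent expression with fewer nested stars exists.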

[4] But the general problem remained open for more than 25 years until it was settled by Hashiguchi, who in 1988 published an algorithm to determine the star height of any regular language.

To illustrate the immense resource consumption of that algorithm, Lombardy & Sakarovitch (2002) give some actual numbers:

"[The procedure described by Hashiguchi] leads to computations that are by far impossible, even for very small examples. For instance, if L is accepted by a 4 state automaton of loop complexity 3 (and with a small 10 element transition monoid), then a very low minorant of the number of languages to be tested with L for equality is: $\left(10^{10^{10}}\right)^{\left(10^{10^{10}}\right)^{\left(10^{10^{10}}\right)}}$."

Note that the number $10^{10^{10}}$ alone has 10 billion zeros when written down in decimal notation, and is already far larger than the number of atoms in the observable universe.
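The size claim can be checked with a little arithmetic on exponents, without ever constructing the number. A power of ten 10^k is written as a 1 followed by k zeros, so the number with 10 billion zeros is 10^(10^10). The sketch below compares it with an assumed order of magnitude of 10^80 for the number of atoms in the observable universe. That figure is a commonly quoted estimate and is not taken from the text.

```python
# Work with exponents only: 10**(10**10) itself is never built.
exponent = 10 ** 10      # 10^(10^10) == 10 ** exponent
zeros = exponent         # 10^k is a 1 followed by k zeros
digits = exponent + 1    # total number of decimal digits

# Assumption: a commonly quoted estimate, atoms ~ 10^80.
ATOMS_LOG10 = 80

print(zeros)                    # 10000000000, i.e. 10 billion
print(digits)                   # 10000000001
print(exponent > ATOMS_LOG10)   # True: 10^(10^10) exceeds 10^80
```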

A much more efficient algorithm than Hashiguchi's procedure was devised by Kirsten in 2005.[6] This algorithm runs, for a given nondeterministic finite automaton as input, within double-exponential space. Yet the resource requirements of this algorithm still greatly exceed the margins of what is considered practically feasible.

This algorithm has been optimized and generalized to trees by Colcombet and Löding in 2008,[7] as part of the theory of regular cost functions.