Deterministic finite automaton

In search of the simplest models to capture finite-state machines, Warren McCulloch and Walter Pitts were among the first researchers to introduce a concept similar to finite automata in 1943.[2][3]

The figure illustrates a deterministic finite automaton using a state diagram.

Upon reading a symbol, a DFA jumps deterministically from one state to another by following the transition arrow.

A DFA has a start state (denoted graphically by an arrow coming in from nowhere) where computations begin, and a set of accept states (denoted graphically by a double circle) which help define when a computation is successful.

A DFA is defined as an abstract mathematical concept, but is often implemented in hardware and software for solving various specific problems such as lexical analysis and pattern matching.

For example, a DFA can model software that decides whether online user input, such as an email address, is syntactically valid.[4]

DFAs have been generalized to nondeterministic finite automata (NFAs), which may have several arrows with the same label starting from a state.

Using the powerset construction method, every NFA can be translated to a DFA that recognizes the same language.
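The powerset construction can be sketched in a few lines. The following is a minimal illustration, assuming an NFA without ε-moves whose transitions are stored in a dictionary; the function names and the three-state example NFA (which accepts binary strings whose second-to-last symbol is 1) are hypothetical and not taken from the article.

```python
from collections import deque


def nfa_to_dfa(alphabet, delta, start, accepting):
    """Subset construction for an NFA without epsilon-moves (assumed encoding).

    delta maps (state, symbol) to a set of successor states; a missing key
    means that the NFA has no transition for that pair.
    """
    start_set = frozenset([start])
    dfa_delta = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            # The DFA state reached is the set of all NFA states reachable
            # from any member of the current set on this symbol.
            target = frozenset(
                q for s in current for q in delta.get((s, symbol), ())
            )
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A subset is accepting if it contains at least one accepting NFA state.
    dfa_accepting = {subset for subset in seen if subset & set(accepting)}
    return seen, dfa_delta, start_set, dfa_accepting


def dfa_accepts(dfa_delta, start, accepting, word):
    """Run the resulting DFA on a word."""
    state = start
    for symbol in word:
        state = dfa_delta[(state, symbol)]
    return state in accepting


# Hypothetical NFA: the second-to-last symbol of the input must be 1.
nfa_delta = {
    ("a", "0"): {"a"},
    ("a", "1"): {"a", "b"},
    ("b", "0"): {"c"},
    ("b", "1"): {"c"},
}
states, dfa_delta, dfa_start, dfa_accepting = nfa_to_dfa(
    "01", nfa_delta, "a", {"c"}
)
```

Only subsets reachable from the start set are generated, so this example yields four DFA states rather than all eight subsets of the three NFA states.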

The following example is of a DFA M, with a binary alphabet, which requires that the input contains an even number of 0s.
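A DFA of this kind can be written down as a transition table. The sketch below is one possible encoding; the state names S1 and S2 and the dictionary layout are assumptions made for illustration, with S1 meaning "an even number of 0s has been read so far".

```python
# State names are illustrative assumptions:
#   S1 = even number of 0s read so far (start state, accepting)
#   S2 = odd number of 0s read so far
START = "S1"
ACCEPTING = {"S1"}
DELTA = {
    ("S1", "0"): "S2",
    ("S1", "1"): "S1",
    ("S2", "0"): "S1",
    ("S2", "1"): "S2",
}


def accepts(word):
    """Return True if the binary word contains an even number of 0s."""
    state = START
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state in ACCEPTING
```

Reading a 1 never changes the state, while each 0 toggles between the two states, so the final state records only the parity of the number of 0s.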

According to the above definition, deterministic finite automata are always complete: they define from each state a transition for each input symbol.

While this is the most common definition, some authors use the term deterministic finite automaton for a slightly different notion: an automaton that defines at most one transition for each state and each input symbol; the transition function is allowed to be partial.
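The two notions are closely related: a partial transition function can be completed by routing every missing transition to an extra non-accepting state. The sketch below assumes a dictionary encoding of the transition function; the function name `complete` and the state name `"sink"` are hypothetical, and the sink name is assumed not to clash with an existing state.

```python
def complete(states, alphabet, partial_delta, sink="sink"):
    """Return (states, delta) of a complete DFA for the same language.

    Every undefined (state, symbol) pair is sent to a non-accepting sink
    state that loops on all symbols. The accepting set is unchanged.
    """
    delta = dict(partial_delta)
    needs_sink = False
    for state in states:
        for symbol in alphabet:
            if (state, symbol) not in delta:
                delta[(state, symbol)] = sink
                needs_sink = True
    all_states = set(states)
    if needs_sink:
        all_states.add(sink)
        for symbol in alphabet:
            delta[(sink, symbol)] = sink
    return all_states, delta


# Hypothetical partial DFA that accepts only the word "ab".
partial = {("q0", "a"): "q1", ("q1", "b"): "q2"}
all_states, total = complete({"q0", "q1", "q2"}, "ab", partial)
```

Because the sink is not accepting and cannot be left, the completed automaton recognizes exactly the same language as the partial one.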

A local automaton is a DFA, not necessarily complete, for which all edges with the same label lead to a single vertex.
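The locality condition is easy to test mechanically. The following check is a small sketch, assuming a (possibly partial) transition function stored as a dictionary; the function name `is_local` is hypothetical.

```python
def is_local(delta):
    """Check that all edges carrying the same label lead to the same state.

    delta maps (state, symbol) to a target state and may be partial.
    """
    target_of = {}
    for (_, symbol), target in delta.items():
        # Remember the first target seen for each label and compare later
        # edges with the same label against it.
        if target_of.setdefault(symbol, target) != target:
            return False
    return True
```

For instance, `is_local({("p", "a"): "q", ("q", "a"): "q", ("q", "b"): "p"})` returns `True`, whereas changing the second entry to `("q", "a"): "p"` makes the check fail, since the two a-edges would then lead to different states.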

It is known that when k ≥ 2 is a fixed integer, with high probability, the largest strongly connected component (SCC) in such a k-out digraph chosen uniformly at random is of linear size and it can be reached by all vertices.

[9][11] This is also true for the largest induced sub-digraph of minimum in-degree one, which can be seen as a directed version of 1-core.

A run of a given DFA can be seen as a sequence of compositions of a very general formulation of the transition function with itself.
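One way to make this view concrete is to treat each input symbol as a function from states to states and a word as the composition of those functions. The sketch below assumes a two-state DFA over {0, 1} that tracks the parity of 0s; the state names and helper names are hypothetical.

```python
from functools import reduce

# Assumed example DFA: S1 = even number of 0s so far, S2 = odd.
STATES = ("S1", "S2")
DELTA = {
    ("S1", "0"): "S2",
    ("S1", "1"): "S1",
    ("S2", "0"): "S1",
    ("S2", "1"): "S2",
}


def symbol_map(symbol):
    """Return the state-to-state map induced by a single symbol."""
    return {q: DELTA[(q, symbol)] for q in STATES}


def compose(f, g):
    """Return the map that applies f first and then g."""
    return {q: g[f[q]] for q in f}


def word_map(word):
    """Compose the per-symbol maps in reading order."""
    identity = {q: q for q in STATES}
    return reduce(compose, (symbol_map(a) for a in word), identity)
```

Here `word_map("00")` is the identity map, because two 0s restore the parity, and `word_map("010")["S1"]` is `"S1"`. A word is accepted exactly when its map sends the start state to an accepting state.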

DFAs are one of the most practical models of computation, since there is a trivial linear-time, constant-space, online algorithm to simulate a DFA on a stream of input.
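The simulation algorithm can be sketched as a generator that consumes symbols one at a time and reports, after each symbol, whether the prefix read so far is accepted. The function name `run_online` and the example DFA (binary strings ending in 1) are assumptions made for illustration.

```python
def run_online(delta, start, accepting, stream):
    """Yield the accept/reject verdict for every prefix of the stream.

    Only the current state is stored, so memory use does not depend on
    the length of the input, and each symbol costs one table lookup.
    """
    state = start
    yield state in accepting  # verdict for the empty prefix
    for symbol in stream:
        state = delta[(state, symbol)]
        yield state in accepting


# Hypothetical DFA: accepts binary strings that end in 1.
ends_in_one = {
    ("no", "0"): "no",
    ("no", "1"): "yes",
    ("yes", "0"): "no",
    ("yes", "1"): "yes",
}
verdicts = list(run_online(ends_in_one, "no", {"yes"}, iter("0110")))
# verdicts == [False, False, True, True, False]
```

The stream can be any iterator, for example one that reads from a socket or a file, since the generator never looks ahead and never revisits earlier symbols.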

For NFAs, the equality, inclusion, and minimization problems are PSPACE-complete, since they require forming the complement of an NFA, which results in an exponential blow-up in size.[17]

On the other hand, finite-state automata are of strictly limited power in the languages they can recognize; many simple languages, including any problem that requires more than constant space to solve, cannot be recognized by a DFA.

Intuitively, no DFA can recognize the Dyck language because DFAs are not capable of counting: a DFA-like automaton needs to have a state to represent any possible number of "currently open" parentheses, meaning it would need an unbounded number of states.
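The intuition can be illustrated by building a DFA for balanced parentheses whose nesting depth is bounded. This is only an illustration, not a proof; the function names and the integer encoding of states are assumptions. Each admissible depth needs its own state, so the automaton grows with the bound, and no finite bound covers the full Dyck language.

```python
def bounded_dyck_dfa(max_depth):
    """Build a DFA for balanced strings over '(' and ')' nested at most
    max_depth deep. States 0..max_depth count open parentheses; one extra
    dead state absorbs every invalid input.
    """
    dead = max_depth + 1
    delta = {}
    for depth in range(max_depth + 1):
        delta[(depth, "(")] = depth + 1 if depth < max_depth else dead
        delta[(depth, ")")] = depth - 1 if depth > 0 else dead
    delta[(dead, "(")] = dead
    delta[(dead, ")")] = dead
    return delta, 0, {0}


def accepts(dfa, word):
    """Run a DFA given as (delta, start, accepting) on a word."""
    delta, state, accepting = dfa
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accepting
```

With `max_depth=2` the word `(())` is accepted but `((()))` is rejected; raising the bound to 3 accepts it at the cost of one more state.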

[19] The first algorithm for minimal DFA identification was proposed by Trakhtenbrot and Barzdin[20] and is called the TB-algorithm.

In his work[19] E.M. Gold also proposed a heuristic algorithm for minimal DFA identification.

contain a characteristic set of the regular language; otherwise, the constructed DFA will be inconsistent either with

[24] Another research direction is the application of evolutionary algorithms: the smart state labeling evolutionary algorithm[25] made it possible to solve a modified DFA identification problem in which the training data (sets

Yet another step forward is due to the application of SAT solvers by Marijn J. H. Heule and S. Verwer: the minimal DFA identification problem is reduced to deciding the satisfiability of a Boolean formula.

[26] The main idea is to build an augmented prefix-tree acceptor (a trie containing all input words with corresponding labels) based on the input sets and reduce the problem of finding a DFA with

Though this approach allows finding the minimal DFA, it suffers from exponential blow-up of execution time when the size of input data increases.
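The augmented prefix-tree acceptor mentioned above can be built with a simple trie insertion. The sketch below covers only this first step, not the subsequent SAT encoding; the function name, the list-based node layout, and the sample words are assumptions made for illustration.

```python
def build_apta(positive, negative):
    """Build a trie over all sample words, labelling each word's end node.

    children[node] maps a symbol to a child node index; label[node] is
    True (word must be accepted), False (word must be rejected), or None
    (the node is only a proper prefix of sample words).
    """
    children = [{}]
    label = [None]

    def insert(word, verdict):
        node = 0
        for symbol in word:
            if symbol not in children[node]:
                children.append({})
                label.append(None)
                children[node][symbol] = len(children) - 1
            node = children[node][symbol]
        if label[node] is not None and label[node] != verdict:
            raise ValueError(f"word {word!r} is labelled both ways")
        label[node] = verdict

    for word in positive:
        insert(word, True)
    for word in negative:
        insert(word, False)
    return children, label


# Hypothetical training sample.
children, label = build_apta(["a", "abaa", "bb"], ["abb", "b"])
```

For this sample the trie has eight nodes. The node reached by `a` is labelled accepting, the node reached by `b` is labelled rejecting, and the root stays unlabelled because the empty word is not in the sample.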

[27] This allows reducing the search space of the problem, but leads to loss of the minimality guarantee.

Another way of reducing the search space was proposed by Ulyantsev et al.[28] by means of new symmetry-breaking predicates based on the breadth-first search algorithm: the states of the sought DFA are constrained to be numbered according to a BFS launched from the initial state.
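The effect of BFS numbering can be seen by renumbering an existing DFA. The sketch below is not the SAT encoding itself; it only computes the canonical numbering that such predicates would enforce, assuming a complete transition function stored as a dictionary and symbols visited in a fixed alphabet order. All names are hypothetical.

```python
from collections import deque


def bfs_renumber(alphabet, delta, start):
    """Number the reachable states in BFS order from the start state.

    Successors are visited in alphabet order, so two DFAs that differ
    only in their state names receive the same numbered transition table.
    """
    number = {start: 0}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for symbol in alphabet:
            target = delta[(state, symbol)]
            if target not in number:
                number[target] = len(number)
                queue.append(target)
    new_delta = {
        (number[s], a): number[t]
        for (s, a), t in delta.items()
        if s in number
    }
    return number, new_delta


# Two hypothetical DFAs that are identical up to renaming of states.
dfa_a = {
    ("x", "a"): "z", ("x", "b"): "y",
    ("y", "a"): "x", ("y", "b"): "y",
    ("z", "a"): "z", ("z", "b"): "x",
}
dfa_b = {
    ("2", "a"): "1", ("2", "b"): "0",
    ("0", "a"): "2", ("0", "b"): "0",
    ("1", "a"): "1", ("1", "b"): "2",
}
_, canon_a = bfs_renumber("ab", dfa_a, "x")
_, canon_b = bfs_renumber("ab", dfa_b, "2")
```

Both inputs yield the same numbered table, which is why fixing the BFS order removes isomorphic copies of the same automaton from the search space.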

[29] The definition based on a singly infinite tape is a 7-tuple.

The machine always accepts a regular language.

There must exist at least one element of the set F (a HALT state) for the language to be nonempty.