In the theory of formal languages, the Myhill–Nerode theorem provides a necessary and sufficient condition for a language to be regular.
The theorem is named for John Myhill and Anil Nerode, who proved it at the University of Chicago in 1957 (Nerode & Sauer 1957, p. ii).
, define a distinguishing extension to be a string
is an equivalence relation on strings, and thus it divides the set of all strings into equivalence classes.
has a finite number of equivalence classes, and moreover, that this number is equal to the number of states in the minimal deterministic finite automaton (DFA) accepting
Furthermore, every minimal DFA for the language is isomorphic to the canonical one (Hopcroft & Ullman 1979).
has a finite number of equivalence classes.
(2) This number is equal to the number of states in the minimal deterministic finite automaton (DFA) accepting
(3) Any minimal DFA acceptor for the language is isomorphic to the following one: Generally, for any language, the constructed automaton is a state automaton acceptor.
The Myhill–Nerode theorem shows that finiteness is necessary and sufficient for language regularity.
is regular, construct a minimal DFA to accept it.
end up in the same state after running through the DFA, then
is at most the number of DFA states, which must be finite.
has a finite number of equivalence classes, then the state automaton constructed in the theorem is a DFA acceptor, thus the language is regular.
, we construct an isomorphism to the canonical one.
end up on the same state when running through
is equal to the number of equivalence classes of
Now this gives us a bijection between states of
and the states of the canonical acceptor.
It is clear that this bijection also preserves the transition rules, thus it is an isomorphism of DFA.
The Myhill–Nerode theorem may be used to show that a language
is regular by proving that the number of equivalence classes of
This may be done by an exhaustive case analysis in which, beginning from the empty string, distinguishing extensions are used to find additional equivalence classes until no more can be found.
For example, the language consisting of binary representations of numbers that can be divided by 3 is regular.
are distinguishing extensions resulting in the three classes (corresponding to numbers that give remainders 0, 1 and 2 when divided by 3), but after this step there is no distinguishing extension anymore.
The minimal automaton accepting our language would have three states corresponding to these three equivalence classes.
Another immediate corollary of the theorem is that if for a language
has infinitely many equivalence classes, it is not regular.
It is this corollary that is frequently used to prove that a language is not regular.
The Myhill–Nerode theorem can be generalized to tree automata.