Brill tagger

This approach ensures that valuable information such as the morphosyntactic construction of words is employed in an automatic tagging process.

Then "patches" are determined via rules that correct (probable) tagging errors made in the initialization phase.[1]

The input text is first tokenized, or broken into words.

Typically in natural language processing, contractions such as "'s", "n't", and the like are considered separate word tokens, as are punctuation marks.
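The tokenization convention described above can be illustrated with a minimal sketch. The regular expression, the list of clitics it recognizes, and the function name `tokenize` are assumptions made for this example; they are not taken from any particular Brill tagger implementation, and production tokenizers handle many more cases.

```python
import re

# Illustrative pattern (an assumption, not a standard): alternatives are
# tried left to right, so the contraction cases come before generic words.
TOKEN_RE = re.compile(
    r"""
      \w+(?=n't)               # word stem directly before n't, e.g. "does"
    | n't                      # the negative contraction itself
    | '(?:s|re|ve|ll|d|m)\b    # a few common English clitics
    | \w+                      # ordinary words and numbers
    | [^\w\s]                  # any single punctuation mark
    """,
    re.VERBOSE | re.IGNORECASE,
)


def tokenize(text):
    """Split text into word, contraction, and punctuation tokens."""
    return TOKEN_RE.findall(text)
```

For example, `tokenize("John's dog doesn't bark.")` returns `['John', "'s", 'dog', 'does', "n't", 'bark', '.']`, so the possessive, the negation, and the final period each become a separate token.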

A rule should apply only if the replacement tag is known to be permissible, either for the word in question or in principle (for example, most adjectives in English can also be used as nouns).
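The permissibility check can be sketched as follows. Everything named here is hypothetical and chosen for illustration: the `Rule` class with a single "previous tag" trigger, the toy `LEXICON`, the `in_principle` set of generally allowed tag changes, and the Penn Treebank-style tag names. The sketch also assumes that a rule reads its context from the tags as they stood before the rule was applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Change from_tag to to_tag when the preceding token has prev_tag."""
    from_tag: str
    to_tag: str
    prev_tag: str


def is_permissible(word, rule, lexicon, in_principle):
    """True if the new tag is listed for the word or allowed in principle."""
    listed = rule.to_tag in lexicon.get(word.lower(), set())
    general = (rule.from_tag, rule.to_tag) in in_principle
    return listed or general


def apply_rule(rule, words, tags, lexicon, in_principle=frozenset()):
    """Return a new tag list with the rule applied wherever it is allowed."""
    new_tags = list(tags)
    for i in range(1, len(words)):
        triggered = tags[i] == rule.from_tag and tags[i - 1] == rule.prev_tag
        if triggered and is_permissible(words[i], rule, lexicon, in_principle):
            new_tags[i] = rule.to_tag
    return new_tags


# Toy lexicon (an assumption): the tags each word is known to accept.
LEXICON = {"run": {"NN", "VB"}, "school": {"NN"}, "poor": {"JJ"}}
NN_TO_VB = Rule(from_tag="NN", to_tag="VB", prev_tag="TO")
```

With this setup, `apply_rule(NN_TO_VB, ["to", "run"], ["TO", "NN"], LEXICON)` changes "run" to `VB`, while the same rule leaves "school" in "to school" tagged `NN`, because `VB` is not a listed tag for that word. Passing `in_principle={("JJ", "NN")}` models the adjective-to-noun case, where the change is allowed even though the lexicon lists only `JJ` for "poor".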

Typical Brill taggers use a few hundred rules, which may be developed by linguistic intuition or by machine learning on a pre-tagged corpus.
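As a rough illustration of how a rule might be derived from a pre-tagged corpus, the sketch below performs one greedy, error-driven selection step: it compares the tagger's current output with the corpus tags and picks the candidate rule that fixes the most errors while damaging the fewest correct tags. The function name `best_rule`, the single rule template ("change tag A to B when the previous tag is Z"), and the treatment of the corpus as one flat tag sequence are simplifying assumptions for this example, not a description of any specific implementation.

```python
from collections import Counter


def best_rule(current_tags, gold_tags):
    """Return ((from_tag, to_tag, prev_tag), net_gain) for the best rule.

    current_tags is the tagger's present output; gold_tags comes from the
    pre-tagged corpus. Returns (None, 0) if there are no errors to fix.
    """
    fixed = Counter()   # errors each candidate rule would correct
    broken = Counter()  # correct tags each candidate rule would damage

    # Propose one candidate rule per observed error.
    for i in range(1, len(gold_tags)):
        cur, gold, prev = current_tags[i], gold_tags[i], current_tags[i - 1]
        if cur != gold:
            fixed[(cur, gold, prev)] += 1

    # Count how often each candidate would fire on an already correct tag.
    for i in range(1, len(gold_tags)):
        cur, gold, prev = current_tags[i], gold_tags[i], current_tags[i - 1]
        if cur == gold:
            for rule in fixed:
                if rule[0] == cur and rule[2] == prev:
                    broken[rule] += 1

    if not fixed:
        return None, 0
    rule = max(fixed, key=lambda r: fixed[r] - broken[r])
    return rule, fixed[rule] - broken[rule]
```

In a full learner this step would presumably be repeated: the chosen rule is applied to the current tags, the counts are recomputed, and selection stops once no candidate yields a positive net gain.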