Data-oriented parsing

Data-oriented parsing (DOP) was conceived by Remko Scha in 1990 with the aim of developing a performance-oriented grammar framework.

Unlike other probabilistic models, DOP takes into account all subtrees contained in a treebank, of any size, rather than being restricted to, for example, the 2-level subtrees that correspond to the rules of a probabilistic context-free grammar (PCFG). The larger subtrees allow the model to capture more context-sensitive information.
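To make the contrast with PCFGs concrete, the following minimal sketch enumerates the subtrees of a single toy parse tree. It assumes the usual tree-substitution reading of "subtree": a connected piece of the tree in which each node keeps either all of its children or none of them. The tuple encoding, the function names `fragments_at` and `all_fragments`, and the example sentence are illustrative choices, not part of any DOP implementation.

```python
from itertools import product

# Assumed encoding: a nonterminal node is a tuple (label, child, ...),
# a word is a plain string, and a one-element tuple (label,) marks a
# cut-off nonterminal on the frontier of a subtree. Every nonterminal
# in the input tree is assumed to have at least one child.

def fragments_at(node):
    """Return all subtrees whose root is `node` (the root is always expanded)."""
    label, *children = node
    options = []
    for child in children:
        if isinstance(child, str):
            # A word has no internal structure, so it is kept as is.
            options.append([child])
        else:
            # Either cut the child off here, or continue with any of
            # the subtrees rooted at that child.
            options.append([(child[0],)] + fragments_at(child))
    # Combine one choice per child in every possible way.
    return [(label, *combo) for combo in product(*options)]

def all_fragments(tree):
    """Return the subtrees rooted at every nonterminal node of `tree`."""
    result = fragments_at(tree)
    for child in tree[1:]:
        if not isinstance(child, str):
            result.extend(all_fragments(child))
    return result

def is_two_level(fragment):
    """True if the subtree is a single rule: a root plus its direct children."""
    return all(isinstance(c, str) or len(c) == 1 for c in fragment[1:])

tree = ("S", ("NP", "she"), ("VP", ("V", "sleeps")))
frags = all_fragments(tree)
print(len(frags))                            # 10 subtrees in total
print(sum(is_two_level(f) for f in frags))   # 4 of them are 2-level
```

Even for this three-word tree, only 4 of the 10 subtrees are the 2-level kind a PCFG would use; the remaining 6 span several levels and tie words to a larger structural context. The number of subtrees grows very quickly with tree size, so this exhaustive enumeration is meant only as an illustration.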

The initial version, developed by Rens Bod in 1992, was based on tree-substitution grammar.[2] More recently, DOP has been combined with lexical-functional grammar (LFG).

The resulting DOP-LFG has been applied to machine translation.

Other work on learning and parameter estimation for DOP has also found its way into machine translation.