Terminology extraction

[1] In the semantic web era, a growing number of communities and networked enterprises started to access and interoperate through the internet.

Several methods to automatically extract technical terms from domain-specific document warehouses have been described in the literature.[5][6][7][8][9][10][11][12][13][14][15][16][17]

Typically, approaches to automatic term extraction make use of linguistic processors (part-of-speech tagging, phrase chunking) to extract terminological candidates, i.e. syntactically plausible terminological noun phrases.

[18] Terminological entries are then filtered from the candidate list using statistical and machine learning methods.
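The two stages described above (linguistic candidate extraction, then statistical filtering) can be illustrated with a minimal sketch. It assumes the text has already been tokenized and part-of-speech tagged by some external tool, that the tagset uses the labels "ADJ" and "NOUN", and that a plain frequency threshold stands in for the statistical and machine learning filters used in practice. The function names candidate_phrases and extract_terms and the parameters max_len and min_count are hypothetical, chosen only for this illustration.

```python
from collections import Counter


def candidate_phrases(tagged, max_len=4):
    """Yield noun-phrase candidates: runs of ADJ/NOUN tokens ending in a NOUN.

    `tagged` is a list of (word, tag) pairs; the tag labels are an assumption.
    """
    run = []
    for word, tag in tagged + [("", "END")]:  # sentinel flushes the last run
        if tag in ("ADJ", "NOUN"):
            run.append((word.lower(), tag))
            continue
        # Trim trailing adjectives so that the phrase ends in a noun.
        while run and run[-1][1] != "NOUN":
            run.pop()
        # Runs longer than max_len are discarded rather than split.
        if run and len(run) <= max_len:
            yield " ".join(w for w, _ in run)
        run = []


def extract_terms(tagged_sentences, min_count=2):
    """Count candidates over all sentences and keep the frequent ones."""
    counts = Counter(
        phrase
        for sentence in tagged_sentences
        for phrase in candidate_phrases(sentence)
    )
    return [(term, n) for term, n in counts.most_common() if n >= min_count]


sentences = [
    [("The", "DET"), ("neural", "ADJ"), ("network", "NOUN"), ("converges", "VERB")],
    [("A", "DET"), ("neural", "ADJ"), ("network", "NOUN"), ("needs", "VERB"), ("data", "NOUN")],
    [("Training", "NOUN"), ("data", "NOUN"), ("is", "VERB"), ("scarce", "ADJ")],
]
print(extract_terms(sentences))  # [('neural network', 2)]
```

A raw frequency cut-off is the simplest possible filter; it could be swapped for a measure that compares domain frequency against a general reference corpus without changing the candidate extraction step.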

When term extraction is combined with, e.g., co-occurrence statistics, candidates for term translations can also be obtained.
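One way co-occurrence statistics might produce translation candidates is sketched below. It assumes a sentence-aligned bilingual corpus in which terms have already been extracted on each side, and it scores term pairs with the Dice coefficient; both the data layout and the choice of Dice are assumptions made for illustration, not a method taken from the text. The function name translation_candidates and the min_score parameter are hypothetical.

```python
from collections import Counter
from itertools import product


def translation_candidates(aligned_pairs, min_score=0.6):
    """Rank (source term, target term) pairs by Dice co-occurrence score.

    `aligned_pairs` is a list of (source_terms, target_terms) tuples, one per
    aligned sentence pair (an assumed input format).
    """
    src_freq, tgt_freq, joint = Counter(), Counter(), Counter()
    for src_terms, tgt_terms in aligned_pairs:
        src_set, tgt_set = set(src_terms), set(tgt_terms)
        src_freq.update(src_set)
        tgt_freq.update(tgt_set)
        # Every source term co-occurs with every target term in the same pair.
        joint.update(product(src_set, tgt_set))

    scored = []
    for (src, tgt), n in joint.items():
        dice = 2 * n / (src_freq[src] + tgt_freq[tgt])
        if dice >= min_score:
            scored.append((src, tgt, dice))
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scored, key=lambda item: (-item[2], item[0], item[1]))


pairs = [
    (["neural network", "training data"], ["neuronales Netz", "Trainingsdaten"]),
    (["neural network"], ["neuronales Netz"]),
    (["training data"], ["Trainingsdaten"]),
]
for src, tgt, score in translation_candidates(pairs):
    print(f"{src} -> {tgt} ({score:.2f})")
```

In this toy input the correct pairings co-occur in every sentence pair where either term appears, so they score 1.0, while the crossed pairings score 0.5 and fall below the threshold.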