Word sketch

Word sketches were introduced by the British corpus linguist Adam Kilgarriff[1] and are implemented in the Sketch Engine[2] corpus management system.

They extend the general concept of collocation used in corpus linguistics by grouping collocations according to particular grammatical relations (e.g. subject, object, modifier).

The collocation candidates in a word sketch are sorted either by frequency or by a lexicographic association score such as Dice, T-score or MI-score.
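The three association scores named above can be illustrated with a minimal sketch. It assumes the common textbook definitions based on the co-occurrence frequency f_xy, the individual frequencies f_x and f_y, and the corpus size n; the exact variants used by a given tool may differ. The function names and the counts in the example are hypothetical and are not taken from any real corpus.

```python
import math


def dice(f_xy: int, f_x: int, f_y: int) -> float:
    """Dice coefficient: 2 * f_xy / (f_x + f_y)."""
    return 2 * f_xy / (f_x + f_y)


def t_score(f_xy: int, f_x: int, f_y: int, n: int) -> float:
    """T-score: (observed - expected) / sqrt(observed)."""
    expected = f_x * f_y / n
    return (f_xy - expected) / math.sqrt(f_xy)


def mi_score(f_xy: int, f_x: int, f_y: int, n: int) -> float:
    """MI-score: log2 of observed over expected co-occurrence frequency."""
    return math.log2(f_xy * n / (f_x * f_y))


def rank_by_dice(headword_freq: int, candidates: dict) -> list:
    """Sort candidates by descending Dice score.

    `candidates` maps a collocation to (co-occurrence freq, collocation freq).
    """
    return sorted(
        candidates,
        key=lambda c: dice(candidates[c][0], headword_freq, candidates[c][1]),
        reverse=True,
    )


# Hypothetical counts for one headword and two candidate collocations.
example_candidates = {"old": (10, 900), "young": (30, 200)}
ranking = rank_by_dice(100, example_candidates)
```

Sorting by raw frequency instead would simply use the co-occurrence count as the sort key; association scores additionally take into account how frequent each word is on its own.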

Since their introduction, word sketches have been used by lexicographers to develop modern corpus-based dictionaries from major publishing houses, including the Oxford English Dictionary[3] and the Macmillan English Dictionary,[1] and have been applied to dozens of languages, including English,[1] Chinese,[4] Slovene,[5] Japanese,[6] Dutch,[7] Romanian,[8] Russian,[9] Czech,[10] Polish,[11] Vietnamese,[12] Turkish,[13] Portuguese,[14] Hindi,[15] Spanish[16] and others.

Formally, given an underlying text corpus, a word sketch quintuple consists of a headword, a grammatical relation, a collocation, the position of the headword in the corpus and the position of the collocation in the corpus, e.g. (man, modifier, young, 104, 103).
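As an illustration, the quintuple can be modelled as a small record, and a word sketch for a headword can be obtained by grouping its quintuples by grammatical relation and counting collocations. This is only a sketch under stated assumptions: the names SketchTuple and build_sketch are hypothetical, and all example data other than (man, modifier, young, 104, 103) is invented for demonstration.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class SketchTuple(NamedTuple):
    """One word sketch quintuple (hypothetical field names)."""
    headword: str
    relation: str
    collocation: str
    headword_pos: int      # position of the headword in the corpus
    collocation_pos: int   # position of the collocation in the corpus


def build_sketch(tuples, headword):
    """Group the quintuples of one headword by grammatical relation.

    Returns a mapping: relation -> Counter of collocation frequencies.
    """
    sketch = defaultdict(Counter)
    for t in tuples:
        if t.headword == headword:
            sketch[t.relation][t.collocation] += 1
    return sketch


tuples = [
    SketchTuple("man", "modifier", "young", 104, 103),  # example from the text
    SketchTuple("man", "modifier", "young", 587, 586),  # invented
    SketchTuple("man", "modifier", "old", 912, 911),    # invented
    SketchTuple("man", "subject_of", "walk", 104, 105), # invented relation name
]

sketch = build_sketch(tuples, "man")
# Collocations of "man" in the modifier relation, most frequent first.
modifiers = sketch["modifier"].most_common()
```

Keeping the corpus positions in each quintuple makes it possible to go back from an entry in the sketch to the concrete occurrences in the text.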

Word sketch of the verb "read" in the British National Corpus, as displayed in Sketch Engine