Distributional semantics

The distributional hypothesis in linguistics is derived from the semantic theory of language usage, i.e. words that are used and occur in the same contexts tend to purport similar meanings.

Although the Distributional Hypothesis originated in linguistics,[4][5] it is now receiving attention in cognitive science especially regarding the context of word use.

[10] Different kinds of similarities can be extracted depending on which type of distributional information is used to collect the vectors: topical similarities can be extracted by populating the vectors with information on which text regions the linguistic items occur in; paradigmatic similarities can be extracted by populating the vectors with information on which other linguistic items the items co-occur with.

The basic idea of a correlation between distributional and semantic similarity can be operationalized in many different ways.

This work was originally proposed by Stephen Clark, Bob Coecke, and Mehrnoosh Sadrzadeh of Oxford University in their 2008 paper, "A Compositional Distributional Model of Meaning".

How words are related in a given language is demonstrated in the "semantic space", which mathematically corresponds to the vector space.