Generalized vector space model

The generalized vector space model is a generalization of the vector space model used in information retrieval.

Wong et al.[1] presented an analysis of the problems that the pairwise orthogonality assumption of the vector space model (VSM) creates.

From here they extended the VSM to the generalized vector space model (GVSM).

More specifically, they considered a new space in which each term vector t_i is expressed as a linear combination of 2^n vectors m_r, where r = 1, ..., 2^n.

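For a document d_k and a query q, the similarity function now becomes:

$$\mathrm{sim}(d_k, q) = \frac{\sum_{j=1}^{n} \sum_{i=1}^{n} w_{i,k} \, w_{j,q} \, t_i \cdot t_j}{\sqrt{\sum_{i=1}^{n} w_{i,k}^2} \, \sqrt{\sum_{i=1}^{n} w_{i,q}^2}}$$

where w_{i,k} and w_{j,q} are the weights of term i in d_k and of term j in q, and t_i and t_j are now vectors of a 2^n-dimensional space.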
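As a minimal sketch of this similarity function, the Python below assumes the term correlations t_i · t_j have already been collected into an n × n matrix. The names `gvsm_similarity`, `doc_weights`, `query_weights` and `term_corr` are illustrative and do not come from Wong et al.[1]; the toy weights are made up for the example.

```python
import math


def gvsm_similarity(doc_weights, query_weights, term_corr):
    """Similarity between a document and a query under term correlations.

    doc_weights[i]   -- weight of term i in the document
    query_weights[j] -- weight of term j in the query
    term_corr[i][j]  -- assumed precomputed correlation t_i . t_j
    """
    n = len(doc_weights)
    # Double sum over all term pairs, weighted by their correlation.
    numerator = sum(
        doc_weights[i] * query_weights[j] * term_corr[i][j]
        for i in range(n)
        for j in range(n)
    )
    doc_norm = math.sqrt(sum(w * w for w in doc_weights))
    query_norm = math.sqrt(sum(w * w for w in query_weights))
    if doc_norm == 0 or query_norm == 0:
        return 0.0  # an empty document or query matches nothing
    return numerator / (doc_norm * query_norm)


# Toy data: the document contains only term 0, the query only term 1.
doc = [1.0, 0.0]
query = [0.0, 1.0]

# Identity correlations encode pairwise orthogonal terms (the plain VSM case).
identity = [[1.0, 0.0], [0.0, 1.0]]
# A hypothetical correlation of 0.5 between the two terms.
correlated = [[1.0, 0.5], [0.5, 1.0]]

vsm_score = gvsm_similarity(doc, query, identity)      # 0.0
gvsm_score = gvsm_similarity(doc, query, correlated)   # 0.5
```

With the identity matrix the function reduces to the ordinary cosine similarity of the VSM, so a document and a query with no shared terms score zero. Once the two terms are treated as correlated, the same pair receives a non-zero score.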

For example, Wong et al. use the term occurrence frequency matrix obtained from automatic indexing as input to their algorithm.
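One way to derive term correlations from a term occurrence frequency matrix can be sketched as follows. The sketch follows a commonly cited textbook presentation of the minterm construction and is not necessarily identical to the algorithm of Wong et al.[1]. The function name `term_correlations`, the `freq` layout (rows are terms, columns are documents) and the toy data are assumptions made for illustration.

```python
import math
from collections import defaultdict


def term_correlations(freq):
    """Return an n x n matrix of correlations t_i . t_j.

    freq[i][j] is the frequency of term i in document j (assumed layout).
    """
    n_terms = len(freq)
    n_docs = len(freq[0]) if n_terms else 0

    # Each document activates one minterm: the on/off pattern of all terms.
    # coeff[i][pattern] sums the frequency of term i over the documents
    # that share this pattern, giving the coordinates of t_i.
    coeff = [defaultdict(float) for _ in range(n_terms)]
    for j in range(n_docs):
        pattern = tuple(freq[i][j] > 0 for i in range(n_terms))
        for i in range(n_terms):
            if freq[i][j] > 0:
                coeff[i][pattern] += freq[i][j]

    # Length of each term vector, used to normalise it to unit length.
    norms = [
        math.sqrt(sum(v * v for v in coeff[i].values()))
        for i in range(n_terms)
    ]

    corr = [[0.0] * n_terms for _ in range(n_terms)]
    for a in range(n_terms):
        for b in range(n_terms):
            if norms[a] == 0 or norms[b] == 0:
                continue  # a term that never occurs correlates with nothing
            dot = sum(v * coeff[b].get(p, 0.0) for p, v in coeff[a].items())
            corr[a][b] = dot / (norms[a] * norms[b])
    return corr


# Toy matrix: terms 0 and 1 co-occur in document 0; term 2 occurs alone.
freq = [
    [2, 0, 1],  # term 0
    [1, 0, 0],  # term 1
    [0, 3, 0],  # term 2
]
corr = term_correlations(freq)
```

Only the minterms that actually occur in the collection are stored, so the number of patterns is bounded by the number of documents rather than by 2^n. In the toy matrix, terms 0 and 1 come out correlated because they share a document pattern, while term 2 stays orthogonal to both.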

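There are at least two basic directions for embedding term-to-term relatedness, other than exact keyword matching, into a retrieval model:

1. compute semantic correlations between terms
2. compute frequency co-occurrence statistics from large corpora

Recently Tsatsaronis[2] focused on the first approach.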

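They measure semantic relatedness (SR) using a thesaurus (O) like WordNet. The measure considers the path length, captured by compactness (SCM), and the path depth, captured by semantic path elaboration (SPE). They estimate the t_i · t_j inner product by:

$$t_i \cdot t_j = SR((t_i, t_j), (s_i, s_j), O)$$

where s_i and s_j are senses of terms t_i and t_j respectively, maximizing SCM · SPE.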

Building also on the first approach, Waitelonis et al.[3] have computed semantic relatedness from Linked Open Data resources including DBpedia as well as the YAGO taxonomy.