Substring index

In computer science, a substring index is a data structure which gives substring search in a text or text collection in sublinear time.

But this is ambiguous, as it is also used for regular word indexes such as inverted files and document retrieval.

These data structures typically treat their text and pattern as strings over a fixed alphabet, and search for locations where the pattern occurs as a substring of the text.

The symbols of the alphabet may be characters (for instance in Unicode) but in practical applications for text retrieval it may be preferable to treat the (stemmed) words of a document as the symbols of its alphabet, because doing this reduces the lengths of both the text and pattern as measured in numbers of symbols.

[2] Specific data structures that can be used as substring indexes include: