Generalized suffix tree

time, which is asymptotically optimal (assuming the size of the alphabet is constant[2]: 119 ).

When constructing such a tree, each string should be padded with a unique out-of-alphabet marker symbol (or string) to ensure no suffix is a substring of another, guaranteeing each suffix is represented by a unique leaf node.

A suffix tree for the strings ABAB and BABA is shown in a figure above.

They are padded with the unique terminator strings $0 and $1.

Notice how a left to right traversal of the leaf nodes corresponds to the sorted order of the suffixes.

The terminators might be strings or unique single symbols.

An alternative to building a generalized suffix tree is to concatenate the strings, and build a regular suffix tree or suffix array for the resulting string.

When hits are evaluated after a search, global positions are mapped into documents and local positions with some algorithm and/or data structure, such as a binary search in the starting/ending positions of the documents.

Suffix tree for the strings ABAB and BABA . Suffix links not shown.