Optimal binary search tree

Optimal BSTs are generally divided into two types: static and dynamic.

In the static case, the tree cannot be modified after it has been built; there then exists some particular layout of its nodes which provides the smallest expected search time for the given access probabilities.

In the dynamic case, the tree can be modified at any time; it is considered to have a cursor starting at the root which the algorithm can move or use to perform modifications.

Various algorithms exist to construct or approximate the statically optimal tree given the information on the access probabilities of the elements.

In 1971, Knuth published a relatively straightforward dynamic programming algorithm capable of constructing the statically optimal tree in only O(n²) time.[2]

In this work, Knuth extended and improved the dynamic programming algorithm of Edgar Gilbert and Edward F. Moore, introduced in 1958.

The weighted path length of a tree is the sum, over all elements, of each element's access probability multiplied by its depth in the tree; the tree with the minimal weighted path length is, by definition, statically optimal.

Observe that when either subtree is attached to the root, the depth of each of its elements (and thus each of its search paths) is increased by one.

This means that the difference in weighted path length between a tree and its two subtrees is exactly the sum of every single probability in the tree, leading to the following recurrence:

E(T) = E(T_L) + E(T_R) + W(T),

where T_L and T_R are the left and right subtrees of T and W(T) is the total probability weight of T. This recurrence leads to a natural dynamic programming solution. Let E[i, j] be the weighted path length of the statically optimal search tree for all values between a_i and a_j, let W[i, j] be the total weight of that tree, and let R[i, j] be the index of its root. Then

E[i, i - 1] = W[i, i - 1] = 0,
W[i, j] = W[i, j - 1] + p_j,
E[i, j] = min over i ≤ r ≤ j of ( E[i, r - 1] + E[r + 1, j] + W[i, j] ),

where p_j is the access probability of a_j, and each entry can be computed once all smaller intervals are known.
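A direct bottom-up implementation of this recurrence might look like the following sketch (a minimal Python version with names of our choosing; the gap probabilities for unsuccessful searches that the full formulation also handles are omitted). It additionally uses Knuth's observation that the optimal root indices satisfy R[i, j - 1] ≤ R[i, j] ≤ R[i + 1, j], which is what brings the running time down to O(n²):

```python
def optimal_bst(p):
    """Statically optimal BST over access probabilities p[0..n-1].

    Uses half-open ranges: E[i][j] is the minimal weighted path
    length of a tree over keys i..j-1, R[i][j] the index of its
    optimal root.  A sketch of Knuth's O(n^2) dynamic program.
    """
    n = len(p)
    W = [[0.0] * (n + 1) for _ in range(n + 1)]  # range weights
    E = [[0.0] * (n + 1) for _ in range(n + 1)]  # optimal costs
    R = [[0] * (n + 1) for _ in range(n + 1)]    # optimal roots
    for i in range(n):
        W[i][i + 1] = E[i][i + 1] = p[i]
        R[i][i + 1] = i
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            W[i][j] = W[i][j - 1] + p[j - 1]
            best, best_r = float("inf"), -1
            # Knuth's monotonicity restricts the candidate roots.
            for r in range(R[i][j - 1], R[i + 1][j] + 1):
                cost = E[i][r] + E[r + 1][j] + W[i][j]
                if cost < best:
                    best, best_r = cost, r
            E[i][j], R[i][j] = best, best_r
    return E, R
```

E[0][n] then gives the optimal expected search cost, and the tree itself can be recovered by following the root indices in R starting from the full range.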

In addition to its dynamic programming algorithm, Knuth proposed two heuristics (or rules) for producing nearly optimal binary search trees: Rule I (root-max) places the most frequently accessed element at the root, while Rule II (bisection) chooses the root so that the total access weights of the left and right subtrees are as nearly equal as possible.

Studying nearly optimal binary search trees was necessary since the time and space complexity of Knuth's algorithm can be prohibitive when n is substantially large.

How far from the optimum Knuth's heuristics can be was analyzed by Kurt Mehlhorn, who in 1975 published a paper proving important properties regarding the two rules.

Mehlhorn's major results state that only one of Knuth's heuristics (Rule II) always produces nearly optimal binary search trees.

On the other hand, the root-max rule can often lead to very "bad" search trees based on the following simple argument: if the access probabilities strictly decrease as the keys increase, always placing the heaviest remaining element at the root produces a path of depth n, whereas a balanced tree achieves depth O(log n).[6]

In addition, Mehlhorn improved on Knuth's work and introduced a much simpler algorithm that uses Rule II and closely approximates the performance of the statically optimal tree in only O(n) time.[6]

The algorithm follows the same idea as the bisection rule, choosing the tree's root to balance the total weight (by probability) of the left and right subtrees as closely as possible.

That this strategy produces a good approximation can be seen intuitively by noting that the weights of the subtrees along any path form something very close to a geometrically decreasing sequence.
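As an illustration, the following minimal sketch (with names of our choosing) implements this weight-balancing strategy: prefix sums make the subtree weights monotone in the root index, so each root can be found by binary search. This simple version runs in O(n log n); Mehlhorn's refinement of the same idea achieves O(n):

```python
def balanced_bst(keys, p):
    """Nearly optimal BST via the bisection rule (Rule II): pick the
    root that most nearly balances the total access weight of the
    left and right subtrees, then recurse.  Returns nested
    (key, left, right) tuples."""
    prefix = [0.0]                        # prefix[i] = weight of keys[:i]
    for w in p:
        prefix.append(prefix[-1] + w)

    def build(lo, hi):                    # tree over keys[lo:hi]
        if lo >= hi:
            return None
        # For root m, weight(left) - weight(right) equals
        # (prefix[m] + prefix[m+1]) - (prefix[lo] + prefix[hi]),
        # and prefix[m] + prefix[m+1] is nondecreasing in m, so a
        # binary search finds the most balanced root.
        target = prefix[lo] + prefix[hi]
        a, b = lo, hi - 1
        while a < b:
            mid = (a + b) // 2
            if prefix[mid] + prefix[mid + 1] < target:
                a = mid + 1
            else:
                b = mid
        m = a
        if m > lo and target - (prefix[m - 1] + prefix[m]) < (prefix[m] + prefix[m + 1]) - target:
            m -= 1
        return (keys[m], build(lo, m), build(m + 1, hi))

    return build(0, len(keys))
```

For example, balanced_bst(["a", "b", "c", "d"], [0.1, 0.2, 0.4, 0.3]) puts "c", the element at which the cumulative weight most nearly reaches one half, at the root.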

In fact, this strategy generates a tree whose weighted path length is at most 2 + H/(1 - log(√5 - 1)) ≈ 2 + 1.44H, where H is the entropy of the probability distribution (logarithms taken base 2).

Since no optimal binary search tree can ever do better than a weighted path length of H/log 3 ≈ 0.63H, this approximation is very close.
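To make these bounds concrete, here is a short computation over a made-up distribution (illustrative only):

```python
import math

p = [0.4, 0.3, 0.2, 0.1]                 # hypothetical access probabilities
H = -sum(w * math.log2(w) for w in p)    # entropy: about 1.85 bits

lower = H / math.log2(3)                 # ~0.63*H: best any BST can do
upper = 2 + H / (1 - math.log2(math.sqrt(5) - 1))  # Rule II guarantee

print(f"H = {H:.2f}, lower bound = {lower:.2f}, upper bound = {upper:.2f}")
```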

In the special case where only the gaps between keys have nonzero access probability, the Hu–Tucker algorithm solves the problem in O(n log n) time; a later simplification by Garsia and Wachs, the Garsia–Wachs algorithm, performs the same comparisons in the same order.

There are several different definitions of dynamic optimality, all of which are effectively equivalent to within a constant factor in terms of running time.[8]

The problem was first introduced implicitly by Sleator and Tarjan in their paper on splay trees,[9] but Demaine et al. give a very good formal statement of it.

For each access, our BST algorithm may perform any sequence of unit-cost operations (moving the cursor to its left child, to its right child, or to its parent, or rotating the cursor's node with its parent) as long as the cursor eventually ends up on the node containing the target value x_i.

While it is impossible to implement this "God's algorithm" without foreknowledge of exactly what the access sequence will be, we can define OPT(X) as the number of operations it would perform for an access sequence X, and we can say that an algorithm is dynamically optimal if, for any X, it performs X in time O(OPT(X)) (that is, it has a constant competitive ratio).
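The cost model itself is easy to make concrete. In the sketch below (class and function names are ours, purely for illustration), every cursor move and every rotation adds one unit of cost; OPT(X) is then the least total cost that any strategy, chosen with full knowledge of X, could incur:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right, self.parent = key, left, right, None
        for child in (left, right):
            if child is not None:
                child.parent = self

class Cursor:
    """Unit-cost BST access model: the cursor starts at the root,
    and every move or rotation costs one operation."""
    def __init__(self, root):
        self.root, self.node, self.cost = root, root, 0

    def move(self, direction):            # "left", "right", or "up"
        self.cost += 1
        self.node = {"left": self.node.left,
                     "right": self.node.right,
                     "up": self.node.parent}[direction]

    def rotate(self):
        # Rotate the cursor's node x above its parent p (one unit).
        self.cost += 1
        x, p = self.node, self.node.parent
        g = p.parent
        if p.left is x:                   # x rises via a right rotation
            p.left = x.right
            if x.right is not None:
                x.right.parent = p
            x.right = p
        else:                             # x rises via a left rotation
            p.right = x.left
            if x.left is not None:
                x.left.parent = p
            x.left = p
        p.parent, x.parent = x, g
        if g is None:
            self.root = x
        elif g.left is p:
            g.left = x
        else:
            g.right = x

def naive_access(cursor, key):
    """One legal (and generally far from optimal) algorithm in this
    model: return to the root, then walk down to the target key."""
    while cursor.node.parent is not None:
        cursor.move("up")
    while cursor.node.key != key:
        cursor.move("left" if key < cursor.node.key else "right")
```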

It is an open problem whether there exists a dynamically optimal data structure in this model.

Splay trees are conjectured to be dynamically optimal, although this remains unproven; that is, a splay tree is believed to perform any sufficiently long access sequence X in time O(OPT(X)).[9]

The tango tree is a data structure proposed in 2004 by Erik D. Demaine, Dion Harmon, John Iacono, and Mihai Pătrașcu which has been proven to perform any sufficiently long access sequence X in time O(log log n · OPT(X)). Although this is not dynamically optimal, its competitive ratio of O(log log n) is still very small for reasonable values of n.[8]

In 2013, John Iacono published a paper which uses the geometry of binary search trees to provide an algorithm which is dynamically optimal if any binary search tree algorithm is dynamically optimal.