Join-based tree algorithms

In computer science, join-based tree algorithms are a class of algorithms for self-balancing binary search trees.

This framework aims at designing highly-parallelized algorithms for various balanced binary search trees.

The algorithmic framework is based on a single operation join.

[1] Under this framework, the join operation captures all balancing criteria of different balancing schemes, and all other functions join have generic implementation across different balancing schemes.

operation takes as input two binary balanced trees

The join operation was first defined by Tarjan[2] on red–black trees, which runs in worst-case logarithmic time.

Later Sleator and Tarjan [3] described a join algorithm for splay trees which runs in amortized logarithmic time.

Later Adams [4] extended join to weight-balanced trees and used it for fast set–set functions including union, intersection and set difference.

In 1998, Blelloch and Reid-Miller extended join on treaps, and proved the bound of the set functions to be

They also brought up parallelism in Adams' algorithm by using a divide-and-conquer scheme.

In 2016, Blelloch et al. formally proposed the join-based algorithms, and formalized the join algorithm for four different balancing schemes: AVL trees, red–black trees, weight-balanced trees and treaps.

In the same work they proved that Adams' algorithms on union, intersection and difference are work-optimal on all the four balancing schemes.

considers rebalancing the tree, and thus depends on the input balancing scheme.

If the two trees are balanced, join simply creates a new node with left subtree t1, root k and right subtree t2.

Join follows the right spine of t1 until a node c which is balanced with t2.

At this point a new node with left child c, root k and right child t2 is created to replace c. The new node may invalidate the balancing invariant.

The following is the join algorithms on different balancing schemes.

creates a node with left child

By applying Join, all the subtrees on the left side are merged bottom-up using keys on the path as intermediate nodes from bottom to top to form the left tree, and the right part is asymmetric.

For some applications, Split also returns a boolean value denoting if x appears in the tree.

The split algorithm is as follows: This function is defined similarly as join but without the middle key.

The insertion and deletion algorithms, when making use of join can be independent of balancing schemes.

The following recursive function computes this union: Similarly, the algorithms of intersection and set-difference are as follows: The complexity of each of union, intersection and difference is

This complexity is optimal in terms of the number of comparisons.

More importantly, since the recursive calls to union, intersection or difference are independent of each other, they can be executed in parallel with a parallel depth

, the join-based implementation applies the same computation as in a single-element insertion or deletion if the root of the larger tree is used to split the smaller tree.

depth assuming the sorting algorithm has

This function selects all entries in a tree satisfying a predicate

, and return a tree containing all selected entries.

The join-based algorithms are applied to support interface for sets, maps, and augmented maps [5] in libraries such as Hackage, SML/NJ, and PAM.