Ball tree

The resulting data structure has characteristics that make it useful for a number of applications, most notably nearest neighbor search.

Each node in the tree defines the smallest ball that contains all data points in its subtree.

fold, thus leading to a shallower tree structure, therefore need fewer distance computations, which usually yields faster queries.

[1] The goal of such an algorithm is to produce a tree that will efficiently support queries of the desired type (e.g. nearest-neighbor) in the average case.

The specific criteria of an ideal tree will depend on the type of question being answered and the distribution of the underlying data.

However, a generally applicable measure of an efficient tree is one that minimizes the total volume of its internal nodes.

The ball tree nearest-neighbor algorithm examines nodes in depth-first order, starting at the root.

During the search, the algorithm maintains a max-first priority queue (often implemented with a heap), denoted Q here, of the k nearest points encountered so far.

In comparison with several other data structures, ball trees have been shown to perform fairly well on the nearest-neighbor search problem, particularly as their number of dimensions grows.