Fractional cascading

Later authors introduced more complex forms of fractional cascading that allow the data structure to be maintained as the data changes by a sequence of discrete insertion and deletion events.

, The simplest solution to this searching problem is just to store each list separately.

A second solution allows faster queries at the expense of more space: we may merge all the

The fractional cascading solution is to store a new sequence of lists

in this merged list, we store two numbers: the position resulting from searching for

For the data above, this would give us the following lists: Suppose we wish to perform a query in this structure, for

More generally, for any data structure of this type, we perform a query by doing a binary search for

In our example, the fractionally cascaded lists have a total of 25 elements, less than twice that of the original input.

as may be seen by regrouping the contributions to the total size coming from the same input list

A query in this data structure consists of a path in the graph and a query value q; the data structure must determine the position of q in each of the ordered lists associated with the vertices of the path.

To handle queries of this type, for a graph in which each vertex has at most d incoming and at most d outgoing edges for some constant d, the lists associated with each vertex are augmented by a fraction of the items from each outgoing neighbor of the vertex; the fraction must be chosen to be smaller than 1/d, so that the total amount by which all lists are augmented remains linear in the input size.

In the simple example above, d = 1, and we augmented each list with a 1/2 fraction of the neighboring items.

A query in this data structure consists of a standard binary search in the augmented list associated with the first vertex of the query path, together with simpler searches at each successive vertex of the path.

If a 1/r fraction of items are used to augment the lists from each neighboring item, then each successive query result may be found within at most r steps of the position stored at the query result from the previous path vertex, and therefore may be found in constant time without having to perform a full binary search.

In dynamic fractional cascading, the list stored at each node of the catalog graph may change dynamically, by a sequence of updates in which list items are inserted and deleted.

Instead of storing the augmented lists in arrays, they should be stored as binary search trees, so that these changes can be handled efficiently while still allowing binary searches of the augmented lists.

Rather, a technique closely related to B-trees allows the selection of a fraction of data that is guaranteed to be smaller than 1/d, with the selected items guaranteed to be spaced a constant number of positions apart in the full list, and such that an insertion or deletion into the augmented list associated with a node causes changes to propagate to other nodes for a fraction of the operations that is less than 1/d.

In this way, the distribution of the data among the nodes satisfies the properties needed for the query algorithm to be fast, while guaranteeing that the average number of binary search tree operations per data insertion or deletion is constant.

Third, and most critically, the static fractional cascading data structure maintains, for each element x of the augmented list at each node of the catalog graph, the index of the result that would be obtained when searching for x among the input items from that node and among the augmented lists stored at neighboring nodes.

Typical applications of fractional cascading involve range search data structures in computational geometry.

The problem is to structure the points in such a way that a query of this type may be answered efficiently in terms of the intersection size h. One structure that can be used for this purpose is the convex layers of the input point set, a family of nested convex polygons consisting of the convex hull of the point set and the recursively-constructed convex layers of the remaining points.

Within a single layer, the points inside the query half-plane may be found by performing a binary search for the half-plane boundary line's slope among the sorted sequence of convex polygon edge slopes, leading to the polygon vertex that is inside the query half-plane and farthest from its boundary, and then sequentially searching along the polygon edges to find all other vertices inside the query half-plane.

The whole half-plane range reporting problem may be solved by repeating this search procedure starting from the outermost layer and continuing inwards until reaching a layer that is disjoint from the query halfspace.

Fractional cascading speeds up the successive binary searches among the sequences of polygon edge slopes in each layer, leading to a data structure for this problem with space O(n) and query time O(log n + h).

The data structure may be constructed in time O(n log n) by an algorithm of Chazelle (1985).

Another application of fractional cascading in geometric data structures concerns point location in a monotone subdivision, that is, a partition of the plane into polygons such that any vertical line intersects any polygon in at most two points.

As Edelsbrunner, Guibas & Stolfi (1986) showed, this problem can be solved by finding a sequence of polygonal paths that stretch from left to right across the subdivision, and binary searching for the lowest of these paths that is above the query point.

Thus, each point location query can be solved as an outer layer of binary search among the paths, each step of which itself performs a binary search among x coordinates of vertices.

Beyond computational geometry, Lakshman & Stiliadis (1998) and Buddhikot, Suri & Waldvogel (1999) apply fractional cascading in the design of data structures for fast packet filtering in internet routers.

Gao et al. (2004) use fractional cascading as a model for data distribution and retrieval in sensor networks.

The convex layers of a point set, part of an efficient fractionally cascaded data structure for half-plane range reporting.