List-labeling problem

In computer science, the list-labeling problem involves maintaining a totally ordered set S supporting the following operations: The cost of a list labeling algorithm is the number of label (re-)assignments per insertion or deletion.

List labeling algorithms have applications in many areas, including the order-maintenance problem, cache-oblivious data structures,[1] data structure persistence,[2] graph algorithms[3][4] and fault-tolerant data structures.

[5] Sometimes the list labeling problem is presented where S is not a set of values but rather a set of objects subject to a total order.

The solutions presented below apply to both formulations.

items are stored in the list-labeling structure at any time.

When this happens, all items are relabelled evenly from the space of all labels.

[6] The other cases of list labeling can be solved via balanced binary search trees.

, a binary search tree on S of height

is in the left subtree of the root, the high-order bit of

Suppose instead that, without loss of generality, the least common ancestor of

In order to use a self-balancing binary search tree to solve the list labeling problem, we need to first define the cost function of a balancing operation on insertion or deletion to equal the number of labels that are changed, since every rebalancing operation of the tree would have to also update all path labels in the subtree rooted at the site of the rebalance.

So, for example, rotating a node with a subtree of size

, which can be done in constant time under usual circumstances, requires

With that much time the entire tree could be rebuilt.

We will see below that there are self-balancing binary search tree data structures that cause an appropriate number of label updates during rebalancing.

A scapegoat tree is a weight-balanced tree where whenever a node no longer satisfies the weight-balance condition the entire subtree rooted at that node is rebuilt.

This rebalancing scheme is ideal for list labeling, since the cost of rebalancing now equals the cost of relabeling.

For the list labeling problem, the cost becomes: In the case where

list labeling cost for deterministic algorithms.

[6] Furthermore, the same lower bound holds for smooth algorithms, which are those whose only relabeling operation assigns labels evenly in a range of items[10] This lower bound is surprisingly strong in that it applies in the offline cases where all insertions and deletions are known ahead of time.

However, the best lower bound known for the linear case of algorithms that are allowed to be non-smooth and randomized is

[7][11] Substantial progress on closing the gap has been made by Bender et al. who gave a randomized upper bound of

[12] and subsequently gave a randomized amortized upper bound of

[13] The best known applications of list labeling are the order-maintenance problem and packed-memory arrays for cache-oblivious data structures.

The order-maintenance problem is that of maintaining a data structure on a linked list to answer order queries: given two items in the linked list, which is closer to the front of the list?

This problem can be solved directly by polynomial list labeling in

time per query, by assigning labels that are monotone with the rank in the list.

The time for insertions and deletions can be improved to constant time by combining exponential polynomial list labeling with exponential list labeling on small lists.

case of list labeling, by using the labels as addresses in the array, as long as the solution guarantees that the space between items is

The density bounds guarantee that a scan through the data is asymptotically optimal in the external-memory model for any block transfer size.