Cell-probe model

In computer science, the cell-probe model is a model of computation similar to the random-access machine, except that all operations are free except memory access.

This model is useful for proving lower bounds of algorithms for data structure problems.

The model is intended for proving lower bounds on the complexity of data structure problems.

For example, an update may add an element to the structure, or remove one.

In both cases, the cell-probe complexity of the data structure is characterized by the number of memory cells accessed during preprocessing, query and (if relevant) update.

The cell probe complexity is a lower bound on the time complexity of the corresponding operations on a random-access machine, where memory transfers are part of the operations counted in measuring time.

Yao used it to give a minimum number of memory cell "probes" or accesses necessary to determine whether a given query datum exists within a table stored in memory.

In 1989, Fredman and Saks initiated the study of cell probe lower bounds for dynamic data-structure problems (i.e., involving updates and queries), and introduced the notation CPROBE(b) for the cell-probe model assuming that a memory cell (word) consists of b bits.

[4] Yao[3] considered a static data-structure problem where one has to build a data structure ("table") to represent a set

Yao showed that as long as the table size is bounded independently of

This shows that a sorted table together with binary search for queries is an optimal scheme, in this restricted setting.

It is worth noting that in the same paper, Yao[3] also showed, that if the problem is relaxed to allow the data structure to store

time to update each node in the leaf to root path, and Sum similarly requires

time to traverse the tree from leaf to root summing the values of all subtrees left of the query index.

Improving on a result of Fredman and Saks, Mihai Pătraşcu used the cell-probe model and an information transfer argument to show that the partial sums problem requires

[1][2] He further exhibited the trade-off curve between update time and query time and investigated the case that updates are restricted to small numbers (of

In the disjoint-set data structure, the structure represents a collection of disjoint sets; there is an update operation, called Union, which unites two sets, and a query operation, called Find, which identifies the set to which a given element belongs.

Fredman and Saks proved that in the model CPROBE(log n), any solution for this problem requires

probes in the worst case (even in expectation) to execute

An approximate version of this problem is often considered since many applications of this problem are in very high dimension spaces and solving the problem in high dimensions requires exponential time or space with respect to the dimension.

[6] Chakrabarti and Regev proved that the approximate nearest neighbor search problem on the Hamming cube using polynomial storage and

word size requires a worst-case query time of

This proof used the cell-probe model and information theoretic techniques from communication complexity.

The idealized random access machine used as a computational model in Computer Science does not impose a limit on the contents of each cell (in contrast to the word RAM).

Certain techniques for cell-probe lower bounds can, however, be carried over to the idealized RAM with an algebraic instruction set and similar lower bounds result.