[1][2][3] While more memory-intensive than its hash table counterparts,[citation needed] this technique is useful for situations where fast queries, insertions, and deletions must be made on a large set of elements.
The problem of optimal static hashing was first solved in general by Fredman, Komlós and Szemerédi.
In their scheme, each bucket of the first-level hash table corresponds to a separate second-level hash table, and each second-level table is guaranteed to be collision-free (i.e. perfect hashing) upon construction.[2]
In the static case, we are given a set with a total of x entries, each one with a unique key, ahead of time.
Fredman, Komlós and Szemerédi pick a first-level hash table whose number of buckets, s, is proportional to the number of entries x.[2]
To construct the table, the x entries are separated into the s buckets by the top-level hash function.
For each bucket that receives k entries, a second-level table with k² slots is allocated and a hash function for it is selected at random from a universal family. If the randomly selected hash function creates a table with collisions, a new hash function is selected until a collision-free table is obtained. The quadratic (k²) size of each second-level table ensures that a randomly chosen hash function produces collisions only infrequently, independently of the size of k, providing linear amortized construction time.[1]
The first-level hash function is specifically chosen so that, for the specific set of x unique key values, the total space T used by all the second-level hash tables has expected size O(x).[2]
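To make the two-level construction concrete, the following Python sketch builds such a table for a fixed set of distinct integer keys. It is an illustration only, not the FKS construction verbatim: the multiply-add universal family h(key) = ((a·key + b) mod P) mod m, the choice of roughly 2x first-level buckets, and integer keys below P are assumptions made for brevity.

import random

class StaticTwoLevelTable:
    # Illustrative FKS-style static table for a fixed set of distinct integer keys.
    P = (1 << 61) - 1  # a prime assumed to exceed every key

    def __init__(self, keys):
        x = len(keys)
        self.s = max(1, 2 * x)                    # first-level size, linear in x
        self.h0 = self._random_hash(self.s)       # top-level hash function
        buckets = [[] for _ in range(self.s)]
        for key in keys:
            buckets[self.h0(key)].append(key)

        self.tables = []                          # one (hash, slots) pair per bucket
        for bucket in buckets:
            k = len(bucket)
            if k == 0:
                self.tables.append(None)
                continue
            m = k * k                             # quadratic second-level size
            while True:                           # retry until collision-free (perfect)
                h = self._random_hash(m)
                slots = [None] * m
                for key in bucket:
                    i = h(key)
                    if slots[i] is not None:
                        break                     # collision: try another hash function
                    slots[i] = key
                else:
                    self.tables.append((h, slots))
                    break

    def _random_hash(self, m):
        a = random.randrange(1, self.P)
        b = random.randrange(0, self.P)
        return lambda key: ((a * key + b) % self.P) % m

    def contains(self, key):
        entry = self.tables[self.h0(key)]
        if entry is None:
            return False
        h, slots = entry
        return slots[h(key)] == key

# Example: build the table once, then every lookup probes exactly two cells.
keys = random.sample(range(1_000_000), 500)
table = StaticTwoLevelTable(keys)
assert all(table.contains(k) for k in keys)
assert not table.contains(1_000_001)

A lookup evaluates one first-level and one second-level hash function, so the worst-case query cost is two probes.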
Dietzfelbinger et al. present a dynamic dictionary algorithm that, when a set of n items is incrementally added to the dictionary, answers membership queries in constant (O(1)) worst-case time, uses O(n) total storage, and performs insertions and deletions in O(1) expected amortized time.
In the dynamic case, when a key is inserted into the hash table, if its slot in its respective subtable is occupied, then a collision is said to occur and the subtable is rebuilt based on its new total entry count and a new randomly selected hash function.[2]
Additionally, the ultimate sizes of the top-level table and of the subtables are unknowable in the dynamic case. One method of maintaining the expected O(n) space of the table is to prompt a full reconstruction when a sufficient number of insertions and deletions have occurred.
By results due to Dietzfelbinger et al.,[2] as long as the total number of insertions or deletions exceeds the number of elements at the time of the last construction, the amortized expected cost of insertion and deletion remains O(1), even with full rehashing taken into consideration.
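As an illustration of this per-bucket collision handling, the sketch below shows an insertion that rebuilds only the affected bucket's second-level table. It is a hedged sketch rather than the authors' algorithm: the universal hash family, the dictionary-based bucket representation, and the rule that keeps each second-level table at least quadratically larger than its bucket are illustrative assumptions.

import random

P = (1 << 61) - 1  # large prime for the universal family h(x) = ((a*x + b) % P) % m

def random_hash(m):
    a, b = random.randrange(1, P), random.randrange(0, P)
    return lambda x: ((a * x + b) % P) % m

def rebuild_bucket(bucket):
    # Re-pick this bucket's hash function until it is injective on the bucket's keys,
    # using quadratic space as in the static scheme.
    k = len(bucket["keys"])
    m = max(1, k * k)
    while True:
        h = random_hash(m)
        slots = [None] * m
        if place_all(bucket["keys"], h, slots):
            bucket["hash"], bucket["slots"] = h, slots
            return

def place_all(keys, h, slots):
    # Try to place every key; report False on the first collision.
    for key in keys:
        i = h(key)
        if slots[i] is not None:
            return False
        slots[i] = key
    return True

def insert(buckets, top_hash, x):
    j = top_hash(x)                      # first-level hash picks the bucket
    bucket = buckets[j]
    i = bucket["hash"](x)
    occupant = bucket["slots"][i]
    if occupant == x:
        return                           # x is already a member
    bucket["keys"].append(x)
    if occupant is None and len(bucket["keys"]) ** 2 <= len(bucket["slots"]):
        bucket["slots"][i] = x           # free slot and still enough quadratic room
    else:
        rebuild_bucket(bucket)           # collision (or outgrown table): rebuild T_j only

# Example: a tiny table with 8 top-level buckets.
buckets = [{"hash": random_hash(1), "slots": [None], "keys": []} for _ in range(8)]
top_hash = random_hash(len(buckets))
for x in (3, 11, 19, 27, 35):
    insert(buckets, top_hash, x)

Because each second-level table keeps at least quadratically more slots than keys, a freshly chosen hash function is collision-free with constant probability, so a bucket rebuild takes expected time proportional to the bucket's size.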
The implementation of dynamic perfect hashing by Dietzfelbinger et al. uses these concepts, as well as lazy deletion, and is shown in pseudocode below.
During the insertion of a new entry x into bucket j = h(x), the global operations counter, count, is incremented.
If the position hj(x) of subtable Tj is already occupied by a different element that is not marked as deleted, then a collision is said to occur and the jth bucket's second-level table Tj is rebuilt with a different randomly selected hash function hj.
In the case of both insertions and deletions, if count reaches a threshold M, the entire table is rebuilt, where M is some constant multiple of the size of S at the start of a new phase.
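The bookkeeping around count, the threshold M, and lazy deletion can be sketched as follows. This is an illustrative skeleton, not the authors' pseudocode: the constant factor in M, the Python set of tombstones, and the elided bucket-placement step are all assumptions, and full_rebuild here only resets the bookkeeping.

class DynamicTableSkeleton:
    # Shows only the operations counter, the rebuild threshold M, and lazy deletion.
    C = 2  # illustrative constant: M = C * |S| at the start of each phase

    def __init__(self, elements=()):
        self.full_rebuild(set(elements))

    def full_rebuild(self, live_elements):
        # Elements marked as deleted are discarded and M is reset from the new |S|.
        self.S = set(live_elements)
        self.deleted = set()            # tombstones for lazy deletion
        self.count = 0                  # global operations counter
        self.M = self.C * max(4, len(self.S))
        # ... (re)build the two-level hash structure for self.S here ...

    def contains(self, x):
        # A lookup must treat lazily deleted entries as absent.
        return x in self.S and x not in self.deleted

    def insert(self, x):
        self.count += 1
        self.deleted.discard(x)         # re-inserting a deleted key just clears its mark
        self.S.add(x)
        # ... place x in bucket j = h(x), rebuilding T_j on a collision ...
        if self.count >= self.M:        # threshold reached: rebuild everything
            self.full_rebuild(self.S - self.deleted)

    def delete(self, x):
        self.count += 1
        if self.contains(x):
            self.deleted.add(x)         # lazy deletion: mark only, do not remove
        if self.count >= self.M:
            self.full_rebuild(self.S - self.deleted)

Each phase performs on the order of M update operations before paying for one full rebuild of the structure, which is the intuition behind the amortized O(1) bound discussed above.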
A full rebuild of the table of S begins by removing all elements marked as deleted and then setting the next threshold value M to some constant multiple of the size of S. A top-level hash function, which partitions S into s(M) subsets, where the size of subset j is sj, is then repeatedly chosen at random until the squared subset sizes sj² sum to O(M), so that the second-level tables occupy linear space in total. Finally, for each subset j, a second-level hash function is repeatedly chosen at random until it maps the elements of that subset without collisions.
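A full rebuild along these lines could be sketched as follows, reusing the same universal-family helper as in the earlier sketches. This is only an illustration: the choice s(M) = 2M for the number of subsets, the acceptance bound of 8M on the sum of squared subset sizes, and the threshold multiplier are stand-ins for the precise parameters used by Dietzfelbinger et al.

import random

P = (1 << 61) - 1  # large prime for the universal family h(x) = ((a*x + b) % P) % m

def random_hash(m):
    a, b = random.randrange(1, P), random.randrange(0, P)
    return lambda x: ((a * x + b) % P) % m

def full_rebuild(elements, deleted, threshold_multiplier=2, space_bound=8):
    # Returns the next threshold M, the new top-level hash, and one (hash, slots)
    # pair per subset. Lazily deleted keys are dropped before rebuilding.
    S = [x for x in elements if x not in deleted]
    M = threshold_multiplier * max(1, len(S))      # next rebuild threshold
    s = 2 * M                                      # number of subsets s(M) (illustrative)

    while True:                                    # re-pick the partitioning hash until the
        top = random_hash(s)                       # squared subset sizes sum to O(M)
        subsets = [[] for _ in range(s)]
        for x in S:
            subsets[top(x)].append(x)
        if sum(len(b) ** 2 for b in subsets) <= space_bound * M:
            break

    subtables = []
    for b in subsets:                              # perfect-hash each subset into quadratic space
        m = max(1, len(b) ** 2)
        while True:
            h = random_hash(m)
            slots = [None] * m
            for x in b:
                i = h(x)
                if slots[i] is not None:
                    break                          # collision: try another hash function
                slots[i] = x
            else:
                subtables.append((h, slots))
                break
    return M, top, subtables

# Example: rebuild from a set with one tombstoned key.
M, top, subtables = full_rebuild({1, 5, 9, 13, 21}, deleted={9})

Under these illustrative parameters, both retry loops succeed with constant probability per attempt, so the expected time for a full rebuild is linear in the number of live elements.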