[citation needed] Range sum queries may be answered in constant time and linear space by pre-computing an array p of same length as the input such that for every index i, the element pi is the sum of the first i elements of a.
[citation needed] When the function of interest in a range query is a semigroup operator, the notion of
Andrew Yao showed[3] that there exists an efficient solution for range queries that involve semigroup operators.
Several data structures have been devised to solve this problem, we summarize some of the results in the following table.
[1] Recently Jørgensen et al. proved a lower bound on the cell-probe model of
Range median queries cannot be solved by following any of the previous methods discussed above including Yao's approach for semigroup operators.
The following pseudocode of the quickselect algorithm shows how to find the element of rank r in
an unsorted array of distinct elements, to find the range medians we set
recursion calls are done and only a constant number of operations are performed in each of them (to get the value of t fractional cascading should be used).
[7] Finding frequent elements in a given set of items is one of the most important tasks in data mining.
Finding frequent elements might be a difficult task to achieve when most items have similar frequencies.
Boyer and Moore proposed an algorithm to find the majority element of a string (if it has one) in
In the context of Boyer and Moore’s work and generally speaking, a majority element in a set of items (for example string or an array) is one whose number of instances is more than half of the size of that set.
Few years later, Misra and Gries [10] proposed a more general version of Boyer and Moore's algorithm using
comparisons to find all items in an array whose relative frequencies are greater than some threshold
, returns the set of all distinct items that appear more than (or in some publications equal to)
-majority queries on one-dimensional arrays by finding the Lowest common ancestor (LCA) of the endpoints of the query range in a Range tree and validating two sets of candidates (of size
) from each endpoint to the lowest common ancestor in constant time resulting in
Gagie et al.[12] proposed a data structure that supports range
are specified, and the set of all elements that have relative frequencies (inside that rectangular range) greater than or equal to
Many other data structures (as discussed below) have proposed methods for verifying each candidate in constant time and thus maintaining the
Chan et al.[13] proposed a data structure that given a one-dimensional array
To answer such queries, Chan et al.[13] begin by noting that there exists a data structure capable of returning the top-k most frequent items in a range in
Chan et al.[13] first construct a range tree in which each branching node stores one copy of the data structure described above for one-sided range top-k queries and each leaf represents an element from
Each candidate is then assessed in constant time using a linear-space data structure (as described in Lemma 3 in [15]) that is able to determine in
Gagie et al.[16] proposed a data structure which supports queries such that, given two nodes
in a tree, are able to report the list of elements that have a greater relative frequency than
, this does not increase the space complexity since it only adds a constant number of words per node.
Please note that, boundary nodes have to be handled accordingly so that all of these subpaths are disjoint and from all of them a set of
On the other hand, range queries might be extended to other data structures like trees,[8] such as the level ancestor problem.