Datalog

While it is syntactically a subset of Prolog, Datalog generally uses a bottom-up rather than top-down evaluation model.

Datalog has been applied to problems in data integration, networking, program analysis, and more.

The following table maps between Datalog, relational algebra, and SQL concepts: More formally, non-recursive Datalog corresponds precisely to unions of conjunctive queries, or equivalently, negation-free relational algebra.

[1] If constant and variable are two countable sets of constants and variables respectively and relation is a countable set of predicate symbols, then the following BNF grammar expresses the structure of a Datalog program: Atoms are also referred to as literals.

[3] Note that under this definition, Datalog does not include negation nor aggregates; see § Extensions for more information about those constructs.

The set of tuples computed by evaluating the Datalog program is called the intensional database or IDB.

[5] The model-theoretic semantics define the minimal Herbrand model to be the meaning of the program.

The least-fixed-point semantics define the least fixed point of T to be the meaning of the program; this coincides with the minimal Herbrand model.

The proof-theoretic semantics defines the meaning of a Datalog program to be the set of facts with corresponding proof trees.

A top-down reading of the proof trees described above suggests an algorithm for computing the results of such queries.

This reading informs the SLD resolution algorithm, which forms the basis for the evaluation of Prolog.

As mentioned above, each non-recursive Datalog rule corresponds precisely to a conjunctive query.

Extensions implemented in some Datalog engines, such as algebraic data types, can even make the resulting language Turing-complete.

Several extensions have been made to Datalog, e.g., to support negation, aggregate functions, inequalities, to allow object-oriented programming, or to allow disjunctions as heads of clauses.

For example, the language that results from adding negation with the stable model semantics is exactly answer set programming.

When we consider ordered databases, i.e., databases with an order relation on their active domain, then the Immerman–Vardi theorem implies that the expressive power of Datalog is precisely that of the class PTIME: a property can be expressed in Datalog if and only if it is computable in polynomial time.

It is not Turing-complete, and doesn't include basic data types such as integers or strings.

Datalog has been applied to problems in data integration, information extraction, networking, security, cloud computing and machine learning.

[45] The Soufflé dialect has been used to write pointer analyses for Java and a control-flow analysis for Scheme.

[46][47] Datalog has been integrated with SMT solvers to make it easier to write certain static analyses.

[49] Some widely used database systems include ideas and algorithms developed for Datalog.

Proof tree showing the derivation of the ground atom path(x, z) from the program
edge(x, y).
edge(y, z).
path(A, B) :- 
  edge(A, B).
path(A, C) :- 
  path(A, B), 
  edge(B, C).
A parallel Datalog engine was evaluated on the Theta supercomputer at Argonne National Laboratory . [ 9 ]