Relational model

The relational model (RM) is an approach to managing data using a structure and language consistent with first-order predicate logic, in which all data are represented as tuples grouped into relations. It was first described in 1969 by English computer scientist Edgar F. Codd.[1][2]

The purpose of the relational model is to provide a declarative method for specifying data and queries: users directly state what information the database contains and what information they want from it, and let the database management system software take care of describing data structures for storing the data and retrieval procedures for answering queries.

A table in a SQL database schema corresponds to a predicate variable; the contents of a table to a relation; key constraints, other constraints, and SQL queries correspond to predicates.

In their 1995 The Third Manifesto, Date and Darwen try to demonstrate how the relational model can accommodate certain "desired" object-oriented features.

Some years after publication of his 1970 model, Codd proposed a three-valued logic (True, False, Missing/NULL) version of it to deal with missing information, and in his The Relational Model for Database Management Version 2 (1990) he went a step further with a four-valued logic (True, False, Missing but Applicable, Missing but Inapplicable) version.
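The behaviour of a three-valued logic can be illustrated with a short sketch. The one below assumes the usual truth tables for AND, OR and NOT with a third "missing" value (the tables SQL also uses); it is not taken from Codd's text, and the function names and the use of Python's None for the missing value are illustrative choices.

```python
from typing import Optional

# Assumption: None stands in for the third truth value (missing/unknown).
Truth = Optional[bool]

def and3(a: Truth, b: Truth) -> Truth:
    # A single False decides the result, even if the other operand is missing.
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def or3(a: Truth, b: Truth) -> Truth:
    # A single True decides the result, even if the other operand is missing.
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False

def not3(a: Truth) -> Truth:
    # Negating a missing truth value leaves it missing.
    return None if a is None else (not a)
```

Under these tables a restriction condition can evaluate to neither True nor False, so a query has to decide what to do with such tuples; SQL, for example, discards them.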

The heading defines a set of attributes, each with a name and data type (sometimes called a domain).

In this model, databases follow the Information Principle: At any given time, all information in the database is represented solely by values within tuples, corresponding to attributes, in relations identified by relvars.

In general, constraints are expressed using relational comparison operators, of which just one, "is subset of" (⊆), is theoretically sufficient.
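As an illustration of a constraint written with ⊆, a foreign key can be stated as: the projection of one relation on the referencing attribute is a subset of the projection of another relation on the referenced attribute. The sketch below assumes two hypothetical relations, customers and orders, stored as Python sets of positional tuples; none of these names come from the source.

```python
# Hypothetical relations: customers(customer_id, name), orders(order_no, customer_id).
customers = {(1, "Ada"), (2, "Grace")}
orders = {(100, 1), (101, 2)}

def project(relation, index):
    """Project a relation (a set of tuples) onto one attribute position."""
    return {(row[index],) for row in relation}

def satisfies_foreign_key(orders, customers):
    # The constraint: orders projected on customer_id is a subset of
    # customers projected on customer_id. Python's <= on sets is the subset test.
    return project(orders, 1) <= project(customers, 0)
```

With the data above the constraint holds; adding an order for a customer that does not exist, such as (102, 3), makes the subset test fail.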

Users (or programs) request data from a relational database by sending it a query.

Conceptually, this is done by taking all possible combinations of rows (the Cartesian product), and then filtering out everything except the answer.
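The conceptual evaluation described here, product first and filter second, can be sketched in a few lines. The relations employees and departments and their attributes are made up for the example, and real systems do not evaluate joins this way; the sketch only mirrors the definition.

```python
from itertools import product

# Hypothetical relations, each tuple represented as a dict from attribute name to value.
employees = [
    {"name": "Ada", "dept_id": 10},
    {"name": "Grace", "dept_id": 20},
]
departments = [
    {"dept_id": 10, "dept_name": "Research"},
    {"dept_id": 20, "dept_name": "Systems"},
]

def product_then_filter(left, right, predicate):
    """Form every pairing of rows, then keep only those satisfying the predicate."""
    return [
        {**l, **r}                      # merge the two rows into one combined row
        for l, r in product(left, right)
        if predicate(l, r)
    ]

result = product_then_filter(
    employees, departments, lambda e, d: e["dept_id"] == d["dept_id"]
)
```

Of the four pairings in the Cartesian product, only the two with matching dept_id values survive the filter.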

Other sources describe a number of additional operators, many of which can be defined in terms of those listed above.

This has made the idea and implementation of relational databases very popular with businesses.

The body of a relation is a subset of these tuples, representing which propositions are true.

Relational algebra is a set of logical rules that can validly infer conclusions from these propositions.

If this tuple exists in the relation's body, the proposition is true (there is such an employee).
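Reading tuples as propositions can be made concrete with a membership test. In the sketch below the Employee heading, its attribute names and the sample data are assumptions made for illustration.

```python
from typing import NamedTuple

class Employee(NamedTuple):
    # Hypothetical heading: each tuple asserts
    # "there is an employee called <name> with ID <employee_id>".
    name: str
    employee_id: int

# The body: the set of tuples whose propositions are taken to be true.
employee_body = frozenset({Employee("Ada", 1), Employee("Grace", 2)})

def holds(candidate: Employee) -> bool:
    """A proposition is true exactly when its tuple appears in the body."""
    return candidate in employee_body
```

Here holds(Employee("Ada", 1)) is true, while holds(Employee("Ada", 7)) is false because no such tuple appears in the body.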

A candidate key is a unique identifier enforcing that no tuple will be duplicated; duplication would make the relation into something else, namely a bag, by violating the basic definition of a set.
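The uniqueness half of this requirement can be checked mechanically for a given set of rows. The helper below is a sketch with a hypothetical name; it tests only one instance and only uniqueness, whereas a candidate key is a constraint on every possible value of the relation and must also be minimal.

```python
def is_unique_on(rows, key_attrs):
    """Return True if no two rows agree on all of the attributes in key_attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[attr] for attr in key_attrs)
        if key in seen:
            return False  # two rows share the same key value
        seen.add(key)
    return True

# Hypothetical sample data.
employees = [
    {"employee_id": 1, "name": "Ada", "dept": "Research"},
    {"employee_id": 2, "name": "Grace", "dept": "Research"},
]
```

On this data employee_id identifies each row, whereas dept does not, since both rows share the value "Research".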

The DBMS must reject a transaction such as this that would render the database inconsistent by a violation of an integrity constraint.

By joining relvars from the example above we could query the database for all of the Customers, Orders, and Invoices.

If we only wanted the tuples for a specific customer, we would specify this using a restriction condition.
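A join followed by a restriction might look like the sketch below. The attribute names and sample rows for the three relations are invented for the example, and natural_join and restrict are hypothetical helpers rather than the API of any particular system.

```python
def natural_join(left, right):
    """Combine rows that agree on every attribute name the two rows share."""
    joined = []
    for l in left:
        for r in right:
            common = l.keys() & r.keys()
            if all(l[attr] == r[attr] for attr in common):
                joined.append({**l, **r})
    return joined

def restrict(rows, condition):
    """Keep only the rows for which the restriction condition is true."""
    return [row for row in rows if condition(row)]

# Hypothetical sample data for the three relvars.
customers = [
    {"customer_id": 1, "customer_name": "Ada"},
    {"customer_id": 2, "customer_name": "Grace"},
]
orders = [
    {"order_no": 100, "customer_id": 1},
    {"order_no": 101, "customer_id": 2},
]
invoices = [
    {"invoice_no": 9000, "order_no": 100},
    {"invoice_no": 9001, "order_no": 101},
]

everything = natural_join(natural_join(customers, orders), invoices)
for_one_customer = restrict(everything, lambda row: row["customer_id"] == 1)
```

The join yields one combined row per customer, order and invoice that belong together; the restriction then keeps only the rows whose customer_id is 1.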

Attributes are commonly represented as columns, tuples as rows, and relations as tables.

A database relvar (relation variable) is commonly known as a base table.

SQL uses a Null value to indicate missing data; Null has no analog in the relational model.

Our first definition concerns the notion of tuple, which formalizes the notion of row or record in a table. The next definition defines relation, which formalizes the contents of a table as it is defined in the relational model.

It tells us that in every instance of a certain relational schema the tuples can be identified by their values for certain attributes.

In contrast with the relational model, which cannot express recursive queries without introducing a least-fixed-point operator,[11] recursive relations can be defined in Datalog, without introducing any new logical connectives or operators.
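A standard example of such a recursive relation is reachability in a graph, which in Datalog takes two rules: path(X, Y) :- edge(X, Y) and path(X, Z) :- path(X, Y), edge(Y, Z). The sketch below computes the same relation by naive iteration to a least fixed point; the function name and the representation of edges as pairs are assumptions, and actual Datalog engines use more refined evaluation strategies.

```python
def transitive_closure(edges):
    """Least fixed point of: path = edge, plus (path joined with edge)."""
    closure = set(edges)                     # rule 1: every edge is a path
    while True:
        # rule 2: extend each known path by one more edge
        derived = {
            (a, d)
            for (a, b) in closure
            for (c, d) in edges
            if b == c
        }
        if derived <= closure:               # nothing new: fixed point reached
            return closure
        closure |= derived
```

The loop repeats the join until no new pairs appear, which is the step that a fixed, finite relational algebra expression cannot express for paths of arbitrary length.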