Algebraic data type

The values of a sum type are typically grouped into several classes, called variants.

A value of a variant type is usually created with a quasi-functional entity called a constructor.

Algebraic data types were introduced in Hope, a small functional programming language developed in the 1970s at the University of Edinburgh.

[2] One of the most common examples of an algebraic data type is the singly linked list.

Here is an example of how a singly linked list would be declared in Haskell: or Cons is an abbreviation of construct.

For example, Haskell and ML use [] for Nil, : or :: for Cons, respectively, and square brackets for entire lists.

For a slightly more complex example, binary trees may be implemented in Haskell as follows: or Here, Empty represents an empty tree, Leaf represents a leaf node, and Node organizes the data into branches.

For example, the data constructor Leaf is logically a function Int -> Tree, meaning that giving an integer as an argument to Leaf produces a value of the type Tree.

As Node takes two arguments of the type Tree itself, the datatype is recursive.

Operations on algebraic data types can be defined by using pattern matching to retrieve the arguments.

Algebraic data types are highly suited to implementing abstract syntax.

Writing an evaluation function for this language is a simple exercise; however, more complex transformations also become feasible.

Each type of thing is associated with an identifier called a constructor, which can be considered a tag for that kind of data.

To do something with a value of this Tree algebraic data type, it is deconstructed using a process called pattern matching.

Each pattern above has a form that resembles the structure of some possible value of this datatype.

In this case, a lowercase identifier represents a pattern that matches any value, which then is bound to a variable of that name — in this case, a variable "n" is bound to the integer value stored in the data type — to be used in the expression to evaluate.

In the pseudocode example above, programmer diligence is required to not access field2 when the constructor is a Leaf.

Secondly, in pattern matching, the compiler performs exhaustiveness checking to ensure all cases are handled.

Exhaustiveness checking may seem easy for simple patterns, but with many complex recursive patterns, the task soon becomes difficult for the average human (or compiler, if it must check arbitrary nested if-else constructs).

The purpose of both is similar (to extract parts from a piece of data matching certain constraints) however, the implementation is very different.

The Haskell List datatype can also be represented in type theory in a slightly different form, thus:

For the purposes of the List example, these two formulations are not significantly different; but the second form allows expressing so-called nested data types, i.e., those where the recursive type differs parametrically from the original.

(For more information on nested data types, see the works of Richard Bird, Lambert Meertens, and Ross Paterson.)

[3] Many programming languages incorporate algebraic data types as a first class notion, including: