Type system

In computer programming, a type system is a logical system comprising a set of rules that assigns a property called a type (for example, integer, floating point, string) to every term (a word, phrase, or other set of symbols).

Type systems have other purposes as well, such as expressing business rules, enabling certain compiler optimizations, allowing for multiple dispatch, and providing a form of documentation.

As Mark Manasse concisely put it:[3] "The fundamental problem addressed by a type theory is to ensure that programs have meaning. The fundamental problem caused by a type theory is that meaningful programs may not have meanings ascribed to them."

The hardware of a general-purpose computer is unable to discriminate between, for example, a memory address and an instruction code, or between a character, an integer, and a floating-point number, because it makes no intrinsic distinction between any of the possible values that a sequence of bits might mean.

Beyond simple value-type pairs, a virtual "region" of code is associated with an "effect" component describing what is being done with what, enabling it, for example, to "throw" an error report.
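Haskell's IO type is one mainstream realization of recording an effect in a type. A minimal sketch, assuming nothing beyond the standard Prelude, in which the presence or absence of IO in a signature records whether the code may perform effects:

```haskell
-- A pure value: its type guarantees that no effects can occur here.
pureAnswer :: Int
pureAnswer = 42

-- The IO in this type marks the computation as effectful; it may
-- print, read input, or throw before yielding its Int.
effectfulAnswer :: IO Int
effectfulAnswer = do
  putStrLn "computing..."
  return 42

main :: IO ()
main = do
  n <- effectfulAnswer
  print (n + pureAnswer)
```

The type checker prevents the effectful computation from being used where a plain Int is expected, which is the essence of tracking "what is being done" in the type.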

Whether automated by the compiler or specified by a programmer, a type system renders program behavior illegal if it falls outside the type-system rules.

In some languages, such as Haskell, for which type inference is automated, a lint-like tool might be available to the compiler to aid in the detection of errors.

To prove the absence of these defects, other kinds of formal methods, collectively known as program analyses, are in common use.

In addition, software testing is an empirical method for finding errors that such a type checker would not detect.

More generally, most programming languages include mechanisms for dispatching over different 'kinds' of data, such as disjoint unions, runtime polymorphism, and variant types.
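In a language with algebraic data types, a disjoint union and the dispatch over its alternatives can be sketched as follows (the Shape type and its constructors are invented for illustration):

```haskell
-- A disjoint union ("variant type") with three alternatives.
data Shape
  = Circle Double        -- radius
  | Rect Double Double   -- width and height
  | Square Double        -- side length

-- Dispatch over the alternatives by pattern matching; the compiler
-- can warn when a case is missing.
area :: Shape -> Double
area (Circle r) = pi * r * r
area (Rect w h) = w * h
area (Square s) = s * s

main :: IO ()
main = print (area (Rect 3 4))  -- 12.0
```

Each constructor carries its own payload, so the "kind" tag and the data it discriminates travel together, unlike an untagged union.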

Conversely, as of version 4.0, the C# language provides a way to indicate that a variable should not be statically type-checked.

When the compiler knows the exact data types that are in use (which is necessary for static verification, either through declaration or inference) it can produce optimized machine code.

For example, C++ templates are typically more cumbersome to write than the equivalent Ruby or Python code since C++ has stronger rules regarding type definitions (for both functions and variables).

More advanced run-time constructs such as metaclasses and introspection are often harder to use in statically typed languages.

In some languages, such features may also be used e.g. to generate new types and behaviors on the fly, based on run-time data.

In general, there are more precise terms to represent the differences between type systems that lead people to call them "strong" or "weak".

This is for languages where the type system is not sufficiently advanced to precisely specify the validity of operations on all possible operands.

In C terms this is simply undefined behavior and the program may do anything; with a simple compiler it might actually print whatever byte is stored after the string "37".

Additional tools such as lint and IBM Rational Purify can also be used to achieve a higher level of strictness.

He believes this is advantageous, because what he calls mandatory type systems make languages less expressive and code more fragile.

Type systems that allow polymorphism generally do so in order to improve the potential for code re-use: in a language with polymorphism, programmers need only implement a data structure such as a list or an associative array once, rather than once for each type of element with which they plan to use it.
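Sketched in Haskell (with an invented MyList type rather than the built-in list), parametric polymorphism lets one definition serve every element type:

```haskell
-- Defined once, usable with any element type a.
data MyList a = Nil | Cons a (MyList a)

-- len never inspects the elements, so it works at every type.
len :: MyList a -> Int
len Nil         = 0
len (Cons _ xs) = 1 + len xs

main :: IO ()
main = do
  print (len (Cons (1 :: Int) (Cons 2 Nil)))  -- 2
  print (len (Cons "a" Nil))                  -- 1
```

The same len is reused at MyList Int and MyList String without duplication, which is exactly the re-use benefit described above.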

Frequently, these are based on ideas from formal type theory and are only available as part of prototype research systems.

As this impedes the development process, many language implementations provide an easy way out in the form of an option to disable this condition.

Normally this is not possible, as such mutations could cause side effects on parts of the program holding other references to the object, violating referential transparency.

This gives flexibility for choosing types suited to a particular implementation, while clients that use only values of the interface type—the existential type—are isolated from these choices.

The theory is a second-order typed lambda calculus similar to System F, but with existential instead of universal quantification.
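A sketch of this encoding in Haskell, where the ExistentialQuantification extension plays the role of the existential quantifier (the Counter interface and intCounter implementation are invented for illustration):

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- An existential package: the state type s is hidden from clients,
-- who see only the interface (a start state, a step, an observer).
data Counter = forall s. Counter s (s -> s) (s -> Int)

-- One possible implementation; the choice s = Int never leaks out.
intCounter :: Counter
intCounter = Counter (0 :: Int) (+ 1) id

-- A client can only combine the bundled operations, never depend
-- on the concrete representation.
use :: Counter -> Int
use (Counter start inc get) = get (inc (inc start))

main :: IO ()
main = print (use intCounter)  -- 2
```

Another implementation could package, say, a list whose length is the count, and every client such as use would work unchanged.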

Thus, any call to f elsewhere in the program that specifies a non-numeric type (such as a string or list) as an argument would signal an error.
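The f in question is not shown in this excerpt; a minimal Haskell stand-in with the described behavior might be:

```haskell
-- No annotation is written; from the use of (+) the compiler infers
--   f :: Num a => a -> a
f x = x + 1

main :: IO ()
main = do
  print (f (41 :: Int))   -- accepted: Int is a numeric type
  -- print (f "hello")    -- rejected at compile time: String is not numeric
```

The error is reported at the offending call site, before the program runs, rather than when f is executed.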

Most Haskell compilers allow arbitrary-rank polymorphism as an extension, but this makes type inference undecidable, so such types must be annotated by the programmer.
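A sketch using GHC's RankNTypes extension (applyBoth is an invented example): because inference for such types is not possible in general, the signature must be written out by the programmer.

```haskell
{-# LANGUAGE RankNTypes #-}

-- A rank-2 type: the argument f must itself be polymorphic, since it
-- is applied at both Int and Bool. This annotation cannot be inferred.
applyBoth :: (forall a. a -> a) -> (Int, Bool) -> (Int, Bool)
applyBoth f (x, y) = (f x, f y)

main :: IO ()
main = print (applyBoth id (1, True))  -- (1,True)
```

Without the annotation, ordinary Hindley-Milner inference would assign f a single monomorphic type and reject the second application.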

The presence of parametric or ad hoc polymorphism in a language may also have implications for type compatibility.