Markup language

A markup language is a text-encoding system which specifies the structure and formatting of a document and potentially the relationships among its parts.

[2] Older markup languages, which typically focus on typography and presentation, include Troff, TeX, and LaTeX.

Scribe and most modern markup languages, such as XML, identify document components (for example headings, paragraphs, and tables), with the expectation that technology, such as stylesheets, will be used to apply formatting or other processing.
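The separation described above can be illustrated with a minimal Python sketch. It assumes a tiny hypothetical document whose element names (`article`, `heading`, `para`) only identify components, plus two hypothetical "stylesheets" expressed as plain dictionaries; none of these names come from a real stylesheet language.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical document: the markup names components, not their appearance.
DOCUMENT = """<article>
  <heading>Markup</heading>
  <para>Descriptive markup names the parts of a text.</para>
</article>"""

# Two hypothetical stylesheets: each maps an element name to a formatter.
PLAIN_STYLE = {
    "heading": lambda text: text.upper(),
    "para": lambda text: "  " + text,
}
HTML_STYLE = {
    "heading": lambda text: f"<h1>{escape(text)}</h1>",
    "para": lambda text: f"<p>{escape(text)}</p>",
}


def render(source, style):
    """Format each top-level component of the document with the given style."""
    root = ET.fromstring(source)
    lines = []
    for element in root:
        # Components the stylesheet does not mention pass through unchanged.
        formatter = style.get(element.tag, lambda text: text)
        lines.append(formatter((element.text or "").strip()))
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(DOCUMENT, PLAIN_STYLE))
    print(render(DOCUMENT, HTML_STYLE))
```

The same document yields two different presentations without being edited, which is the practical benefit of keeping formatting decisions outside the marked-up text.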

For centuries, this task was done primarily by skilled typographers known as "markup men"[5] or "markers"[6] who marked up text to indicate what typeface, style, and size should be applied to each part, and then passed the manuscript to others for typesetting by hand or machine.

The markup was also commonly applied by editors, proofreaders, publishers, and graphic designers, and indeed by document authors, all of whom might also mark other things, such as corrections and changes.

The first well-known public presentation of markup languages in computer text processing was made by William W. Tunnicliffe at a conference in 1967, although he preferred to call it generic coding.

[10] Brian Reid, in his 1980 dissertation at Carnegie Mellon University, developed the theory and a working implementation of descriptive markup in actual use.

Charles Goldfarb hit upon the basic idea while working on a primitive document management system intended for law firms in 1969, and helped invent IBM GML later that same year.

In 1975, Goldfarb moved from Cambridge, Massachusetts, to Silicon Valley and became a product planner at the IBM Almaden Research Center.

Some early examples of computer markup languages available outside the publishing industry can be found in typesetting tools on Unix systems such as troff and nroff.

The first language to make a clean distinction between structure and presentation was Scribe, developed by Brian Reid and described in his doctoral thesis in 1980.

[13] Scribe was revolutionary in a number of ways, introducing the idea of styles separated from the marked-up document, and a grammar that controlled the usage of descriptive elements.
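The idea of a grammar that controls where descriptive elements may appear can be sketched as follows. This is not Scribe's notation or an SGML DTD; it is a hypothetical Python dictionary that lists which child elements each element may contain, applied to XML input purely for convenience. It checks containment only, not element order or how often an element occurs.

```python
import xml.etree.ElementTree as ET

# Hypothetical grammar: for each element, the child elements it may contain.
GRAMMAR = {
    "report": {"title", "section"},
    "section": {"title", "para"},
    "title": set(),
    "para": set(),
}


def find_violations(element, grammar=GRAMMAR):
    """Return messages for elements that the grammar does not permit."""
    if element.tag not in grammar:
        return [f"unknown element <{element.tag}>"]
    problems = []
    for child in element:
        if child.tag in grammar and child.tag not in grammar[element.tag]:
            problems.append(
                f"<{child.tag}> is not allowed inside <{element.tag}>"
            )
        # Recurse so that nested components are checked as well.
        problems.extend(find_violations(child, grammar))
    return problems


if __name__ == "__main__":
    good = "<report><title>T</title><section><para>p</para></section></report>"
    bad = "<report><para>stray paragraph</para></report>"
    print(find_violations(ET.fromstring(good)))
    print(find_violations(ET.fromstring(bad)))
```

A document that passes such a check can be processed by any tool that knows the grammar, regardless of which style is later applied to it.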

Scribe influenced the development of Generalized Markup Language (later SGML),[14] and is a direct ancestor to HTML and LaTeX.

From the late 1980s onward, most substantial new markup languages have been based on the SGML system, including, for example, TEI and DocBook.

For example, SGML made end-tags (or start-tags, or even both) optional in certain contexts, because its developers thought markup would be done manually by overworked support staff who would appreciate saving keystrokes[citation needed].
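The cost of tag omission falls on the processor, which has to infer the missing tags. The sketch below uses HTML as a stand-in, since HTML kept this SGML feature for elements such as `li` and Python's standard library has no SGML parser. `TagRecorder`, `tag_events`, and `with_implied_li_ends` are hypothetical helper names, and the inference rule is a deliberate simplification that handles only flat, non-nested lists.

```python
from html.parser import HTMLParser


class TagRecorder(HTMLParser):
    """Record the start and end tags that literally appear in the input."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def tag_events(markup):
    recorder = TagRecorder()
    recorder.feed(markup)
    recorder.close()
    return recorder.events


def with_implied_li_ends(events):
    """Insert the </li> events a processor would infer (flat lists only)."""
    result = []
    li_open = False
    for kind, tag in events:
        # A new <li> or the closing </ul> implies the end of an open item.
        if li_open and (kind, tag) in (("start", "li"), ("end", "ul")):
            result.append(("end", "li"))
            li_open = False
        if (kind, tag) == ("start", "li"):
            li_open = True
        elif (kind, tag) == ("end", "li"):
            li_open = False
        result.append((kind, tag))
    return result


if __name__ == "__main__":
    written = tag_events("<ul><li>one<li>two</ul>")
    print(written)                        # no ("end", "li") events appear
    print(with_implied_li_ends(written))  # inferred end tags are added
```

The author saves keystrokes, but every consuming program needs rules like these, which is one reason later SGML-derived formats tended to require explicit end tags.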

Except for the hyperlink tag, the elements of early HTML were strongly influenced by SGMLguid, an in-house SGML-based documentation format at CERN, and very similar to the sample schema in the SGML standard.

The Internet Engineering Task Force (IETF) formally defined HTML as an SGML application with the mid-1993 publication of the first proposal for an HTML specification: the "Hypertext Markup Language (HTML)" Internet-Draft by Berners-Lee and Dan Connolly, which included an SGML Document Type Definition to define the grammar.

[21] Many of the HTML text elements are found in the 1988 ISO technical report TR 9537 Techniques for using SGML, which in turn covers the features of early text formatting languages such as that used by the RUNOFF command developed in the early 1960s for the CTSS (Compatible Time-Sharing System) operating system.

Steven DeRose[22] argues that HTML's use of descriptive markup (and the influence of SGML in particular) was a major factor in the success of the Web, because of the flexibility and extensibility that it enabled.

XML appeared to strike a happy medium between simplicity and flexibility, as well as supporting very robust schema definition and validation tools, and was rapidly adopted for many other uses.

Many XML-based applications now exist, including the Resource Description Framework as RDF/XML, XForms, DocBook, SOAP, and the Web Ontology Language (OWL).

Example of RecipeML, a simple markup language based on XML for creating recipes. The markup can be converted programmatically for display into, for example, HTML, PDF, or Rich Text Format.
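A programmatic conversion of this kind might look like the following Python sketch. The recipe document and its element and attribute names (`recipe`, `title`, `ingredient`, `amount`, `unit`, `step`) are simplified assumptions for illustration and are not taken from the actual RecipeML vocabulary; the HTML layout chosen is likewise just one possibility.

```python
import xml.etree.ElementTree as ET

# Hypothetical recipe markup; the element names are illustrative only.
RECIPE = """<recipe>
  <title>Toast</title>
  <ingredient amount="2" unit="slices">bread</ingredient>
  <ingredient amount="1" unit="tsp">butter</ingredient>
  <step>Toast the bread.</step>
  <step>Spread the butter.</step>
</recipe>"""


def recipe_to_html(source):
    """Build an HTML fragment (as a string) from the recipe markup."""
    recipe = ET.fromstring(source)
    body = ET.Element("div")
    ET.SubElement(body, "h1").text = recipe.findtext("title", default="")

    # Ingredients become an unordered list.
    ingredients = ET.SubElement(body, "ul")
    for item in recipe.findall("ingredient"):
        label = f'{item.get("amount", "")} {item.get("unit", "")} {item.text or ""}'
        ET.SubElement(ingredients, "li").text = " ".join(label.split())

    # Steps become an ordered list, preserving document order.
    steps = ET.SubElement(body, "ol")
    for step in recipe.findall("step"):
        ET.SubElement(steps, "li").text = (step.text or "").strip()

    # Serializing the tree escapes special characters in the text.
    return ET.tostring(body, encoding="unicode")


if __name__ == "__main__":
    print(recipe_to_html(RECIPE))
```

Because the source records what each part is rather than how it looks, a second function could target a different output format from the same file.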