XML

Although the design of XML focuses on documents, the language is widely used for the representation of arbitrary data structures,[8] such as those used in web services.

Further guidelines for the use of XML in a networked context appear in RFC 3470, also known as IETF BCP 70, a document covering many aspects of designing and deploying an XML-based language.

One of the applications of XML in science is the representation of operational meteorology information based on IWXXM standards.

In the case of C1 characters, this restriction is a backwards incompatibility; it was introduced to allow common encoding errors to be detected.

Unicode itself defines encodings that cover the entire repertoire; well-known ones include UTF-8 (which the XML standard recommends using, without a BOM) and UTF-16.

A user whose keyboard offers no method for entering this character could still insert it in an XML document encoded either as &#20013; or &#x4e2d;.
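As an illustration, the equivalence of the decimal and hexadecimal forms of a numeric character reference can be checked with a short sketch. It assumes Python's standard xml.etree.ElementTree module; the element names chars, dec, and hex are invented for the example.

```python
import xml.etree.ElementTree as ET

# The same character written as a decimal and as a hexadecimal numeric
# character reference. The element names are hypothetical.
doc = "<chars><dec>&#20013;</dec><hex>&#x4e2d;</hex></chars>"

root = ET.fromstring(doc)
decimal_text = root.find("dec").text
hex_text = root.find("hex").text

# The parser resolves both references to the same code point, U+4E2D.
print(decimal_text == hex_text)
print(f"U+{ord(decimal_text):04X}")
```

The parser resolves the references before the application sees the text, so the two spellings cannot be told apart after parsing.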

[21] XML's policy in this area has been criticized as a violation of Postel's law ("Be conservative in what you send; be liberal in what you accept").

Such schema languages typically constrain the set of elements that may be used in a document, which attributes may be applied to them, the order in which they may appear, and the allowable parent/child relationships.

The oldest schema language for XML is the document type definition (DTD), inherited from SGML.

RELAX NG has a simpler definition and validation framework than XML Schema, making it easier to use and implement.

DSDL includes RELAX NG full and compact syntax, Schematron assertion language, and languages for defining datatypes, character repertoire constraints, renaming and entity expansion, and namespace-based routing of document fragments to different validators.

DTDs and XSDs both have this ability; they can for instance provide the infoset augmentation facility and attribute defaults.
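Attribute defaulting can be sketched as follows, assuming Python's standard xml.etree.ElementTree module, whose underlying Expat parser reads an internal DTD subset without validating against it. The note element and its priority attribute are invented for the example.

```python
import xml.etree.ElementTree as ET

# The internal DTD subset declares a default value for the "priority"
# attribute. The document instance itself never sets that attribute.
doc = """<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
  <!ATTLIST note priority CDATA "normal">
]>
<note>Call back later</note>"""

note = ET.fromstring(doc)

# The parser reports the declared default as if it had been written
# on the start-tag.
priority = note.get("priority")
print(priority)
```

Here the application receives more information than the element's start-tag contains, because the default comes from the declaration rather than from the instance.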

Some other specifications conceived as part of the "XML Core" have failed to find wide adoption, including XInclude, XLink, and XPointer.

Tree-traversal and data-binding APIs typically require the use of much more memory, but are often found more convenient for use by programmers; some include declarative retrieval of document components via the use of XPath expressions.
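A minimal sketch of declarative retrieval from an in-memory tree follows. It assumes Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath; the library document and its element names are invented for the example.

```python
import xml.etree.ElementTree as ET

doc = """<library>
  <book year="1998"><title>XML Basics</title></book>
  <book year="2004"><title>Schemas in Practice</title></book>
  <journal year="1998"><title>Markup Quarterly</title></journal>
</library>"""

# The whole document is parsed into a tree held in memory.
root = ET.fromstring(doc)

# Path expressions describe what to retrieve rather than how to walk the tree.
book_titles = [t.text for t in root.findall("book/title")]
titles_1998 = [t.text for t in root.findall(".//*[@year='1998']/title")]

print(book_titles)
print(titles_1998)
```

Because the complete tree is available, any part of the document can be queried in any order, at the cost of keeping all of it in memory.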

XSLT is designed for declarative description of XML document transformations, and has been widely implemented both in server-side packages and Web browsers.

SAX is fast and efficient to implement, but difficult to use for extracting information at random from the XML, since it tends to burden the application author with keeping track of what part of the document is being processed.
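The bookkeeping burden can be seen in a short sketch, assuming Python's standard xml.sax module. The TitleCollector class and the library document are invented for the example; the handler has to maintain its own stack of open elements to know where in the document each event occurs.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Collects the text of /library/book/title elements (hypothetical names)."""

    TARGET = ["library", "book", "title"]

    def __init__(self):
        super().__init__()
        self.path = []       # stack of currently open element names
        self.titles = []
        self._buffer = None  # text pieces of the title being read, if any

    def startElement(self, name, attrs):
        self.path.append(name)
        if self.path == self.TARGET:
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so it is accumulated.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if self._buffer is not None and self.path == self.TARGET:
            self.titles.append("".join(self._buffer))
            self._buffer = None
        self.path.pop()


doc = b"""<library>
  <book><title>XML Basics</title></book>
  <journal><title>Markup Quarterly</title></journal>
  <book><title>Schemas in Practice</title></book>
</library>"""

handler = TitleCollector()
xml.sax.parseString(doc, handler)
print(handler.titles)
```

The parser itself keeps no tree, so all state needed to tell a book title from a journal title lives in the application's handler.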

Pull parsing treats the document as a series of items read in sequence using the iterator design pattern.

[25] Examples of pull parsers include Data::Edit::Xml in Perl, StAX in the Java programming language, XMLPullParser in Smalltalk, XMLReader in PHP, ElementTree.iterparse in Python, SmartXML in Red, System.Xml.XmlReader in the .NET Framework, and the DOM traversal API (NodeIterator and TreeWalker).

A pull parser creates an iterator that sequentially visits the various elements, attributes, and data in an XML document.

This can make it easier to write code that is both correct and efficient.
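Pull parsing can be sketched with ElementTree.iterparse from Python's standard library. The log document and its element names are invented for the example, and io.BytesIO stands in for a file on disk.

```python
import io
import xml.etree.ElementTree as ET

doc = b"""<log>
  <entry level="info">started</entry>
  <entry level="error">disk full</entry>
  <entry level="info">stopped</entry>
</log>"""

errors = []

# The application drives the parser by iterating: each step of the loop
# pulls the next (event, element) pair from the document.
for event, elem in ET.iterparse(io.BytesIO(doc), events=("end",)):
    if elem.tag == "entry":
        if elem.get("level") == "error":
            errors.append(elem.text)
        # Discard the element's content once handled to limit memory use.
        elem.clear()

print(errors)
```

Control flow stays in ordinary application code, so the position in the document is implied by the position in the loop rather than tracked in separate handler state.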

[28] The Resource Description Framework defines a data type rdf:XMLLiteral to hold wrapped, canonical XML.

[29] Facebook has produced extensions to the PHP and JavaScript languages that add XML to the core syntax in a similar fashion to E4X, namely XHP and JSX respectively.

[30] The versatility of SGML for dynamic information display was understood by early digital media publishers in the late 1980s prior to the rise of the Internet.

Dan Connolly added SGML to the list of W3C's activities when he joined the staff in 1995; work began in mid-1996 when Sun Microsystems engineer Jon Bosak developed a charter and recruited collaborators.

[35] James Clark served as Technical Lead of the Working Group, notably contributing the empty-element syntax and the name "XML".

Halfway through the project, Bray accepted a consulting engagement with Netscape, provoking vociferous protests from Microsoft.

This led to an intense dispute in the Working Group, eventually resolved by the appointment of Microsoft's Jean Paoli as a third co-editor.

Other sources of technology for XML were the TEI (Text Encoding Initiative), which defined a profile of SGML for use as a "transfer syntax", and HTML.

The notion of well-formedness as opposed to validity (which enables parsing without a schema) was first formalized in XML, although it had been implemented successfully in the Electronic Book Technology "Dynatext" software;[38] the software from the University of Waterloo New Oxford English Dictionary Project; the RISP LISP SGML text processor at Uniscope, Tokyo; the US Army Missile Command IADS hypertext system; Mentor Graphics Context; Interleaf and Xerox Publishing System.

This is accomplished by automatically creating a mapping between the elements declared in the document's XML Schema (XSD) and the members of a class to be represented in memory.
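The idea of such a mapping can be sketched in a simplified form. Data-binding tools typically generate the class from the schema; in this sketch a hand-written Python dataclass stands in for the generated class, and the Book class, the bind function, and the element names are all invented for the example.

```python
from dataclasses import dataclass, fields
from typing import get_type_hints
import xml.etree.ElementTree as ET


@dataclass
class Book:
    # Stand-in for a class that a binding tool would derive from a schema.
    title: str
    year: int
    price: float


def bind(element, cls):
    """Map child elements onto same-named members of cls (hypothetical helper)."""
    hints = get_type_hints(cls)
    values = {}
    for field in fields(cls):
        child = element.find(field.name)
        # The declared member type converts the element's text content.
        values[field.name] = hints[field.name](child.text)
    return cls(**values)


doc = "<book><title>XML Basics</title><year>1998</year><price>29.5</price></book>"
book = bind(ET.fromstring(doc), Book)
print(book)
```

After binding, the application works with typed members such as book.year instead of navigating elements and converting text by hand.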