Formal Public Identifier

Some of the most common uses of formal public identifiers (FPIs) are in document type declarations (DOCTYPEs) and document type definitions (DTDs) in SGML, XML and, historically, HTML; they are also used in the vCard and iCalendar file formats to identify the software product that generated the file.

[1]: 384–385  Owners which use unregistered identifiers include the W3C (-//W3C),[6]: 8–9  the Internet Engineering Task Force (-//IETF),[7] the United States Department of Defense (-//USA-DOD),[8] the European Parliament (-//EP)[9] and others.

[2][10]: 63 A registered owner prefix conforming to ISO 9070 may be one of the following:[2] Text identifiers can be broken down into the class, description and language.

The text identifier may optionally contain a version indicator after the language, also separated by a double slash.

[14] For all other FPIs (i.e. those where the class is not CHARSET), the part following the description is a public text language which is a sequence of uppercase letters, strongly encouraged (but not mandated) to be an ISO 639-1 code.

[1]: 387–388  Stopping short of mandating an ISO 639-1 code spares validating software from checking whether the language is an ISO 639-1 code, and also allows for extensibility.[1]: 387  For example, a small number of FPIs used in practice use ISO 639-3 codes (such as NDS for Low German)[15] or IETF language tags with hyphens removed (such as SRLATN for Serbian written in Gajica)[16] where ISO 639-1 codes are insufficient to distinguish a resource from versions in other languages or language varieties.

[17] Additionally, except for CHARSET, CAPACITY, NOTATION and SYNTAX FPIs, for which the designating sequence or language must be the final part,[1]: 390  the language code may be followed by another // pair,[1]: 385  followed by a public text display version, which specifies a particular platform that the implementation of SGML entities should target.
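The structure described above (owner, then a text identifier made of class, description and language, optionally followed by a display version) can be illustrated with a minimal Python sketch. The names `FPI`, `parse_fpi` and `FINAL_LANGUAGE_CLASSES` are hypothetical, and the sketch assumes that splitting on `//` is sufficient; it does not attempt to cover every production in ISO 8879.

```python
from dataclasses import dataclass
from typing import Optional

# Classes for which the language or designating sequence must be the
# final part, so no display version may follow (per the text above).
FINAL_LANGUAGE_CLASSES = {"CHARSET", "CAPACITY", "NOTATION", "SYNTAX"}


@dataclass
class FPI:
    owner_prefix: str            # "+", "-", or "" when no prefix is present
    owner: str
    public_class: str
    description: str
    language: str                # designating sequence when class is CHARSET
    display_version: Optional[str] = None


def parse_fpi(fpi: str) -> FPI:
    """Split an FPI into its parts (illustrative, not a full validator)."""
    parts = fpi.split("//")
    prefix = ""
    if parts[0] in ("+", "-"):
        prefix, parts = parts[0], parts[1:]
    if len(parts) not in (3, 4):
        raise ValueError(f"unexpected number of '//'-separated parts: {fpi!r}")

    owner, text, language = parts[0], parts[1], parts[2]
    display_version = parts[3] if len(parts) == 4 else None

    # The class is the first word of the text identifier; the rest is
    # the description.
    public_class, _, description = text.partition(" ")

    if display_version is not None and public_class in FINAL_LANGUAGE_CLASSES:
        raise ValueError(f"{public_class} FPIs cannot carry a display version")

    # For non-CHARSET classes the language is a run of uppercase letters.
    if public_class != "CHARSET" and not (
        language.isascii() and language.isalpha() and language.isupper()
    ):
        raise ValueError(f"language must be uppercase letters: {language!r}")

    return FPI(prefix, owner, public_class, description, language, display_version)
```

Under these assumptions, `parse_fpi("-//W3C//DTD HTML 4.01//EN")` yields the owner `W3C`, the class `DTD`, the description `HTML 4.01` and the language `EN`, with no display version.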

[29] The effect of a Formal Public Identifier on its host document is unusual in that it can depend not only on its own syntactic correctness and the behaviour of the program parsing it, but also on the ISO registration status of the organisation responsible for the schema referenced by the FPI.

[1]: 186 The constraints of formal (as opposed to informal) public identifiers are an optional feature, because the specification for FPIs was introduced late in the development of ISO 8879. Use of FPIs for public identifiers is nevertheless strongly recommended: the FPI structure ensures that FPIs assigned by one owner do not collide with those assigned by other owners (except in the case of unregistered owners with colliding names), whereas informal public identifiers have no uniqueness guarantee, so those assigned by one owner may collide with formal or informal public identifiers assigned by another.

[1]: 186  Interpretation of public identifiers using the formal structure, which requires public identifiers to be FPIs, can be enabled within the SGML declaration using the FORMAL feature name.

[1]: 64, 88, 378 System identifiers, by contrast, have no structure defined by SGML itself—they might be filenames, database keys or even addresses for indexable storage—but are interpreted by the SGML system's entity manager component to identify the location of the entity.

[1]: 378  As such, ISO/IEC 8879 itself does not use the term formal system identifier (FSI), which is instead defined in an amendment to ISO/IEC 10744 (HyTime).

XSD can (unlike DTDs) be validated using the same tools as any other XML document,[40] includes support for XML namespaces (which DTDs can only interpret as fixed portions of the element and attribute names in question),[41] allows regular expression constraints to be placed on the format of text data such as telephone numbers, and is better able to express complex content-model structures.

[40] Thus, it is less common for XML formats to use a DTD (which might use FPIs for notations or external entities), and consequently less common for an XML document to contain a DOCTYPE referencing a DTD, either by FPI or only by URI (although a DOCTYPE may still be used for entity definitions embedded within the XML file itself).

[43] Similarly, the DocBook format, which initially used a document type declaration identifying a DTD by an FPI, switched its primary schema definition from DTD to RELAX NG in version 5.0, and ceased to use document type declarations at that time,[44] and Scalable Vector Graphics (SVG) did the same in version 1.2.

[50] The above DTD FPI mapping is represented as follows:[49] HTML versions 2 through 4 (including the XML-based XHTML 1.x) were defined as profiles of SGML, and specified with an SGML declaration and a document type definition (DTD).

[6]: 6–7  One example of such a custom system identifier without an associated FPI is:[52] Since they were principally intended for use by SGML validators, document type declarations were initially ignored by browsers.

[37]: 17–18 "Quirks mode" retained legacy behaviour from earlier browser versions to avoid breaking existing pages—for example, Internet Explorer versions 6 and 7 would render the page using the Internet Explorer 5.5 box model.

[54] The XML representation (XHTML), by contrast, is permitted but not required to bear a DOCTYPE, and no validating DTD is provided for the HTML 5 schema.

The WHATWG HTML standard lists the FPIs that trigger quirks mode.

Mostly, these are specified as prefixes including the owner, class and description (but matching any language part).
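The prefix matching described above can be sketched in Python as follows. The function name `triggers_quirks_mode` and the tuple `SAMPLE_QUIRKS_PREFIXES` are hypothetical, and the two prefixes shown are only an illustrative sample rather than the standard's full list, which should be consulted directly. The sketch also assumes case-insensitive comparison and ignores the other conditions in the standard, such as exact-match entries and the system identifier.

```python
# Illustrative sample only; the authoritative list is in the WHATWG
# HTML standard and is considerably longer.
SAMPLE_QUIRKS_PREFIXES = (
    "-//IETF//DTD HTML 2.0//",
    "-//W3C//DTD HTML 3.2 Final//",
)


def triggers_quirks_mode(public_id: str, prefixes=SAMPLE_QUIRKS_PREFIXES) -> bool:
    """Return True if public_id starts with any listed prefix.

    Each prefix ends just before the language part, so any language
    (EN, FR, ...) matches. Comparison is case-insensitive (an assumption
    of this sketch).
    """
    lowered = public_id.lower()
    return any(lowered.startswith(prefix.lower()) for prefix in prefixes)
```

Because each prefix stops at the final double slash, `-//W3C//DTD HTML 3.2 Final//EN` and a hypothetical `-//W3C//DTD HTML 3.2 Final//FR` are treated identically.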

[52] Increasingly, specifications use URIs rather than FPIs to handle the task of unique identification.

A Uniform Resource Name (URN) namespace has been defined to allow any FPI to be rewritten as a URI,[11] replacing double slashes with colons.
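As a rough illustration of this rewriting, the following Python sketch converts an FPI into a `urn:publicid:` URN. The function name `fpi_to_urn` is hypothetical, and the character mapping reflects one reading of RFC 3151 (double slash to colon, double colon to semicolon, space to plus sign, and percent-encoding of a few reserved characters); it should be checked against the RFC before being relied upon.

```python
import re

# Single-character substitutions, applied after the two-character
# sequences "//" and "::" have been handled (assumed from RFC 3151).
SINGLE_CHAR_MAP = {
    " ": "+",
    "+": "%2B",
    ":": "%3A",
    "/": "%2F",
    ";": "%3B",
    "'": "%27",
    "?": "%3F",
    "#": "%23",
    "%": "%25",
}


def fpi_to_urn(fpi: str) -> str:
    """Rewrite a public identifier as a urn:publicid: URN (sketch)."""
    # Trim the ends and collapse internal whitespace runs to one space.
    text = re.sub(r"\s+", " ", fpi.strip())
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair == "//":
            out.append(":")
            i += 2
        elif pair == "::":
            out.append(";")
            i += 2
        else:
            out.append(SINGLE_CHAR_MAP.get(text[i], text[i]))
            i += 1
    return "urn:publicid:" + "".join(out)
```

With this mapping, `-//W3C//DTD HTML 4.01//EN` becomes `urn:publicid:-:W3C:DTD+HTML+4.01:EN`.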