List of XML and HTML character entity references

In a numeric character reference of the form &#xhhhh; (or &#nnnn;), the hhhh (or nnnn) may be any number of hexadecimal (or decimal) digits and may include leading zeros.
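As a minimal sketch of this rule, the following Python snippet decodes several spellings of one numeric character reference with the standard-library function html.unescape. The code point U+0041 ("A") is chosen only for illustration.

```python
import html

# Four spellings of the same code point, U+0041 (LATIN CAPITAL LETTER A):
# hexadecimal and decimal, each with and without leading zeros.
references = ["&#x41;", "&#x0041;", "&#65;", "&#0065;"]

# Decode each reference to the character it denotes.
decoded = [html.unescape(ref) for ref in references]
print(decoded)
```

All four references should decode to the same single character, since leading zeros do not change the numeric value.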

The entity must either be predefined (built into the markup language) or be explicitly declared in a Document Type Definition (DTD) (see [a]).
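The difference between predefined and declared entities can be sketched with Python's standard xml.etree.ElementTree parser. The element name doc and the sample strings are hypothetical, and the entity nbsp is declared here only as an example of an entity that XML itself does not predefine.

```python
import xml.etree.ElementTree as ET

# &lt; is predefined by XML, so it needs no declaration.
predefined = ET.fromstring("<doc>a &lt; b</doc>").text

# &nbsp; is not predefined in XML; without a declaration the parser
# is expected to reject the document.
try:
    ET.fromstring("<doc>a&nbsp;b</doc>")
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False

# Declaring the entity in an internal DTD subset makes the reference usable.
declared = ET.fromstring(
    '<!DOCTYPE doc [<!ENTITY nbsp "&#160;">]><doc>a&nbsp;b</doc>'
).text

print(repr(predefined), undeclared_ok, repr(declared))
```

The declaration maps the name to a numeric character reference, so the parsed text contains the no-break space character itself rather than the entity name.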

HTML5 defines many named entities, references to which act as mnemonic aliases for certain Unicode characters.[5]

The HTML5 specification does not allow users to define additional entities, because it no longer permits a DTD to be referenced or extended inside HTML documents. A DTD is still needed in XHTML, which follows stricter XML parsing rules but allows a DTD to be referenced or defined in the document header, because XML does not predefine most HTML entities.
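As an illustration, Python's standard library exposes the HTML5 named character references as the mapping html.entities.html5, and html.unescape decodes them. The sketch below assumes that behaviour; the name foobar is a hypothetical, undefined entity name used to show that names outside the fixed list are not treated as entities.

```python
import html
from html.entities import html5

# Keys of the html5 mapping include the trailing semicolon.
eacute = html5["eacute;"]

# A defined name is replaced by the character it aliases.
decoded = html.unescape("caf&eacute;")

# A name outside the HTML5 list cannot be declared by the author,
# so the reference is left untouched.
unknown = html.unescape("&foobar;")

print(eacute, decoded, unknown)
```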

Notably, there are no predefined HTML character entities for controls that were added to the UCS/Unicode and formally defined in version 2 of the Unicode Bidi Algorithm.

However, all valid characters and sequences in the UCS, including all bidirectional controls or private-use assignments (but with the exception of non-whitespace C0 and C1 controls, non-characters, and surrogates) are also usable and valid in HTML, XML, XHTML and MathML, either in plain-text values of attributes or in text elements (by encoding them directly as plain text, or using numeric character references when needed).
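A short Python sketch of the numeric-reference fallback follows. It uses U+2066 (LEFT-TO-RIGHT ISOLATE) as an assumed example of a bidirectional control without a named entity, and checks that assumption against the standard library's html.entities.html5 mapping rather than against the HTML specification itself.

```python
import html
import unicodedata
from html.entities import html5

# Example character: U+2066, chosen here only for illustration.
LRI = "\u2066"
name = unicodedata.name(LRI)

# Check whether any HTML5 named entity expands to text containing it.
has_named_entity = any(LRI in value for value in html5.values())

# Build a hexadecimal numeric character reference for the character,
# then decode it again to confirm the round trip.
numeric_ref = f"&#x{ord(LRI):X};"
round_trip = html.unescape(numeric_ref)

print(name, has_named_entity, numeric_ref, round_trip == LRI)
```

The same pattern applies to any other valid character that lacks a named entity: its code point is written as a decimal or hexadecimal numeric character reference.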