Script (Unicode)

In Unicode, a script is a collection of letters and other written signs used to represent textual information in one or more writing systems.

When multiple languages make use of the same script, there are frequently some differences, particularly in diacritics and other marks.

Despite these peripheral differences, the Swedish and English writing systems are said to use the same Latin script.
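As a rough illustration, the following Python sketch inspects the Swedish letters å, ä and ö. Python's standard unicodedata module does not expose the Unicode Script property, so the sketch assumes that the first word of a character's name is a usable stand-in for its script; this is a heuristic, not the official property. The helper name describe is hypothetical and exists only for this example.

```python
import unicodedata

def describe(ch):
    """Return (first word of the character name, decomposition mapping).

    The name prefix is only a heuristic stand-in for the Script property,
    which the standard library does not provide.
    """
    prefix = unicodedata.name(ch).split()[0]
    return prefix, unicodedata.decomposition(ch)

# Precomposed Swedish letters: a with ring above, a with diaeresis,
# o with diaeresis.
for ch in "\u00e5\u00e4\u00f6":
    prefix, decomposition = describe(ch)
    print(f"U+{ord(ch):04X}", prefix, decomposition)
```

Under that assumption, each letter is reported with the prefix LATIN, and its decomposition mapping names a basic Latin letter followed by a combining mark, which is the kind of peripheral difference the text describes.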

The term complex system is sometimes used to describe writing systems where the admixture makes classification problematic.

Such titlecase ligatures are all in the Latin and Greek scripts and are all compatibility characters, and therefore Unicode discourages their use by authors.
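Titlecase letters can be listed with a short Python sketch, since they carry the general category Lt. The sketch relies only on the standard unicodedata module; its results depend on the Unicode version bundled with the Python interpreter in use, and the function name titlecase_letters is hypothetical.

```python
import sys
import unicodedata

def titlecase_letters():
    """Yield every character whose general category is Lt (titlecase letter)."""
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        if unicodedata.category(ch) == "Lt":
            yield ch

letters = list(titlecase_letters())
for ch in letters:
    # Print the code point and name only, so the output stays ASCII.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# U+01C5 is one of the Latin titlecase digraphs. Its decomposition mapping
# carries the <compat> tag, marking it as a compatibility character.
print(unicodedata.decomposition("\u01c5"))
```

The character names printed by the loop begin with either LATIN or GREEK, which matches the statement about the two scripts involved.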

A few scripts do differentiate between uppercase and lowercase, however, including Latin, Cyrillic, Greek, Armenian, Georgian, and Deseret.
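Whether a character takes part in a case distinction can be probed with Python's built-in string methods. The sketch below is a minimal check under one assumption: a character counts as cased when its lowercase and uppercase mappings differ. That test misses the few cased letters that have no counterpart in the other case. The name has_case and the sample characters are chosen for illustration only.

```python
def has_case(ch):
    """Heuristic: treat a character as cased if lower() and upper() differ."""
    return ch.lower() != ch.upper()

samples = {
    "Latin": "a",
    "Cyrillic": "\u044f",     # CYRILLIC SMALL LETTER YA
    "Greek": "\u03b1",        # GREEK SMALL LETTER ALPHA
    "Armenian": "\u0561",     # ARMENIAN SMALL LETTER AYB
    "Deseret": "\U00010428",  # DESERET SMALL LETTER LONG I
    "Hebrew": "\u05d0",       # HEBREW LETTER ALEF
    "Hiragana": "\u3042",     # HIRAGANA LETTER A
}

for script, ch in samples.items():
    print(script, has_case(ch))
```

The first five samples report True and the Hebrew and Hiragana samples report False, since those two scripts have no case distinction.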

Scripts can also contain characters of any other general category, such as marks (diacritic and otherwise), numbers (numerals), punctuation, separators (word separators such as spaces), symbols and non-graphical format characters.
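The general categories mentioned here can be looked up with Python's standard unicodedata module. The following sketch prints the two-letter category code for one sample character of each kind; the sample characters and the variable names are chosen for illustration and are not taken from the standard.

```python
import unicodedata

# One representative character per kind of general category.
samples = [
    ("\u0301", "mark: combining acute accent"),
    ("5", "number: digit five"),
    ("!", "punctuation: exclamation mark"),
    (" ", "separator: space"),
    ("+", "symbol: plus sign"),
    ("\u200d", "format: zero width joiner"),
]

categories = {}
for ch, label in samples:
    # category() returns a two-letter code such as "Mn" or "Nd".
    categories[label] = unicodedata.category(ch)
    print(f"U+{ord(ch):04X} {categories[label]} {label}")
```

The first letter of each code gives the major class (M for mark, N for number, P for punctuation, Z for separator, S for symbol, C for other, which includes format characters).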

Some script codes defined by ISO 15924 are not used in Unicode, including Zsym (Symbols) and Zmth (Mathematical notation).

The Missing Scripts project (with contributors from the Mainz University of Applied Sciences, the Atelier national de recherche typographique (ANRT) in Nancy, and the University of California, Berkeley) has compiled a list of 131 scripts that have not yet been encoded in the Unicode Standard, out of a total of 294 recognized scripts according to the current state of research.