Chinese character classification

A small number of characters originated as pictographs and ideographs, but the vast majority are phono-semantic compounds, which combine a component hinting at meaning with a component hinting at pronunciation.

A traditional six-fold classification scheme was originally popularized in the 2nd century CE, and remained the dominant lens for analysis for almost two millennia, but with the benefit of a greater body of historical evidence, recent scholarship has variously challenged and discarded those categories.

In older literature, Chinese characters are often referred to as "ideographs", inheriting a historical misconception about Egyptian hieroglyphs.

In some special cases, characters may denote non-morphemic syllables as well; due to this, written Chinese is often characterised as morphosyllabic.[3][a]

Logographs may be contrasted with letters in an alphabet, which generally represent phonemes, the distinct units of sound used by speakers of a language.[5]

Despite their origins in picture-writing, Chinese characters are no longer ideographs capable of representing ideas directly; their comprehension relies on the reader's knowledge of the particular language being written.[6]

The areas where Chinese characters were historically used, sometimes collectively termed the Sinosphere, have a long tradition of lexicography attempting to explain and refine their use; for most of history, analysis revolved around a model first popularized in the 2nd-century Shuowen Jiezi dictionary.

In support of this second reading, he points to other characters with the same 女 component that had similar pronunciations in Old Chinese: 妟; yàn ← *ʔrans 'tranquil', 奻; nuán ← *nruan 'to quarrel' and 姦; jiān ← *kran 'licentious'.[29]

Other scholars reject these arguments for alternative readings and consider other explanations of the data more likely, for example viewing 妟 as a reduced form of 晏, which can be analysed as a phono-semantic compound with 安 as phonetic.[30]

Notably, Christopher Button has shown how more sophisticated palaeographical and phonological analyses can account for the examples of Boodberg and Boltz without relying on polyphony.[31]

While compound ideographs are a limited source of Chinese characters, they form many kokuji created in Japan to represent native words.

Some loangraphs (假借; jiǎjiè; 'borrowing') are introduced to represent words previously lacking another written form—this is often the case with abstract grammatical particles such as 之 and 其.

[34] As with Egyptian hieroglyphs and cuneiform, early Chinese characters were used as rebuses to express abstract meanings that were not easily depicted.

However, the barrier between a character's pronunciation and meaning is never total: when transcribing into Chinese, loangraphs are often chosen deliberately so as to create certain connotations.

According to Bernhard Karlgren (1889–1978), "One of the most dangerous stumbling-blocks in the interpretation of pre-Han texts is the frequent occurrence of loan characters."[38]

Phono-semantic compounds (形声; 形聲; xíngshēng; 'form and sound' or 谐声; 諧聲; xiéshēng; 'sound agreement') represent most of the modern Chinese lexicon.

The verb mù could have simply been written 木, but to disambiguate it was compounded with the character for 'water', which gives some idea of the word's meaning.

[39] Nonetheless, all characters containing 俞 are pronounced in Standard Chinese as various tonal variants of yu, shu, tou, and the closely related you and zhu.

Basic examples of pure signs are found with the numerals beyond four, e.g. 五 ('five') and 八 ('eight'), whose forms do not give visual hints to the quantities they represent.

A common portmanteau is 甭 (béng; 'needn't'), which is a graphical ligature of 不用 (bùyòng) that is pronounced as a fusion of bù and yòng.

However, this character was also created at an earlier date as 甭 (qì; 'to abandon'), where it instead functions as a true compound ideograph that represents a single unrelated morpheme.

The Shuowen Jiezi ultimately popularized the six-category model that would serve as the foundation of traditional Chinese lexicography for the next two millennia.

Xu was not the first to use the term: it first appeared in the Rites of Zhou (2nd century BCE), though it may not have originally referred to methods of creating characters.

When Liu Xin (d. 23 CE) edited the Rites, he used the term 'six categories' alongside a list of six character types, but he did not provide examples.

Oracle bone script is the direct ancestor of modern written Chinese, and is already a mature writing system in its earliest attestation.

Despite millennia of change in shape, usage, and meaning, a few of these characters remain recognizable to modern Chinese readers.

Xu gave the example of 考 kǎo 'to verify' with 老 lǎo 'old', which had similar Old Chinese pronunciations of *‍khuʔ and *‍C-ruʔ[e] respectively.