General Chinese

The result is a syllabary of 2082 syllables, about 80% of which are single morphemes—that is, in 80% of cases there is no difference between GC and standard written Chinese, and in running text, that figure rises to 90–95%, as the most common morphemes tend to be uniquely identified.

"About 20 percent of the syllables are homophones under each of which there will be more than one morpheme, [which are traditionally] usually written with different characters [...] The degree of homophony is so low that it will be possible to write text either in literary or colloquial Chinese with the same character for each syllable [...] as has been tested in texts of various styles."

Romanized General Chinese has distinct symbols for the onsets (many of them digraphs, and a few trigraphs) and the rimes distinguished by any of the control dialects.

General Chinese also maintains the "round-sharp" distinction, such as sia vs. hia, though those are both xia in Beijing Mandarin.

Indeed, Chao characterized GC as having "the initial consonants of the Wu dialects [...], the vowels of Mandarin, and the endings of Cantonese."[6]

Like Chao's other invention, Gwoyeu Romatzyh, romanized General Chinese uses tone spelling.

This is because Chao ran frequency tests, and used single letters for the most common consonants and vowels, while restricting digraphs and trigraphs to the more infrequent ones.

An example of Romanized General Chinese can be illustrated with Chao's name:

* [...] is reduced to [...] in downtown accents.

These pronunciations are all predictable given the General Chinese transcription, though it was not designed with the Sinospheric languages specifically in mind.

The convention ⟨q⟩ for nasal 疑, which drops in many dialects, is repeated in the finals, where it represents *[ŋ] with a departing tone.

Although the spelling of the onsets is to some extent systematic (the retroflex series are digraphs ending in ⟨r⟩, for example), this regularity is overridden in many cases by the principle of using short transcriptions for common sounds.

Thus ⟨z⟩ is used for 精 rather than for the less common 邪, where it might also be expected; ⟨v⟩ is used for frequent 微 rather than for 奉; and ⟨c⟩ and ⟨g⟩, for the high-frequency 見 and 羣, have the additional benefit of being familiar in their palatalized forms (Peking ~ Beijing for example is -⟨cieng⟩) from English words like cello and gem.

An exception is Cantonese, where in the rising tone they are aspirated in colloquial speech, but tenuis in reading pronunciations.

They need to be considered as a unit because of a strong historical interaction between vowel and coda in Chinese dialects.

In Wu, Min (generally), New Xiang (Hunanese), Jin, and in the Lower Yangtze and Minjiang dialects of Mandarin, these codas conflate to glottal stop /ʔ/.

In others, such as Gan, they are reduced to [t], [k], while Yue dialects, Hakka, and Old Xiang maintain the original [p], [t], [k] system.

[12] In Cantonese, the simple vowels i u iu o a e are [iː uː yː ɔː aː ɛː], apart from ⟨i⟩ and ⟨iu⟩ after velars, which open to diphthongs, as in ci [kei] and ciu [kɵy].

Diphthongs may vary markedly depending on initial and medial, as in cau [kou], ceau [kaːu], ciau [kiːu], though both ceu ~ cieu are [kɐu], following the general pattern of ⟨e⟩ before a coda (cf.

Cantonese does not have medials, apart from gw, kw, though sometimes it is the nuclear vowel which drops: giung [kʰoŋ], xiong [hoŋ], but giuan [kʰyːn].

The discrepancies are due to an effort to keep frequent syllables short: en-in-un-iun rather than *en-ien-uen-iuen, for example; as well as a reflection of some of the more widespread phonological changes in the rimes.

A zero consonant is treated as voiceless (it is sometimes reconstructed as a glottal stop), so i, iem, uon, iuan are ping yin (Mandarin yī, yān, wān, yuān), whereas yi, yem, won, yuan are ping yang (Mandarin yí, yán, wán, yuán).
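The spelling rule above can be sketched as a small classifier. This is only an illustration: the function name `ping_register` is hypothetical, it looks at nothing but the first letter, and it assumes the input is an even-tone syllable with either a zero onset or a ⟨y⟩/⟨w⟩ onset, as in the examples.

```python
def ping_register(gc_syllable: str) -> str:
    """Classify an even-tone GC syllable without a true consonant onset.

    Hypothetical helper: it does not check the tone spelling, so the
    caller must pass an even-tone (ping) syllable.
    """
    if not gc_syllable:
        raise ValueError("empty syllable")
    first = gc_syllable[0]
    if first in "yw":
        # Spelled with a glide letter: counts as a voiced onset.
        return "yang"
    if first in "aeiou":
        # Zero onset: treated as voiceless.
        return "yin"
    raise ValueError(f"not a zero-onset or glide-onset syllable: {gc_syllable!r}")


for syllable in ["i", "iem", "uon", "iuan", "yi", "yem", "won", "yuan"]:
    print(syllable, ping_register(syllable))
```

The loop simply runs the eight example syllables from the paragraph through the classifier.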

In Beijing Mandarin, for example, even tone is split according to voicing, with muddy consonants becoming aspirates: ba, pa, ma, bha → bā, pā, má, pá (and mha → mā).
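The even-tone split can be written out as a lookup table. The sketch below covers only the five onsets and the single rime ⟨a⟩ that appear in the examples; the names `EVEN_TONE_ONSETS` and `beijing_even_tone` are made up for illustration, and nothing here is claimed for other onsets or rimes.

```python
# GC onset -> (pinyin onset, Mandarin tone number), from the examples only.
EVEN_TONE_ONSETS = {
    "b": ("b", 1),   # voiceless tenuis: first tone
    "p": ("p", 1),   # voiceless aspirate: first tone
    "m": ("m", 2),   # voiced sonorant: second tone
    "bh": ("p", 2),  # muddy stop: becomes an aspirate, second tone
    "mh": ("m", 1),  # the "mha" case: first tone
}

# Pinyin "a" with tone marks: a-macron (tone 1) and a-acute (tone 2).
TONED_A = {1: "\u0101", 2: "\u00e1"}


def beijing_even_tone(gc_syllable: str) -> str:
    """Map an even-tone GC syllable in -a to its Beijing pinyin form."""
    if not gc_syllable.endswith("a"):
        raise ValueError("only the rime 'a' is covered by this sketch")
    onset = gc_syllable[:-1]
    if onset not in EVEN_TONE_ONSETS:
        raise ValueError(f"onset not covered: {onset!r}")
    pinyin_onset, tone = EVEN_TONE_ONSETS[onset]
    return pinyin_onset + TONED_A[tone]


for syllable in ["ba", "pa", "ma", "bha", "mha"]:
    print(syllable, "->", beijing_even_tone(syllable))
```

A table is used rather than a voicing feature system because the passage gives only these five correspondences.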

That is, bhaa and bhah are homonyms in Beijing, as indeed they are in all of Mandarin, in Wu apart from Wenzhounese, in Hakka, and in reading pronunciations of Cantonese.

[16] However, the realization of entering tones in Beijing dialect, and thus in Standard Chinese, is not predictable when a syllable has a voiceless initial such as bat or pat.

In such cases even syllables with the same GC spelling may have different tones in Beijing, though they remain homonyms in other Mandarin dialects, such as Xi'an and Sichuanese.[17]

This is due to historical dialect-mixing in the Chinese capital that resulted in unavoidably idiosyncratic correspondences.

Muddy onsets become aspirates in even and rising tones, but tenuis in departing and entering tones: ba, pa, ma, bha → bā, pā, māh, pāh; baa, paa, maa, bhaa → bá, pá, máh, páh; bah, pah, mah, bhah → ba, pa, mah, bah; bat, pat, mat, bhat → baat, paat, maht, baht.
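The sixteen correspondences above follow from two choices: the register, set by the GC onset, and the tone category, set by the GC tone spelling. The sketch below reproduces exactly those examples and nothing more. The names `dialect_form`, `HIGH` and `LOW` are hypothetical, and the function name is left neutral because this passage does not name the target dialect. The tone categories are read off the example spellings as an assumption: bare ⟨a⟩ for even, ⟨aa⟩ for rising, ⟨ah⟩ for departing, ⟨at⟩ for entering.

```python
# GC rime spelling -> tone category (assumption read off the examples).
GC_RIMES = {"a": "even", "aa": "rising", "ah": "departing", "at": "entering"}

# Output rimes, copied from the examples; \u0101 is a-macron, \u00e1 is a-acute.
HIGH = {"even": "\u0101", "rising": "\u00e1", "departing": "a", "entering": "aat"}
LOW = {"even": "\u0101h", "rising": "\u00e1h", "departing": "ah", "entering": "aht"}

# Longest onset first, so that "bh" is tried before "b".
GC_ONSETS = ("bh", "b", "p", "m")


def dialect_form(gc_syllable: str) -> str:
    """Convert one of the example GC syllables to its dialect spelling."""
    for onset in GC_ONSETS:
        rime = gc_syllable[len(onset):]
        if gc_syllable.startswith(onset) and rime in GC_RIMES:
            tone = GC_RIMES[rime]
            if onset == "bh":
                # Muddy stop: aspirate in even and rising tones, tenuis otherwise.
                out_onset = "p" if tone in ("even", "rising") else "b"
                return out_onset + LOW[tone]
            if onset == "m":
                # Voiced sonorant: low register, onset unchanged.
                return "m" + LOW[tone]
            # Voiceless b and p: high register, onset unchanged.
            return onset + HIGH[tone]
    raise ValueError(f"syllable not covered by this sketch: {gc_syllable!r}")


for syllable in ["bha", "bhaa", "bhah", "bhat"]:
    print(syllable, "->", dialect_form(syllable))
```

Only the muddy onset needs a tone-dependent branch; the other three onsets keep their consonant and differ only in register.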

The character text is the same in GC and standard Chinese, apart from 裏 lǐ, which in any case has now been replaced on the Mainland by Chao's choice of 里.