[3] Modern Chinese characters have a variety of novel aspects, including orthography, phonology, and semantics, as well as collation and organization, statistical analysis, computer processing, and pedagogy.
[7] His paper was followed by Gao Jiaying's "A Brief Discussion on the Establishment of Modern Chinese Character Studies", [8] and other related writings on the subject.
Both lists were released by the Ministry of Education, with a total of 11,149 characters of the Traditional Chinese writing system.
In Japan, there are the jōyō kanji, a list of 2,136 frequently used characters designated by the Japanese Ministry of Education, as well as 983 jinmeiyō kanji for use in personal names.
[29] The first person to conduct a statistical study of the frequency of Chinese characters was Chen Heqin (陳鶴琴).
[31] The 10 most frequently used characters in their corpus are, in descending order of frequency, 的 ('of'), 不 ('no', 'not'), 一 ('one', 'an'), 了 (PERF), 是 (the copula), 我 ('I', 'me'), 上 ('on', 'up'), 他 ('he', 'him'), 有 ('to have'), 人 ('person').
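A frequency list of this kind can be approximated by counting characters in a corpus. The following is a minimal sketch, not the method used in the cited study: the function name `char_frequencies` and the sample sentence are made up for illustration, and only the basic CJK Unified Ideographs block (U+4E00 to U+9FFF) is counted.

```python
from collections import Counter

def char_frequencies(text):
    """Count characters in the basic CJK Unified Ideographs block only."""
    return Counter(ch for ch in text if "\u4e00" <= ch <= "\u9fff")

# Made-up sample sentence; a real study would use a large corpus.
sample = "我的书是他的，不是我的。"

freq = char_frequencies(sample)
for ch, count in freq.most_common(3):
    print(ch, count)
```

Punctuation is skipped by the range check, so only the characters themselves contribute to the ranking.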
The frequency data came from a large corpus with a number of sub-corpora representing the Chinese language in three regions (Hong Kong, mainland China and Taiwan) and in two time periods (the 1960s and the 1980s–90s).
From the data of these frequency lists, some important and interesting features of Chinese can be discovered. Large-scale surveys by the Ministry of Education and the State Language Commission of the PRC over the years have shown that the use of Chinese characters and words has a strong distribution pattern.
For example, the Simplified Chinese version of Microsoft Word allows setting font sizes by either points or numbers.
For example, the English word 'ton' is transliterated as 吨; 噸, and some old dictionaries give two coexisting pronunciations, dūn and dùn, both with the meaning 'ton'.
[54] In Taiwan, there is a similar official standard for Mandarin words with variant sounds, where pronunciations are expressed in bopomofo instead of pinyin.
[58] Zhou Youguang introduced two ways homophones have been historically reduced:[59][60] There are two systems for phonetic notation of Chinese characters.
The Jyutping system for Cantonese uses numbers, e.g. 香港; hoeng1 gong2. Kun'yomi are readings of kanji using native Japanese words mapped to the meanings of borrowed Chinese characters.
[66] For example, the original meaning of 兵; bīng is 'weapon': the character depicts a 斤 (jīn; 'cutting knife') being held with both hands (八).
[67] For example, the original meaning of 其; qí is 'dustpan': its use as a third-person pronoun is due to a phonetic loan.
Knowledge of synonymous characters helps students write Chinese more correctly and express meanings more accurately.
In June 2013, the List of Commonly Used Standard Chinese Characters was released by the State Council of China.
The other is to follow the original form and meaning, based on the character creation method and etymology, especially the Shuowen Jiezi.
The principles for choosing replacement characters are: [94] [95] From March 1955 to August 1964, 35 place names of county level or above were changed with the approval of the State Council.
[99] In ancient times, research on Chinese character teaching focused on the preparation of various centralized literacy textbooks and dictionaries.
Among them, the ones with greater impact include: [100] The previous three books then developed into a set of teaching materials, collectively called "Three Hundred Thousand" (三百千, about 2,000 different characters), which were used for over 1,000 years until the end of the Qing Dynasty, and still have a certain influence today.
Another influential literacy textbook is "Wenzi Mengqiu" (文字蒙求), compiled for children by the Qing Dynasty writer Wang Jun (1784–1854), which contains 2,049 characters.
[108] Teaching Chinese characters as a foreign language has received increasing attention, and many textbooks and elective courses in this area have appeared.
The input code of a Chinese character is its pinyin letter string followed by an optional number representing the tone.
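The structure of such an input code (letters, then an optional tone digit) can be illustrated with a short parser. This is only a sketch: the function name `parse_input_code` is hypothetical, and the use of digits 1 to 5 (with 5 for the neutral tone) is an assumption, since conventions differ between input methods.

```python
import re

# Assumed shape: pinyin letters followed by an optional tone digit 1-5.
_CODE = re.compile(r"^([a-zü]+)([1-5])?$")

def parse_input_code(code):
    """Split an input code into (letters, tone); tone is None when omitted."""
    match = _CODE.match(code.lower())
    if match is None:
        raise ValueError(f"not a pinyin input code: {code!r}")
    letters, tone = match.groups()
    return letters, (int(tone) if tone else None)

print(parse_input_code("xiang1"))  # ('xiang', 1)
print(parse_input_code("gang"))    # ('gang', None)
```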
Popular form-based encoding methods include Wubi (五笔) in the Mainland and Cangjie (倉頡) in Taiwan and Hong Kong.
[112] The most important feature of intelligent input is the application of contextual constraints for candidate character selection.
Though the non-toned Pinyin letters of 大学 and 大雪 are both "daxue", the computer can make a reasonable selection based on the subsequent words.
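The idea of choosing between homophonous candidates by context can be sketched with a toy lookup table. Everything here is illustrative: the co-occurrence counts are made up, the function name `pick_candidate` is hypothetical, and real input methods rely on far larger statistical language models.

```python
# Candidates sharing the same toneless pinyin string.
CANDIDATES = {"daxue": ["大学", "大雪"]}

# Made-up counts of how often a candidate is followed by a given word.
COOCCURRENCE = {
    ("大学", "毕业"): 50,  # "graduate from university"
    ("大雪", "纷飞"): 30,  # "heavy snow swirling"
}

def pick_candidate(pinyin, next_word):
    """Return the candidate that most often precedes next_word."""
    candidates = CANDIDATES[pinyin]
    # max() keeps the first candidate on ties, so unseen contexts
    # fall back to the first entry in the candidate list.
    return max(candidates, key=lambda w: COOCCURRENCE.get((w, next_word), 0))

print(pick_candidate("daxue", "毕业"))  # 大学
print(pick_candidate("daxue", "纷飞"))  # 大雪
```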
It includes 6,763 Chinese characters, with 3,755 frequently-used ones sorted by Pinyin, and the rest by radicals (indexing components).
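The two-level layout can be explored with Python's built-in `gb2312` codec. This sketch assumes the standard described here is GB 2312 (the character counts match) and assumes the usual EUC-CN byte layout, in which the frequently used level occupies lead bytes 0xB0 to 0xD7 and the remaining characters occupy 0xD8 to 0xF7. The function name `gb2312_level` is hypothetical.

```python
def gb2312_level(ch):
    """Return 1 or 2 depending on which level of the set ch belongs to."""
    raw = ch.encode("gb2312")  # raises UnicodeEncodeError if ch is not in the set
    if len(raw) != 2:
        raise ValueError(f"{ch!r} is not a two-byte character")
    lead = raw[0]
    if 0xB0 <= lead <= 0xD7:
        return 1  # frequently used characters, ordered by pinyin
    if 0xD8 <= lead <= 0xF7:
        return 2  # remaining characters, ordered by radical
    raise ValueError(f"{ch!r} is not a Chinese character in this set")

# Level-1 characters sort by pinyin when ordered by their encoded bytes.
chars = ["中", "大", "人"]  # zhong, da, ren
print(sorted(chars, key=lambda c: c.encode("gb2312")))  # ['大', '人', '中']
print(gb2312_level("啊"))  # 1
```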
Each character is encoded with a two-byte hexadecimal code, for example, 香 (ADBB), 港 (B4E4), 龍 (C073).
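Two-byte codes like these can be inspected with Python's built-in codecs. The sketch below assumes the codes quoted above are Big5 code points, which this sentence does not state explicitly. The helper name `big5_hex` is hypothetical, and the script reports whether the codec's result agrees with each quoted code rather than asserting it.

```python
# Codes quoted in the text, assumed here to be Big5 code points.
QUOTED = {"香": "ADBB", "港": "B4E4", "龍": "C073"}

def big5_hex(ch):
    """Return the Big5 encoding of one character as uppercase hex."""
    return ch.encode("big5").hex().upper()

for ch, quoted in QUOTED.items():
    code = big5_hex(ch)
    verdict = "matches" if code == quoted else "differs from"
    print(f"{ch}: {code} ({verdict} the quoted code)")
```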
The Basic Multilingual Plane (BMP) is a 2-byte kernel version of Unicode with 2^16 = 65,536 code points for important characters of many languages.
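Whether a character falls inside the BMP can be checked from its code point. The following sketch uses a hypothetical helper `in_bmp` and takes UTF-16 as the 2-byte encoding for comparison: BMP characters occupy one 16-bit unit, while characters outside it need a surrogate pair.

```python
BMP_SIZE = 2 ** 16  # 65,536 code points, U+0000 to U+FFFF

def in_bmp(ch):
    """True if the character's code point lies in the Basic Multilingual Plane."""
    return ord(ch) < BMP_SIZE

# U+20000 is the first code point of CJK Unified Ideographs Extension B.
for ch in ["香", "龍", "\U00020000"]:
    utf16_bytes = len(ch.encode("utf-16-be"))
    print(f"U+{ord(ch):04X} in BMP: {in_bmp(ch)}, UTF-16 bytes: {utf16_bytes}")
```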