Shift JIS (also SJIS, MIME name Shift_JIS, known as PCK in Solaris contexts)[2][3] is a character encoding for the Japanese language, originally developed by the Japanese company ASCII Corporation[b] in conjunction with Microsoft and standardized as JIS X 0208 Appendix 1.
As of January 2025, less than 0.05% of surveyed web pages used Shift JIS (in practice decoded as its superset, Windows-31J), down from 1.3% in July 2014.
[4] Shift JIS is the third-most declared character encoding for Japanese websites, declared by 1.0% of sites in the .jp domain, while UTF-8 is used by 99% of Japanese websites. Because pages declared as Shift JIS are in effect decoded as its superset Windows-31J, it is that superset which is the third-most popular encoding in practice.
[5][6] Shift JIS is also sometimes used in QR codes, which are a Japanese invention; QR codes also allow UTF-8, which may be the preferred choice.
The competing 8-bit format EUC-JP, which does not support single-byte halfwidth katakana, allows for a cleaner and more direct conversion to and from JIS X 0208 code points, as all high-bit-set bytes are parts of a double-byte character and all codes from ASCII range represent single-byte characters.
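The difference in directness can be illustrated with a minimal sketch that converts a JIS X 0208 row and cell number into both encodings. The function names are hypothetical, and the arithmetic is the commonly documented one for plain JIS X 0208 rows 1 to 94, not any of the vendor extensions.

```python
def kuten_to_euc_jp(row: int, cell: int) -> bytes:
    """EUC-JP: add 0xA0 to the row and to the cell, nothing else."""
    return bytes([row + 0xA0, cell + 0xA0])


def kuten_to_shift_jis(row: int, cell: int) -> bytes:
    """Shift JIS: two rows share one lead byte, and lead bytes skip 0xA0-0xDF."""
    lead = (row + 1) // 2 + (0x80 if row <= 62 else 0xC0)
    if row % 2:
        # Odd row: trail bytes 0x40-0x9E, skipping 0x7F.
        trail = cell + (0x3F if cell <= 63 else 0x40)
    else:
        # Even row: trail bytes 0x9F-0xFC.
        trail = cell + 0x9E
    return bytes([lead, trail])


# Row 4, cell 2 is HIRAGANA LETTER A.
print(kuten_to_euc_jp(4, 2).hex())     # a4a2
print(kuten_to_shift_jis(4, 2).hex())  # 82a0
```

The EUC-JP conversion is a single addition per byte, whereas the Shift JIS conversion needs to branch on the row parity and on two range boundaries.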
Shift JIS can be used in string literals in programming languages such as C, but a few things must be taken into consideration.
Firstly, the escape character 0x5C, normally a backslash, is the half-width yen sign (¥) in Shift JIS.
If the programmer is aware of this, it would be possible to use printf("ハローワールド¥n"); (where ハローワールド is Hello, world and ¥n is an escape sequence), assuming the I/O system supports Shift JIS output.
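As a rough illustration of what a byte-oriented compiler would see, the sketch below walks a Shift JIS byte string and reports every 0x5C byte together with its role. The helper names are hypothetical, and the lead-byte ranges (0x81 to 0x9F and 0xE0 to 0xFC) are an assumption covering the usual double-byte region. Because trail bytes may fall in the range 0x40 to 0x7E, 0x5C can also occur as the second byte of a double-byte character, which the scan distinguishes from a standalone 0x5C.

```python
def is_lead_byte(b: int) -> bool:
    """True if b starts a double-byte Shift JIS character (assumed ranges)."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def find_0x5c(data: bytes) -> list[tuple[int, str]]:
    """List each 0x5C byte as (offset, role), role being 'single' or 'trail'."""
    found = []
    i = 0
    while i < len(data):
        if is_lead_byte(data[i]) and i + 1 < len(data):
            if data[i + 1] == 0x5C:
                found.append((i + 1, "trail"))
            i += 2  # skip the whole double-byte character
        else:
            if data[i] == 0x5C:
                found.append((i, "single"))
            i += 1
    return found


# Body of the string literal from the example: the text, then 0x5C and "n".
literal = "ハローワールド".encode("shift_jis") + b"\x5cn"
print(find_0x5c(literal))                   # [(14, 'single')]
print(find_0x5c("ソ".encode("shift_jis")))  # [(1, 'trail')]
```

In the example string the only 0x5C byte is the standalone one that introduces the escape sequence, so the literal is safe in that respect.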
[12] However, most localised fonts on Windows display U+005C as a Yen sign for JIS X 0201 compatibility.
[15] Windows codepage 932 is the version used in the W3C/WHATWG encoding standard used by HTML5, which includes the "formerly proprietary extensions from IBM and NEC" from Windows-31J in its table for JIS X 0208,[16] and also treats the label "shift_jis" interchangeably with "windows-31j" with the intent of being "compatible with deployed content".
MacJapanese also extended JIS X 0201 by assigning the backslash to 0x80 (corresponding to 0x5C in US-ASCII), the non-breaking space to 0xA0, the copyright sign to 0xFD, the trademark symbol to 0xFE and the half-width horizontal ellipsis to 0xFF.
It also added extended double-byte characters, including 53 vertical presentation forms in the Shift_JIS range 0xEB41–0xED96, at 84 JIS rows down from their canonical forms, and 260 special characters in the Shift_JIS range 0x8540–0x886D.
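The "rows down" relationship can be checked with a small sketch that converts a Shift JIS code to its row and cell, moves it down by a number of rows, and converts it back. The function names are hypothetical and the arithmetic assumes the usual Shift JIS layout for rows 1 to 94. The sketch only reproduces the byte arithmetic and makes no claim about which glyphs a given font places at the results.

```python
def sjis_to_kuten(code: bytes) -> tuple[int, int]:
    """Convert a double-byte Shift JIS code to (row, cell)."""
    lead, trail = code
    pair = lead - (0x81 if lead <= 0x9F else 0xC1)  # 0-based index of the row pair
    if trail >= 0x9F:                               # even row of the pair
        return pair * 2 + 2, trail - 0x9E
    return pair * 2 + 1, trail - (0x3F if trail <= 0x7E else 0x40)


def kuten_to_sjis(row: int, cell: int) -> bytes:
    """Convert (row, cell) back to a double-byte Shift JIS code."""
    lead = (row + 1) // 2 + (0x80 if row <= 62 else 0xC0)
    if row % 2:
        trail = cell + (0x3F if cell <= 63 else 0x40)
    else:
        trail = cell + 0x9E
    return bytes([lead, trail])


def rows_down(code: bytes, rows: int = 84) -> bytes:
    """Move a code down by the given number of JIS rows, keeping the cell."""
    row, cell = sjis_to_kuten(code)
    return kuten_to_sjis(row + rows, cell)


print(rows_down(b"\x81\x41").hex())  # eb41
print(rows_down(b"\x83\x96").hex())  # ed96
```

Moving 0x8141 (row 1) and 0x8396 (row 5) down by 84 rows lands exactly on the two endpoints of the stated range, 0xEB41 and 0xED96.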
Sai Mincho and Chu Gothic use a "PostScript" variant of MacJapanese, which included additional vertical presentation forms and a different set of extended special characters, based on the NEC special characters, some of which were only available in the printer versions of the fonts.
[18] Older versions of Maru Gothic and Hon Mincho from System 7.1 encoded vertical presentation forms at 10 (not 84) JIS rows down from their canonical forms, and did not include the special character extensions; this was subsequently changed.
[22] In order to represent the allocated rows on both planes of JIS X 0213, Shift_JIS-2004 uses the following method of mapping codepoints.
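The mapping itself is not reproduced in this text. As a hedged sketch only, the arithmetic commonly given for Shift_JIS-2004 can be written as follows, where the function name, the row set for plane 2 and the constants are assumptions supplied for illustration rather than taken from the passage above.

```python
# Plane 2 rows assumed to share the lead bytes 0xF0-0xF4.
PLANE2_LOW_ROWS = {1, 3, 4, 5, 8, 12, 13, 14, 15}


def shift_jis_2004(plane: int, row: int, cell: int) -> bytes:
    """Map a JIS X 0213 (plane, row, cell) triple to two bytes (assumed formulas)."""
    if plane == 1:
        lead = (row + 257) // 2 if row <= 62 else (row + 385) // 2
    elif plane == 2 and row in PLANE2_LOW_ROWS:
        lead = (row + 479) // 2 - (row // 8) * 3
    elif plane == 2 and 78 <= row <= 94:
        lead = (row + 411) // 2
    else:
        raise ValueError("row is not allocated on this plane")
    if row % 2:
        trail = cell + (63 if cell <= 63 else 64)  # odd row: 0x40-0x9E, no 0x7F
    else:
        trail = cell + 158                         # even row: 0x9F-0xFC
    return bytes([lead, trail])


print(shift_jis_2004(1, 89, 1).hex())  # ed40
print(shift_jis_2004(2, 1, 1).hex())   # f040
print(shift_jis_2004(2, 78, 1).hex())  # f49f
```

Under these formulas plane 1 keeps the JIS X 0208 layout, while the allocated plane 2 rows are packed into the lead bytes 0xF0 to 0xFC.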
Some of the additions collide with popular Shift JIS extensions, including Windows codepage 932 which is used in web standards (see above).
For example, compare plane 1 row 89 in JIS X 0213 (beginning 硃, 硎, 硏...)[24] to row 89 in the JIS X 0208 variant defined in web standards (beginning 纊, 褜, 鍈...).
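The collision can be observed with Python's standard codecs, shown here as an illustration. The sketch assumes that the first cell of row 89 corresponds to the byte pair 0xED 0x40 in both encodings, and uses `cp932` as Python's name for Windows code page 932.

```python
def decode_both(code: bytes) -> dict[str, str]:
    """Decode the same bytes as Shift_JIS-2004 and as Windows code page 932."""
    return {codec: code.decode(codec) for codec in ("shift_jis_2004", "cp932")}


# Assumed byte pair for row 89, cell 1.
for codec, text in decode_both(b"\xed\x40").items():
    print(codec, text)
```

The same two bytes yield the first character of each of the two row listings quoted above, so a document must be decoded with the variant it was actually written in.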
The space with lead bytes 0xF5 to 0xF9 (beyond the region used for JIS X 0208) is used by Japanese mobile phone operators for pictographs for use in E-mail.
[27] Beyond even this, there have been numerous minor variations made on Shift JIS, with individual characters here and there altered.
One such variant is the one that must be used when encoding Shift JIS in source-code strings of C and similar programming languages.