Code page 942 (abbreviated as CP942 or IBM-942) is one of IBM's extensions of Shift JIS. Its coded character sets are JIS X 0201, JIS X 0208, IBM extensions for 1880 user-defined characters (UDC), and other IBM extensions. It is the combination of the single-byte Code page 1041 and the double-byte Code page 301.[1]

It is a superset of IBM-932, differing in its use of Code page 1041 in place of Code page 897 for its single-byte codes. Code page 1041 is an extension of Code page 897 and adds five single-byte characters.[2] 0x80 is mapped to the cent sign (¢), 0xA0 to the pound sign (£), 0xFD to the not sign (¬), 0xFE to the backslash (\) and 0xFF to the tilde (~).[3] These are all unassigned in Code page 897, and therefore in IBM-932.[4]
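The five additions can be written down as a small lookup table. The following is a minimal sketch in Python covering only the five bytes named above, not the full code page; the names CP1041_EXTRAS, is_cp1041_only and decode_extra are hypothetical and chosen for illustration.

```python
# Bytes that Code page 1041 assigns but Code page 897 leaves unassigned,
# as listed in the paragraph above. All names here are illustrative.
CP1041_EXTRAS = {
    0x80: "\u00A2",  # cent sign
    0xA0: "\u00A3",  # pound sign
    0xFD: "\u00AC",  # not sign
    0xFE: "\u005C",  # backslash
    0xFF: "\u007E",  # tilde
}


def is_cp1041_only(byte: int) -> bool:
    """Return True if the byte is one of the five Code page 1041 additions."""
    return byte in CP1041_EXTRAS


def decode_extra(byte: int) -> str:
    """Decode one of the five extension bytes; raise KeyError for any other byte."""
    return CP1041_EXTRAS[byte]
```

A byte such as 0x41 is assigned in both code pages, so is_cp1041_only(0x41) is false, while is_cp1041_only(0xFE) is true.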

Code page 942 contains the standard 7-bit ISO 646 codes; Japanese characters are indicated by the high bit of the first byte being set. Some of these code points require a second byte, so each character is encoded in either 8 or 16 bits.
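The mixed 8-bit and 16-bit structure can be illustrated by splitting a byte string into code units. The sketch below assumes the lead-byte ranges of standard Shift JIS (0x81 to 0x9F and 0xE0 to 0xFC), which the text above does not state explicitly; under that assumption the remaining high-bit bytes (0x80, 0xA0 to 0xDF, 0xFD to 0xFF) are single-byte characters. The function names are hypothetical.

```python
def is_lead_byte(byte: int) -> bool:
    # Assumed lead-byte ranges, taken from the standard Shift JIS layout.
    return 0x81 <= byte <= 0x9F or 0xE0 <= byte <= 0xFC


def split_units(data: bytes) -> list:
    """Split an encoded byte string into 1-byte and 2-byte code units."""
    units = []
    i = 0
    while i < len(data):
        if is_lead_byte(data[i]):
            # A lead byte must be followed by a second byte.
            if i + 1 >= len(data):
                raise ValueError(f"truncated double-byte character at offset {i}")
            units.append(data[i:i + 2])
            i += 2
        else:
            units.append(data[i:i + 1])
            i += 1
    return units
```

Note that the second byte of a double-byte character may fall in the 7-bit range, so a decoder has to track lead bytes rather than test each byte's high bit in isolation.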

Code page 1041, and therefore Code page 942, uses 0x5C for the Yen sign (¥) and 0x7E for the overline (‾),[3] matching the lower half of JIS X 0201 rather than US-ASCII. However, the version of Code page 942 used in International Components for Unicode (called "ibm-942_P12A-1999" or "x-IBM942C") uses US-ASCII mappings for single-byte characters between 0x20 and 0x7E. This results in duplicate mappings for the tilde (0x7E and 0xFF) and the backslash (0x5C and 0xFE).[5]
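The duplication can be made concrete by modelling only the four bytes involved. The sketch below is illustrative: the table names IBM_CP1041 and ICU_IBM942C and the helper bytes_for are hypothetical, and the tables contain just the bytes discussed above, using the usual Unicode code points for the Yen sign (U+00A5) and overline (U+203E).

```python
# Only the bytes whose mapping is discussed above are modelled here.
IBM_CP1041 = {0x5C: "\u00A5", 0x7E: "\u203E", 0xFE: "\u005C", 0xFF: "\u007E"}
ICU_IBM942C = {0x5C: "\u005C", 0x7E: "\u007E", 0xFE: "\u005C", 0xFF: "\u007E"}


def bytes_for(char: str, table: dict) -> list:
    """Return, in ascending order, every byte in the table that decodes to char."""
    return sorted(b for b, c in table.items() if c == char)
```

With these tables, the backslash is reachable from one byte in the IBM mapping but from two in the ICU variant, so an encoder for the ICU variant has to choose one of the two bytes when converting from Unicode.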

Layout

First byte
   0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
0
1
2  SP !  "  #  $  %  &  '  (  )  *  +  ,  -  .  /
3  0  1  2  3  4  5  6  7  8  9  :  ;  <  =  >  ?
4  @  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O
5  P  Q  R  S  T  U  V  W  X  Y  Z  [  ¥  ]  ^  _
6  `  a  b  c  d  e  f  g  h  i  j  k  l  m  n  o
7  p  q  r  s  t  u  v  w  x  y  z  {  |  }  ‾
8  ¢
9
A  £
B                                               ソ
C
D
E
F                                         ¬  \  ~
Second byte
16 × 16 grid, with rows indexed by the high nibble and columns by the low nibble (0 to F); no characters are shown in the cells.
 
Non printable ASCII character
Unaltered ASCII character
Modified ASCII character
Single-byte half-width katakana
First byte of a double-byte character, used by JIS X 0208
Not used as first byte, unallocated space in JIS X 0208
First byte of a double-byte IBM extension character
First byte of a double-byte IBM-designated user defined character
IBM single byte extensions
Second byte of a double-byte character whose first half of the JIS sequence was odd
Second byte of a double-byte character whose first half of the JIS sequence was even
Unused as second byte of a double-byte character
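
The odd and even distinction in the second-byte legend reflects the arithmetic that folds two JIS X 0208 rows into one lead byte. The sketch below uses the standard Shift JIS transformation, assumed here to apply to the JIS X 0208 range of this code page; it does not cover the IBM extension or user-defined areas, and the function name is hypothetical.

```python
def jis_to_shift_jis(j1: int, j2: int) -> bytes:
    """Convert a JIS X 0208 byte pair (each 0x21..0x7E) to its Shift JIS byte pair."""
    if not (0x21 <= j1 <= 0x7E and 0x21 <= j2 <= 0x7E):
        raise ValueError("JIS bytes must be in the range 0x21..0x7E")
    # Two JIS rows share one lead byte; rows above 0x5E move to the 0xE0 block.
    s1 = ((j1 + 1) >> 1) + (0x70 if j1 <= 0x5E else 0xB0)
    if j1 % 2 == 1:
        # Odd first byte: second byte lands in 0x40..0x9E, skipping 0x7F.
        s2 = j2 + (0x1F if j2 <= 0x5F else 0x20)
    else:
        # Even first byte: second byte lands in 0x9F..0xFC.
        s2 = j2 + 0x7E
    return bytes([s1, s2])
```

For example, the JIS sequence 0x25 0x3D (odd first byte) becomes 0x83 0x5C, while 0x24 0x22 (even first byte) becomes 0x82 0xA0.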

References

  1. "Coded character set identifiers - CCSID 942". IBM Globalization. IBM. Archived from the original on 2016-03-15.
  2. "Code page identifiers - CP 01041". IBM Globalization. Archived from the original on 2016-06-01.
  3. "CP01041.txt". IBM. Archived from the original on 2019-01-12.
  4. "CP00897.txt". IBM. Archived from the original on 2019-01-12. Retrieved 2017-11-08.
  5. "Converter Explorer: ibm-942_P12A-1999". ICU Demonstration. International Components for Unicode.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.