Code page 866

Language(s): Russian, Bulgarian; partial support: Ukrainian,[a] Belarusian[b]
Standard: WHATWG Encoding Standard
Classification: OEM code page, extended ASCII
Extends: US-ASCII
Based on: Alternative code page
Other related encoding(s): see below

Code page 866 (CP 866, "DOS Cyrillic Russian")[1] is a code page used under DOS and OS/2[2] to write Cyrillic script.[3] It is based on the "alternative code page" (Russian: Альтернативная кодировка) developed in 1986 by a research group at the Academy of Sciences of the USSR.[4] The code page was widely used during the DOS era because it preserves the pseudographic symbols of code page 437 (unlike either the "Main code page" or Windows-1251) and keeps the Cyrillic letters in alphabetical order, although not contiguously (unlike KOI8-R). Initially, the encoding was available only in the Russian version of MS-DOS 4.01 (1990); from MS-DOS 6.22 onward it was available in every language version.
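The two properties named above (shared pseudographics and alphabetical byte order) can be checked against a codec implementation. The following is a minimal sketch, assuming that Python's built-in `cp866` and `cp437` codecs correspond to the code pages discussed here; the function names are illustrative and not part of any standard.

```python
# Sketch: check two properties of code page 866 with Python's built-in codecs.
# Assumption: the standard-library "cp866" and "cp437" codecs match the
# code pages discussed in this article.

RUSSIAN_LOWER = "абвгдежзийклмнопрстуфхцчшщъыьэюя"  # alphabet without ё


def shares_pseudographics() -> bool:
    """True if bytes 0xB0-0xDF decode identically in CP866 and CP437."""
    block = bytes(range(0xB0, 0xE0))
    return block.decode("cp866") == block.decode("cp437")


def is_alphabetical_by_byte() -> bool:
    """True if byte values increase strictly along the alphabet."""
    encoded = RUSSIAN_LOWER.encode("cp866")
    return all(a < b for a, b in zip(encoded, encoded[1:]))


def is_contiguous() -> bool:
    """True only if the letters occupy one unbroken run of byte values."""
    encoded = RUSSIAN_LOWER.encode("cp866")
    return all(b - a == 1 for a, b in zip(encoded, encoded[1:]))


if __name__ == "__main__":
    print(shares_pseudographics(), is_alphabetical_by_byte(), is_contiguous())
```

Sorting CP866 byte strings by raw byte value therefore orders Russian words alphabetically (apart from Ё/ё, which sit in the last row), even though the lowercase letters are split across two separate ranges.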

Two very similar, but not identical, encodings are registered in GOST R 34.303-92[5] as KOI-8 N1 and KOI-8 N2 (not to be confused with the original KOI-8).

Character set

Each character is shown with its equivalent Unicode code point. Only the second half of the table (code points 128–255) is shown, the first half (code points 0–127) being the same as code page 437.

Code page 866[6][7][1][8]
Row (dec) _0      _1      _2      _3      _4      _5      _6      _7      _8      _9      _A      _B      _C      _D      _E      _F
8_ (128)  А 0410  Б 0411  В 0412  Г 0413  Д 0414  Е 0415  Ж 0416  З 0417  И 0418  Й 0419  К 041A  Л 041B  М 041C  Н 041D  О 041E  П 041F
9_ (144)  Р 0420  С 0421  Т 0422  У 0423  Ф 0424  Х 0425  Ц 0426  Ч 0427  Ш 0428  Щ 0429  Ъ 042A  Ы 042B  Ь 042C  Э 042D  Ю 042E  Я 042F
A_ (160)  а 0430  б 0431  в 0432  г 0433  д 0434  е 0435  ж 0436  з 0437  и 0438  й 0439  к 043A  л 043B  м 043C  н 043D  о 043E  п 043F
B_ (176)  ░ 2591  ▒ 2592  ▓ 2593  │ 2502  ┤ 2524  ╡ 2561  ╢ 2562  ╖ 2556  ╕ 2555  ╣ 2563  ║ 2551  ╗ 2557  ╝ 255D  ╜ 255C  ╛ 255B  ┐ 2510
C_ (192)  └ 2514  ┴ 2534  ┬ 252C  ├ 251C  ─ 2500  ┼ 253C  ╞ 255E  ╟ 255F  ╚ 255A  ╔ 2554  ╩ 2569  ╦ 2566  ╠ 2560  ═ 2550  ╬ 256C  ╧ 2567
D_ (208)  ╨ 2568  ╤ 2564  ╥ 2565  ╙ 2559  ╘ 2558  ╒ 2552  ╓ 2553  ╫ 256B  ╪ 256A  ┘ 2518  ┌ 250C  █ 2588  ▄ 2584  ▌ 258C  ▐ 2590  ▀ 2580
E_ (224)  р 0440  с 0441  т 0442  у 0443  ф 0444  х 0445  ц 0446  ч 0447  ш 0448  щ 0449  ъ 044A  ы 044B  ь 044C  э 044D  ю 044E  я 044F
F_ (240)  Ё 0401  ё 0451  Є 0404  є 0454  Ї 0407  ї 0457  Ў 040E  ў 045E  ° 00B0  ∙ 2219  · 00B7  √ 221A  № 2116  ¤ 00A4  ■ 25A0  NBSP 00A0
Note: codes F2hex through FBhex are changed from the Alternative code page.
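A table of this shape can be regenerated from a codec rather than typed by hand. The sketch below assumes that Python's built-in `cp866` codec agrees with the table in this section; `upper_half_rows` is a hypothetical helper name.

```python
# Sketch: rebuild the upper half (128-255) of a single-byte code page table.
# Assumption: Python's built-in "cp866" codec agrees with the table above.


def upper_half_rows(encoding: str = "cp866") -> list[str]:
    """Return one text line per 16-byte row, each cell as 'glyph CODEPOINT'."""
    rows = []
    for high in range(0x8, 0x10):
        cells = []
        for low in range(0x10):
            char = bytes([high * 16 + low]).decode(encoding)
            # Show the no-break space by name, since it is otherwise invisible.
            glyph = "NBSP" if char == "\u00a0" else char
            cells.append(f"{glyph} {ord(char):04X}")
        rows.append(f"{high:X}_ ({high * 16})  " + "  ".join(cells))
    return rows


if __name__ == "__main__":
    for line in upper_half_rows():
        print(line)
```

Passing a different codec name (for example `cp437`) yields the corresponding table for comparison.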

Variants

There existed a few variants of the code page, but the differences were mostly in the last 16 code points (240–255).

Alternative code page

The original version of the code page by Bryabrin et al. (1986)[4] is called the "Alternative code page" (Russian: Альтернативная кодировка), to distinguish it from the "Main code page" (Russian: Основная кодировка) by the same authors. It supports only Russian and Bulgarian. It is mostly the same as code page 866, except for codes F2hex through F7hex (which code page 866 changes to Ukrainian and Belarusian letters) and codes F8hex through FBhex (where code page 866 matches code page 437 instead). The differing row is shown below.

Alternative code page
Row (dec) _0      _1      _2      _3      _4      _5      _6      _7      _8      _9      _A      _B      _C      _D      _E      _F
F_ (240)  Ё 0401  ё 0451  ╭ 256D  ╮ 256E  ╯ 256F  ╰ 2570  → 2192  ← 2190  ↓ 2193  ↑ 2191  ÷ 00F7  ± 00B1  № 2116  ¤ 00A4  ■ 25A0  NBSP 00A0
Note: codes F2hex through FBhex were later changed by code page 866.

Modified code page 866

This unofficial variant makes code points 240–255 identical to code page 437, except that the letters Ё and ё are usually placed at 240 and 241.[9] It supports only Russian and Bulgarian. The differing row is shown below.

Modified CP 866
Row (dec) _0      _1      _2      _3      _4      _5      _6      _7      _8      _9      _A      _B      _C      _D      _E      _F
F_ (240)  Ё 0401  ё 0451  ≥ 2265  ≤ 2264  ⌠ 2320  ⌡ 2321  ÷ 00F7  ≈ 2248  ° 00B0  ∙ 2219  · 00B7  √ 221A  ⁿ 207F  ² 00B2  ■ 25A0  NBSP 00A0
Note: codes F2hex through F7hex, FChex, and FDhex differ from compliant code page 866 in order to match OEM-US.
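Because this variant is described as code page 866 with the last row of code page 437 and Ё/ё at 240 and 241, a decoding table for it can be assembled from existing codecs. This is a minimal sketch under those assumptions, using Python's built-in `cp866` and `cp437` codecs; `MODIFIED_TABLE` and `decode_modified` are hypothetical names, and no such codec ships with Python.

```python
import codecs

# Sketch: assemble a decoding table for the unofficial "modified" variant.
# Assumptions: Python's "cp866" and "cp437" codecs match the tables in this
# article, and Ё/ё occupy 0xF0/0xF1 as the text describes.


def build_modified_cp866_table() -> str:
    """Return a 256-character string: index = byte value, item = character."""
    base = bytes(range(0xF0)).decode("cp866")              # 0x00-0xEF
    last_row = bytes(range(0xF0, 0x100)).decode("cp437")   # 0xF0-0xFF
    return base + "Ёё" + last_row[2:]


MODIFIED_TABLE = build_modified_cp866_table()


def decode_modified(data: bytes) -> str:
    """Decode bytes using the assembled table."""
    text, _consumed = codecs.charmap_decode(data, "strict", MODIFIED_TABLE)
    return text


if __name__ == "__main__":
    print(decode_modified(bytes(range(0xF0, 0x100))))
```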

Lithuanian variants

The "KBL" code page, unofficially known as Code page 771,[10] mostly matches code page 866 and the Alternative code page, but replaces the last row and some block characters with letters from the Lithuanian alphabet not otherwise present in ASCII. The Russian Ё/ё is not supported, similarly to KOI-7.

"KBL" (Code page 771)[11]
Row (dec) _0      _1      _2      _3      _4      _5      _6      _7      _8      _9      _A      _B      _C      _D      _E      _F
D_ (208)  ╨ 2568  ╤ 2564  ╥ 2565  ╙ 2559  ╘ 2558  ╒ 2552  ╓ 2553  ╫ 256B  ╪ 256A  ┘ 2518  ┌ 250C  █ 2588  Ą 0104  ą 0105  Č 010C  č 010D
E_ (224)  р 0440  с 0441  т 0442  у 0443  ф 0444  х 0445  ц 0446  ч 0447  ш 0448  щ 0449  ъ 044A  ы 044B  ь 044C  э 044D  ю 044E  я 044F
F_ (240)  Ę 0118  ę 0119  Ė 0116  ė 0117  Į 012E  į 012F  Š 0160  š 0161  Ų 0172  ų 0173  Ū 016A  ū 016B  Ž 017D  ž 017E  ■ 25A0  NBSP 00A0
Note: the Lithuanian letters (DChex through DFhex and F0hex through FDhex) differ from code page 866 and the Alternative code page.

Lithuanian Standard LST 1284:1993, known as Code page 1119 or unofficially as Code page 772,[10] mostly matches the "modified" Code page 866, except for the addition of quotation marks in the last row and the replacement of the mixed single-double box drawing characters with Lithuanian letters (compare code page 850). It was later superseded by LST 1590-1 (Code page 775).[10] Unlike "KBL", the Russian Ё/ё is retained.

LST 1284:1993 (Code page 772 / 1119)[12]
Row (dec) _0      _1      _2      _3      _4      _5      _6      _7      _8      _9      _A      _B      _C      _D      _E      _F
B_ (176)  ░ 2591  ▒ 2592  ▓ 2593  │ 2502  ┤ 2524  Ą 0104  Č 010C  Ę 0118  Ė 0116  ╣ 2563  ║ 2551  ╗ 2557  ╝ 255D  Į 012E  Š 0160  ┐ 2510
C_ (192)  └ 2514  ┴ 2534  ┬ 252C  ├ 251C  ─ 2500  ┼ 253C  Ų 0172  Ū 016A  ╚ 255A  ╔ 2554  ╩ 2569  ╦ 2566  ╠ 2560  ═ 2550  ╬ 256C  Ž 017D
D_ (208)  ą 0105  č 010D  ę 0119  ė 0117  į 012F  š 0161  ų 0173  ū 016B  ž 017E  ┘ 2518  ┌ 250C  █ 2588  ▄ 2584  ▌ 258C  ▐ 2590  ▀ 2580
E_ (224)  р 0440  с 0441  т 0442  у 0443  ф 0444  х 0445  ц 0446  ч 0447  ш 0448  щ 0449  ъ 044A  ы 044B  ь 044C  э 044D  ю 044E  я 044F
F_ (240)  Ё 0401  ё 0451  ≥ 2265  ≤ 2264  „ 201E  “ 201C  ÷ 00F7  ≈ 2248  ° 00B0  ∙ 2219  · 00B7  √ 221A  ⁿ 207F  ² 00B2  ■ 25A0  NBSP 00A0
Note: the Lithuanian letters and the quotation marks at F4hex and F5hex differ from "modified" code page 866.

Ukrainian and Belarusian variants

Code page 1125 matches the original Alternative code page for all points except for F2hex through F9hex inclusive, which are replaced with Ukrainian letters.[13][14] Code page 1131 matches code page 866 for all points except for F8hex, F9hex, and FBhex through FEhex inclusive, which make room for otherwise-missing Ukrainian and Belarusian letters, in the process displacing the bullet character (∙) from F9hex to FEhex and the currency sign (¤) from FDhex to FBhex.[15][16] The differing rows are shown below.

IBM code page 1125 (Ukrainian standard RST 2018-91)[17]
Row (dec) _0      _1      _2      _3      _4      _5      _6      _7      _8      _9      _A      _B      _C      _D      _E      _F
F_ (240)  Ё 0401  ё 0451  Ґ 0490  ґ 0491  Є 0404  є 0454  І 0406  і 0456  Ї 0407  ї 0457  ÷ 00F7  ± 00B1  № 2116  ¤ 00A4  ■ 25A0  NBSP 00A0
Note: codes F2hex through F9hex differ from the Alternative code page.

IBM code page 1131 (Belarusian)
Row (dec) _0      _1      _2      _3      _4      _5      _6      _7      _8      _9      _A      _B      _C      _D      _E      _F
F_ (240)  Ё 0401  ё 0451  Є 0404  є 0454  Ї 0407  ї 0457  Ў 040E  ў 045E  І 0406  і 0456  · 00B7  ¤ 00A4  Ґ 0490  ґ 0491  ∙ 2219  NBSP 00A0
Note: codes F8hex, F9hex, and FBhex through FEhex differ from code page 866.
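The practical effect of these variants is which letters can be encoded at all. The sketch below compares two single-byte codecs byte by byte and tests whether a string is representable. It assumes Python's built-in `cp866` and `cp1125` codecs; the latter is Python's own table for code page 1125 and may not agree with the row above in every cell. The helper names are illustrative.

```python
# Sketch: compare two single-byte codecs and test whether text is encodable.
# Assumption: Python's built-in "cp866" and "cp1125" codecs; the "cp1125"
# table may not agree with this article's row in every cell.


def differing_bytes(enc_a: str, enc_b: str) -> list[int]:
    """Return the byte values that decode to different characters."""
    return [
        value
        for value in range(0x100)
        if bytes([value]).decode(enc_a, "replace")
        != bytes([value]).decode(enc_b, "replace")
    ]


def can_encode(text: str, encoding: str) -> bool:
    """True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    print([f"{v:02X}" for v in differing_bytes("cp866", "cp1125")])
    print(can_encode("Ґі", "cp866"), can_encode("Ґі", "cp1125"))
```

This mirrors the partial-support caveat for code page 866: Є and Ї are present, but Ґ and the Cyrillic І are not, so Ukrainian text containing them fails to encode unless a variant such as code page 1125 is used.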

Euro sign updates

IBM code page 808 is a variant of code page 866; the only difference is the euro sign (€, U+20AC) in position FDhex (253) replacing the universal currency sign (¤, U+00A4).[18][19] IBM code page 848 is a variant of code page 1125 which replaces the universal currency sign at FDhex with the euro sign, analogously to the relationship of code page 808 to 866.[20][21] IBM code page 849 is a variant of code page 1131 which replaces the universal currency sign at FBhex with the euro sign, as in code page 808.[22][23]
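Since code page 808 is described as code page 866 with a single substitution at FDhex, a codec for it can be derived mechanically. The following is a sketch only, assuming Python's built-in `cp866` codec as the base; the names `with_euro`, `decode_cp808`, and `encode_cp808` are hypothetical and not part of the standard library.

```python
import codecs

# Sketch: derive a euro-sign variant from an existing single-byte codec.
# Assumption: Python's built-in "cp866" codec is the base table, and the
# euro sign replaces the character at byte 0xFD as described above.


def with_euro(base_encoding: str, position: int) -> str:
    """Return a 256-character decoding table with the euro sign at position."""
    table = bytes(range(0x100)).decode(base_encoding)
    return table[:position] + "\u20ac" + table[position + 1:]


CP808_DECODING = with_euro("cp866", 0xFD)
CP808_ENCODING = codecs.charmap_build(CP808_DECODING)


def decode_cp808(data: bytes) -> str:
    return codecs.charmap_decode(data, "strict", CP808_DECODING)[0]


def encode_cp808(text: str) -> bytes:
    return codecs.charmap_encode(text, "strict", CP808_ENCODING)[0]


if __name__ == "__main__":
    print(encode_cp808("Цена 5 €").hex())
```

The same helper could be pointed at a different base table and position for the other euro variants, provided a base codec for that code page is available.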

GOST R 34.303-92

The GOST R 34.303-92 standard defines two variants. The more extensive variant, KOI-8 N2 (not to be confused with the KOI-8 encoding, which it does not follow), matches code page 866 and the Alternative code page until the last row (codes 240 through 255, or F0hex through FFhex). In the last row, it supports letters for Belarusian and Ukrainian in addition to Russian, but in a layout unrelated to code page 866 or 1125. Notably, even the Russian Ё/ё (which was unchanged between the Alternative code page and code page 866) is in a different location. The differing row is shown below.

KOI-8 N2 (GOST R 34.303-92)
Row (dec) _0        _1        _2        _3        _4        _5        _6        _7        _8        _9        _A        _B        _C        _D        _E        _F
F_ (240)  SHY 00AD  № 2116    Ґ 0490    ґ 0491    Ё 0401    ё 0451    Є 0404    є 0454    І 0406    і 0456    Ї 0407    ї 0457    Ў 040E    ў 045E    ■ 25A0    NBSP 00A0
Note: codes F0hex through FDhex differ from code page 866 and the Alternative code page.

The other variant, KOI-8 N1, is a subset of KOI-8 N2 which omits the non-Russian Cyrillic letters and mixed single/double lined box drawing characters, leaving them empty for further internationalization (compare with code page 850). The affected rows are shown below.

KOI-8 N1 (GOST R 34.303-92)
Row (dec) _0        _1        _2        _3        _4        _5        _6        _7        _8        _9        _A        _B        _C        _D        _E        _F
B_ (176)  ░ 2591    ▒ 2592    ▓ 2593    │ 2502    ┤ 2524    -         -         -         -         ╣ 2563    ║ 2551    ╗ 2557    ╝ 255D    -         -         ┐ 2510
C_ (192)  └ 2514    ┴ 2534    ┬ 252C    ├ 251C    ─ 2500    ┼ 253C    -         -         ╚ 255A    ╔ 2554    ╩ 2569    ╦ 2566    ╠ 2560    ═ 2550    ╬ 256C    -
D_ (208)  -         -         -         -         -         -         -         -         -         ┘ 2518    ┌ 250C    █ 2588    ▄ 2584    ▌ 258C    ▐ 2590    ▀ 2580
E_ (224)  р 0440    с 0441    т 0442    у 0443    ф 0444    х 0445    ц 0446    ч 0447    ш 0448    щ 0449    ъ 044A    ы 044B    ь 044C    э 044D    ю 044E    я 044F
F_ (240)  SHY 00AD  № 2116    -         -         Ё 0401    ё 0451    -         -         -         -         -         -         -         -         ■ 25A0    NBSP 00A0
Note: "-" marks a code left empty.

Lehner–Czech modification

This is an unofficial modification used in software developed by Michael Lehner and Peter R. Czech. It replaces three mathematical symbols with guillemets and the section sign, which are commonly used in the Russian language. (Lehner and Czech created a number of alternative character sets for other European languages as well, including one based on CWI-2 for Hungarian, a Kamenicky-based one for Czech and Slovak, a Mazovia variant for Polish, and a seemingly unique encoding for Lithuanian.) The modified row is shown below.

Lehner–Czech modification
Row (dec) _0      _1      _2      _3      _4      _5      _6      _7      _8      _9      _A      _B      _C      _D      _E      _F
F_ (240)  Ё 0401  ё 0451  Є 0404  є 0454  Ї 0407  ї 0457  Ў 040E  ў 045E  » 00BB  « 00AB  · 00B7  § 00A7  № 2116  ¤ 00A4  ■ 25A0  NBSP 00A0
Note: codes F8hex, F9hex, and FBhex differ from code page 866.

FreeDOS

FreeDOS provides additional unofficial extensions of code page 866 for various non-Slavic languages.[24]

Code page 900

Before Microsoft's final code page for Russian MS-DOS 4.01 was registered with IBM by Franz Rau of Microsoft as CP866 in January 1990, draft versions of it developed by Yuri Starikov (Юрий Стариков) of Dialogue were still called code page 900 internally. While the documentation was corrected to reflect the new name before the release of the product, sketches of earlier draft versions still named code page 900 and without Ukrainian and Belarusian letters, which had been added in autumn 1989, were published in the Russian press in 1990.[25] Code page 900 slipped through into the distribution of the Russian MS-DOS 5.0 LCD.CPI codepage information file.[26]

Footnotes

  a. ^ Includes distinctly Ukrainian and Rusyn letters Є and Ї, but no І distinct from Latin I, and implements Soviet orthography, i.e. omits Ґ. These are added in some modifications.
  b. ^ Includes uniquely Belarusian Ў, but no І distinct from Latin I (although this is added in some modifications).

References

  1. ^ a b Steele, Shawn (1996-04-24). "CP866.TXT: cp866_DOSCyrillicRussian to Unicode table". Unicode Consortium.
  2. ^ "OS/2" (in Russian). Archived from the original on 2016-08-13. Retrieved 2016-06-19.
  3. ^ "Code Pages Supported by Windows: OEM Code Pages". Go Global Development Center. Microsoft. Archived from the original on 2011-11-02. Retrieved 2011-10-11.
  4. ^ a b (in Russian) Брябрин В. М., Ландау И. Я., Неменман М. Е. О системе кодирования для персональных ЭВМ // Микропроцессорные средства и системы. — 1986. — № 4. — С. 61–64.
  5. ^ (in Russian) ГОСТ Р 34.303-92. Наборы 8-битных кодированных символов. 8-битный код обмена и обработки информации. = 8-bit coded character sets. 8-bit code for information interchange.
  6. ^ "OEM 866". Go Global Development Center. Microsoft. Archived from the original on 2012-02-04. Retrieved 2011-10-17.
  7. ^ IBM. "Code page identifiers: CP 00866". IBM Globalization. Archived from the original on 2016-03-16.
  8. ^ van Kesteren, Anne (2018-01-06). "Index index-ibm866". Encoding Standard. WHATWG.
  9. ^ (in Russian) Фигурнов В. Э. IBM PC для пользователя. — 2-е изд. — М.: 1992. — С. 279.
  10. ^ a b c "Codepages: Comprehensive list". Aivosto.
  11. ^ "771 kodų lentelė" (in Lithuanian). Likit.
  12. ^ "772 kodų lentelė" (in Lithuanian). Likit.
  13. ^ IBM. "Code page identifiers: CP 01125". IBM Globalization.[permanent dead link]
  14. ^ IBM. "Code Page 01125" (PDF). Archived from the original (PDF) on 2015-07-08.
  15. ^ IBM. "Code page identifiers: CP 01131". IBM Globalization. Archived from the original on 2016-03-17.
  16. ^ IBM. "Code Page 01131" (PDF). Archived from the original (PDF) on 2015-07-08.
  17. ^ (in Ukrainian) РСТ УРСР 2018-91. Система обробки інформації. Кодування символів української абетки 8-бітними кодами.
  18. ^ IBM. "Code page identifiers: CP 00808". IBM Globalization.[permanent dead link]
  19. ^ IBM. "Code Page 00808" (PDF). Archived from the original (PDF) on 2015-07-08.
  20. ^ IBM. "Code page identifiers: CP 00848". IBM Globalization.[permanent dead link]
  21. ^ IBM. "Code Page 00848" (PDF). Archived from the original (PDF) on 2015-07-08.
  22. ^ IBM. "Code page identifiers: CP 00849". IBM Globalization.[permanent dead link]
  23. ^ IBM. "Code Page 00849" (PDF). Archived from the original (PDF) on 2015-07-08.
  24. ^ "CPIDOS - CPX files (Code Page Information) Pack v3.0 - DOS codepages". FreeDOS.
  25. ^ Starikov, Yuri (2005-04-11). "15-летию Russian MS-DOS 4.01 посвящается" [15 Years of Russian MS-DOS 4.01] (in Russian). Archived from the original on 2016-12-04. Retrieved 2014-05-07.
  26. ^ Paul, Matthias (2001-06-10) [1995]. "Overview on DOS, OS/2, and Windows codepages" (CODEPAGE.LST file) (1.59 preliminary ed.). Archived from the original on 2016-04-20. Retrieved 2016-08-20.

Further reading

  • Kornai, Andras; Birnbaum, David J.; da Cruz, Frank; Davis, Bur; Fowler, George; Paine, Richard B.; Paperno, Slava; Simonsen, Keld J.; Thobe, Glenn E.; Vulis, Dimitri; van Wingen, Johan W. (1993-03-13). "CYRILLIC ENCODING FAQ Version 1.3". 1.3. Retrieved 2017-02-18.