Showing content from https://en.wikipedia.org/wiki/KOI8-U below:

KOI8-U - Wikipedia

From Wikipedia, the free encyclopedia

Character encoding for Ukrainian Cyrillic

KOI8-U (RFC 2319) is an 8-bit character encoding designed to cover Ukrainian, which uses a Cyrillic alphabet. It is based on KOI8-R, which covers Russian and Bulgarian, but replaces eight box drawing characters with the four Ukrainian letters Ґ, Є, І, and Ї, each in upper and lower case.
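The relationship to KOI8-R can be checked mechanically. The following is a minimal sketch, assuming that Python's standard `koi8_r` and `koi8_u` codecs implement the two encodings as described here; `differing_bytes` is a helper name introduced only for this illustration.

```python
def differing_bytes(codec_a, codec_b):
    """Map each byte value to its (codec_a, codec_b) characters where the two differ."""
    diff = {}
    for value in range(256):
        raw = bytes([value])
        a = raw.decode(codec_a)
        b = raw.decode(codec_b)
        if a != b:
            diff[value] = (a, b)
    return diff


# Assumption: both codecs ship with the Python standard library.
changes = differing_bytes("koi8_r", "koi8_u")
for value, (in_koi8_r, in_koi8_u) in sorted(changes.items()):
    print(f"0x{value:02X}: {in_koi8_r} -> {in_koi8_u}")
```

Under those assumptions, the loop should list eight byte values, each pairing a box drawing character in KOI8-R with a Ukrainian letter in KOI8-U.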

KOI8-RU is closely related but adds Ў for Belarusian. In both, the letter allocations match those in KOI8-E, except for Ґ, which is added to KOI8-F.

In Microsoft Windows, KOI8-U is assigned code page number 21866. IBM assigns it code page/CCSID 1168.[1][2][3]

KOI8 remains much more commonly used than ISO 8859-5, which never really caught on.[citation needed] Another common Cyrillic character encoding is Windows-1251. Both may eventually give way to Unicode.

KOI8 stands for Kod Obmena Informatsiey, 8 bit (Russian: Код Обмена Информацией, 8 бит) which means "Code for Information Exchange, 8 bit".

The KOI8 character sets have the property that the Cyrillic letters are in pseudo-Latin alphabetic order rather than Cyrillic alphabetical order as in ISO 8859-5. This has the useful effect that if the eighth bit is stripped and the text is presented in any character set based on ASCII including the KOI8 sets themselves, the text is still reasonably human readable as a case-reversed transliteration. For instance, the "KOI" acronym "Код Обмена Информацией" becomes kOD oBMENA iNFORMACIEJ.
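The bit-stripping property can be reproduced in a few lines. This is a minimal sketch, assuming Python's standard `koi8_u` codec; `strip_high_bit` is a hypothetical helper name, not part of any library.

```python
def strip_high_bit(text, codec="koi8_u"):
    """Encode text in a KOI8 codec, clear bit 7 of every byte, and read the result as ASCII."""
    encoded = text.encode(codec)
    stripped = bytes(b & 0x7F for b in encoded)  # drop the eighth bit
    return stripped.decode("ascii")


print(strip_high_bit("Код Обмена Информацией"))  # kOD oBMENA iNFORMACIEJ
```

Upper-case Cyrillic letters come out as lower-case Latin and vice versa because the KOI8 sets place lower-case Cyrillic in rows Cx and Dx (above ASCII upper case in rows 4x and 5x) and upper-case Cyrillic in rows Ex and Fx.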

The following table shows the KOI8-U encoding.[1][4] Each character is shown with its equivalent Unicode code point.

KOI8-U (rows give the high nibble, columns the low nibble; in rows 8x to Fx each character is shown above its Unicode code point in hexadecimal)

    0    1    2    3    4    5    6    7    8    9    A    B    C    D    E    F
0x  (control characters; not shown)
1x  (control characters; not shown)
2x  SP   !    "    #    $    %    &    '    (    )    *    +    ,    -    .    /
3x  0    1    2    3    4    5    6    7    8    9    :    ;    <    =    >    ?
4x  @    A    B    C    D    E    F    G    H    I    J    K    L    M    N    O
5x  P    Q    R    S    T    U    V    W    X    Y    Z    [    \    ]    ^    _
6x  `    a    b    c    d    e    f    g    h    i    j    k    l    m    n    o
7x  p    q    r    s    t    u    v    w    x    y    z    {    |    }    ~
8x  ─    │    ┌    ┐    └    ┘    ├    ┤    ┬    ┴    ┼    ▀    ▄    █    ▌    ▐
    2500 2502 250C 2510 2514 2518 251C 2524 252C 2534 253C 2580 2584 2588 258C 2590
9x  ░    ▒    ▓    ⌠    ■    ∙    √    ≈    ≤    ≥    NBSP ⌡    °    ²    ·    ÷
    2591 2592 2593 2320 25A0 2219 221A 2248 2264 2265 00A0 2321 00B0 00B2 00B7 00F7
Ax  ═    ║    ╒    ё    є    ╔    і    ї    ╗    ╘    ╙    ╚    ╛    ґ    ╝    ╞
    2550 2551 2552 0451 0454 2554 0456 0457 2557 2558 2559 255A 255B 0491 255D 255E
Bx  ╟    ╠    ╡    Ё    Є    ╣    І    Ї    ╦    ╧    ╨    ╩    ╪    Ґ    ╬    ©
    255F 2560 2561 0401 0404 2563 0406 0407 2566 2567 2568 2569 256A 0490 256C 00A9
Cx  ю    а    б    ц    д    е    ф    г    х    и    й    к    л    м    н    о
    044E 0430 0431 0446 0434 0435 0444 0433 0445 0438 0439 043A 043B 043C 043D 043E
Dx  п    я    р    с    т    у    ж    в    ь    ы    з    ш    э    щ    ч    ъ
    043F 044F 0440 0441 0442 0443 0436 0432 044C 044B 0437 0448 044D 0449 0447 044A
Ex  Ю    А    Б    Ц    Д    Е    Ф    Г    Х    И    Й    К    Л    М    Н    О
    042E 0410 0411 0426 0414 0415 0424 0413 0425 0418 0419 041A 041B 041C 041D 041E
Fx  П    Я    Р    С    Т    У    Ж    В    Ь    Ы    З    Ш    Э    Щ    Ч    Ъ
    041F 042F 0420 0421 0422 0423 0416 0412 042C 042B 0417 0428 042D 0429 0427 042A
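A row of the table can be regenerated from a codec rather than read by hand. The following sketch assumes Python's standard `koi8_u` codec follows the table above; `table_row` is a name made up for this example.

```python
def table_row(high_nibble, codec="koi8_u"):
    """Return the 16 Unicode code points (4-digit hex) for the row with the given high nibble."""
    start = high_nibble << 4
    raw = bytes(range(start, start + 16))
    return [f"{ord(ch):04X}" for ch in raw.decode(codec)]


# Print the two rows that contain the Ukrainian letters.
for nibble in (0xA, 0xB):
    print(f"{nibble:X}x", " ".join(table_row(nibble)))
```

Comparing the printed code points with rows Ax and Bx of the table is a quick way to spot a codec that deviates from RFC 2319.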

Differences from KOI8-R (the non-Russian letters): є, і, ї, ґ at 0xA4, 0xA6, 0xA7, 0xAD, and Є, І, Ї, Ґ at 0xB4, 0xB6, 0xB7, 0xBD.

Although RFC 2319 says that character 0x95 should be U+2219 (∙), it may also be U+2022 (•) to match the bullet character in Windows-1251.

Some references have a typo and incorrectly state that character 0xB4 is U+0403, rather than the correct U+0404. This typo is present in Appendix A of RFC 2319 (but the table in the main text of the RFC gives the correct mapping).
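Both disputed positions can be inspected in whichever implementation is at hand. This is a small sketch, assuming Python's standard `koi8_u` codec and `unicodedata` module; `describe` is a hypothetical helper. The result for 0x95 depends on the mapping table the implementation was built from, so no particular output is claimed for it here.

```python
import unicodedata


def describe(byte_value, codec="koi8_u"):
    """Report the Unicode code point and name a codec assigns to a single byte."""
    ch = bytes([byte_value]).decode(codec)
    return f"0x{byte_value:02X} -> U+{ord(ch):04X} {unicodedata.name(ch)}"


print(describe(0xB4))  # U+0404 is the correct mapping; U+0403 would indicate the typo
print(describe(0x95))  # either U+2219 or U+2022, depending on the implementation
```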
