> It's rather common to confuse a transfer encoding with a storage format.
> UCS2 and UCS4 refer to code units (the storage format).

Actually, they don't. Instead, they refer to "coded character sets", in
W3C terminology: mappings of characters to natural numbers. See

http://unicode.org/faq/basic_q.html#14

The term "UCS-2" names a character set that can encode only 65536
characters; it thus refers to Unicode 1.1. According to the Unicode
Consortium's FAQ, the term UCS-2 should be avoided these days.

> IMO, we should go back to the Python2 terms UCS2 and UCS4 which
> are correct and provide a clear description of what Python uses
> internally for code units.

No, we shouldn't. The term UCS-2 is deprecated, see above.

Regards,
Martin
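
For illustration only, here is a minimal sketch of the distinction between a code point (the number a coded character set assigns to a character) and code units (how an encoding form stores that number). It is not part of the original discussion; the helper name utf16_code_units and the two sample characters are hypothetical choices made for this example.

```python
# Hypothetical helper: list the 16-bit code units UTF-16 uses for a string.
def utf16_code_units(text):
    data = text.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

bmp_char = "\u20ac"         # EURO SIGN, code point U+20AC (fits in 65536 values)
astral_char = "\U0001d11e"  # MUSICAL SYMBOL G CLEF, code point U+1D11E (does not)

# The code point is a property of the character set, not of the storage.
print(hex(ord(bmp_char)), [hex(u) for u in utf16_code_units(bmp_char)])
print(hex(ord(astral_char)), [hex(u) for u in utf16_code_units(astral_char)])
```

The second character has a code point above 0xFFFF, so a 65536-character set cannot name it at all, while UTF-16 stores it as two 16-bit code units (a surrogate pair).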