Nicholas Bastin wrote:
> What I mean is pretty clear. UCS-2 does *NOT* support surrogate pairs.
> If it did, it would be called UTF-16. If Python really supported
> UCS-2, then surrogate pairs from UTF-16 inputs would either get turned
> into two garbage characters, or the "I couldn't transcode this" UCS-2
> code point (I don't remember which one that is off the top of my head).

OTOH, if Python really supported UTF-16, then unichr(0x10000) would
work, and len(u"\U00010000") would be 1. It is primarily just the
UTF-8 codec which supports UTF-16.

Regards,
Martin
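
For illustration, here is a minimal sketch of the surrogate-pair arithmetic being discussed. It assumes a modern Python 3 interpreter (3.3 or later), not the narrow Python 2 builds the thread is about, so unichr() becomes chr() and the u"" prefix is dropped. The helper name utf16_code_units is hypothetical and only exists for this example.

```python
import struct


def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of ints.

    Hypothetical helper for illustration; not part of any library.
    """
    data = text.encode("utf-16-be")  # big-endian, no byte order mark
    return list(struct.unpack(">%dH" % (len(data) // 2), data))


ch = "\U00010000"  # first code point outside the Basic Multilingual Plane

# UTF-16 needs two 16-bit code units (a surrogate pair) for this code point.
units = utf16_code_units(ch)
print([hex(u) for u in units])  # ['0xd800', '0xdc00']

# A str on Python 3.3+ counts code points, so the length is 1 here.
# On a narrow Python 2 build, len(u"\U00010000") reported 2 instead.
print(len(ch))

# UTF-8 encodes the same code point as a single four-byte sequence.
print(ch.encode("utf-8"))  # b'\xf0\x90\x80\x80'
```

A strict UCS-2 implementation would have no way to represent this code point at all, which is the distinction the quoted message draws.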