> Unicode 5.0, Chapter 3, verse C9: > > When a process generates a code unit sequence which purports to be > in a Unicode character encoding form, it shall not emit ill-formed > code sequences. > > > A Unicode-conforming Python implementation would error at the > > > chr() call, or perhaps would not provide surrogateescape error > > > handlers. > > > > Chapter and verse? > > Chapter 3, verse C9 again. I agree that the surrogateescape error handler is non-conforming, but, as you say, it doesn't claim to, either (would your concern about utf-8 being misleading here been resolved if the thing had been called "utf-8b"?) More interestingly (and to the subject) is chr: how did you arrive at C9 banning Python3's definition of chr? This chr function puts the code sequence into well-formed UTF-16; that's the whole point of UTF-16. Regards, Martin
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4