>[Just]
>> You're going to have a hard time explaining that "\377" != u"\377".
>
>[GvR]
>I agree. You are an example of how hard it is to explain: you still
>don't understand that for a person using CJK encodings this is in fact
>the truth.

That depends on the definition of truth: if you document that 8-bit strings
are Latin-1, the above is the truth. Conceptually classifying all other 8-bit
encodings as binary goop makes the semantics crystal clear.

>> Again, if you define that "all strings are unicode" and that 8-bit strings
>> contain Unicode characters up to 255, you're all set. Clear semantics, few
>> surprises, simple implementation, etc. etc.
>
>But not all 8-bit strings occurring in programs are Unicode. Ask
>Moshe.

I know. They can be anything, even binary goop. But that's *only* an
artifact of the fact that 8-bit strings need to double as buffer objects.

Just
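
The two positions above can be made concrete with a small sketch. It assumes modern Python 3, where bytes and str are separate types, which is not the 8-bit string type this thread is about; the helper name as_latin1 is hypothetical and only illustrates the "8-bit strings contain Unicode characters up to 255" reading. The EUC-JP lines show why the same bytes look different to someone using a CJK encoding.

```python
# Sketch only: Python 3 separates bytes from str, unlike the 8-bit strings
# discussed in this thread. The helper name as_latin1 is hypothetical.

def as_latin1(data: bytes) -> str:
    """Read each byte as the Unicode code point with the same value (0..255)."""
    return data.decode("latin-1")

# Under the Latin-1 reading, byte 0o377 (0xFF) is simply U+00FF.
assert as_latin1(b"\377") == "\u00ff"

# Every byte value maps to exactly one code point and back, so nothing is lost.
all_bytes = bytes(range(256))
assert as_latin1(all_bytes).encode("latin-1") == all_bytes

# The same two bytes mean something else to a program that assumes EUC-JP,
# which is the CJK objection quoted above.
cjk_bytes = b"\xa4\xa2"
print(len(as_latin1(cjk_bytes)))        # two Latin-1 characters
print(len(cjk_bytes.decode("euc_jp")))  # one EUC-JP character
```

Whether the byte-for-code-point rule is "the truth" therefore depends on which reading is documented; any other encoding has to be decoded explicitly before its text can be compared.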