> [leaving only one example in:]
> > version = "HTTP/0.9"
> > status = "200"
> > reason = ""
> >
> > Protocol elements, thus byte string.
>
> I think you're taking it too far now. I think we should assume that
> ASCII survives.

That is not the issue. That string *is* a byte string. The HTTP
protocol is not defined in terms of character sequences, but in terms
of byte sequences, or else interoperability would be lost.

If those strings were converted to character strings (i.e. Unicode
strings), it would still work, but it wouldn't be correct anymore.
That's just like giving a file size as a double: it would probably
work, but it wouldn't be correct.

> Also, as these things are readable they should be treated as such. It
> should be possible to do
> >>> print u"Funny reply to my "+unicode(version)+u" message"
> especially when the "funny reply" bit is in Japanese.

That is a nice property of so-called "text" protocols. That still
doesn't make it a character-oriented protocol; HTTP *is* a
byte-oriented protocol. If you have a binary protocol, there is likely
also a version field in it, but you'd have to write

    print u"Funny reply to my "+XDRversion2string(version)+u" message"

> What I would agree with, I think, is if we tag these strings as
> "ascii".

That is pointless. Having strings tagged with their encoding is also a
possible architecture for a programming language, but not one that
Python has chosen to take. Instead, Python has chosen to have only a
single data type for character data, namely Unicode.

> Python sourcecode is ASCII, and if you put 8 bit characters in there
> you're living dangerously.
[...]
> Only when octal or hex escapes appear in a sourcecode string can it be
> anything other than ascii.

The octal escapes, in themselves, are also ASCII, or else you could
not put them into source code. The traditional string type in Python
really is a byte string type first of all. It can be used as a
character string type only if you imply a character set and an
encoding. The source being ASCII just gives you a guarantee about the
bytes you get at runtime.

Regards,
Martin
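
As a rough illustration of the distinction drawn above, here is a minimal sketch. It assumes the later Python 3 type names (`bytes` for byte strings, `str` for Unicode character strings) rather than the `str`/`unicode` pair discussed in the message, and the status line `raw` is a made-up example, not taken from any real exchange.

```python
# Hypothetical status line as it would arrive from a socket: a byte string.
raw = b"HTTP/1.0 200 OK\r\n"

# The protocol elements are parsed and compared as bytes.
version, status, reason = raw.rstrip(b"\r\n").split(b" ", 2)

# Turning a protocol element into character data is an explicit step
# that names an encoding; here ASCII is assumed.
version_text = version.decode("ascii")
message = "Funny reply to my " + version_text + " message"

# An escape is written with ASCII characters in the source, but it
# denotes a byte value. Octal 351 and hex e9 are the same byte.
high_byte = b"\351"
same_byte = high_byte == b"\xe9"

# That byte only becomes a character once an encoding is implied.
as_latin1 = high_byte.decode("latin-1")
try:
    high_byte.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```

The version field mixes with character data only after the explicit `decode`, which plays the same role as the `XDRversion2string(version)` conversion in the binary-protocol example.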