On Aug 17, 2004, at 5:18 PM, Bob Ippolito wrote:

> On Aug 17, 2004, at 5:11 PM, Martin v. Löwis wrote:
>
>> Bob Ippolito wrote:
>>> How would you embed raw bytes if the string was unicode?
>>
>> The most direct notation would be
>>
>> bytes("delimited packet\x00")
>>
>> However, people might not understand what is happening, and
>> Guido doesn't like it if the bytes are >127.
>
> I guess that was a bad example, what if the delimiter was \xff?

Indeed, if all strings are unicode, the question becomes: what encoding
does bytes() use to translate unicode characters to bytes. Two
alternatives have been proposed so far:

1) ASCII (translate chars as their codepoint if < 128, else error)
2) ISO-8859-1 (translate chars as their codepoint if < 256, else error)

I think I'd choose #2, myself.

> I know that map(ord, u'delimited packet\xff') would get correct
> results.. but I don't think I like that either.

Why would you consider that wrong? ord(u'\xff') *should* return 255.
Just as ord(u'\u1000') returns 4096. There's nothing mysterious there.

James
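
For illustration, the two alternatives can be sketched in present-day
Python 3, where all strings are unicode. This is only a rough model of
the discussion, not the bytes() constructor proposed in the thread: the
helper names to_bytes_ascii and to_bytes_latin1 are hypothetical, and
str.encode() stands in for whatever translation bytes() would perform.

```python
# Hypothetical helpers modeling the two proposed translations.
# str.encode() is used as a stand-in for the proposed bytes() behavior.

def to_bytes_ascii(text):
    """Alternative 1: code points < 128 map to bytes, anything else is an error."""
    return text.encode("ascii")


def to_bytes_latin1(text):
    """Alternative 2: code points < 256 map to bytes, anything else is an error."""
    return text.encode("latin-1")


packet = "delimited packet\xff"

# ISO-8859-1 accepts the \xff delimiter and maps it to the byte 0xff.
latin1_result = to_bytes_latin1(packet)

# ASCII rejects the same string because \xff is not below 128.
try:
    to_bytes_ascii(packet)
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# ord() simply reports each character's code point.
code_points = list(map(ord, packet))
```

Under alternative 2 the resulting byte values are the same numbers that
map(ord, ...) yields, which is why the two approaches agree for any
character below 256. A character such as '\u1000' is an error under
both alternatives.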