"Martin v. Loewis" wrote: > > > True; "u#" does exactly the same as "s#" -- it interprets the > > input as binary buffer. > > It doesn't do exactly the same. If s# is applied to a Unicode object, > it transparently invokes the default encoding, which is sensible. If > u# is applied to a byte string, it does not apply the default encoding. That's because the buffer interface on Unicode objects doesn't return the raw binary buffer. If you pass in a memory mapped file or a buffer object wrapping some memory area, u# will take the input as raw binary stream. All this weird behaviour is needed to make Unicode objects behave well together with s#. The implementation of u# is completely symmetric to that of s# though. I agree, though, that it would make more sense to special case Unicode objects here and have u# return a pointer to the raw internal buffer of the Unicode object. Jack will probably also need a way to say "decode this encoded object into Unicode using the encoding xyz". Something like the Unicode version of "es#". How about "eu#" which then passes through Unicode as-is while decoding all other objects according to the given encoding ?! -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/