"Martin v. Loewis" wrote:
>
> > That's because the buffer interface on Unicode objects doesn't
> > return the raw binary buffer. If you pass in a memory mapped
> > file or a buffer object wrapping some memory area, u# will
> > take the input as raw binary stream.
> >
> > All this weird behaviour is needed to make Unicode objects
> > behave well together with s#.
>
> I don't believe this. Why would the implementation of u# have any
> effect on making s# work?

To make s# work, we had to map the read buffer interface to the
encoded version of Unicode -- not the binary version, which would
have been the "right" choice in terms of the buffer interface
(s# maps to the read buffer interface, while t# maps to the
character buffer interface).

u# is simply a copy&paste implementation of s# that interprets the
results of the read buffer interface as a Py_UNICODE array.

As I mentioned in another mail, we should probably let u# pass
through Unicode objects as-is without going through the read
buffer interface. This functionality is clearly missing and
should be added to make u# useful.

> > Jack will probably also need a way to say "decode this encoded
> > object into Unicode using the encoding xyz". Something like the
> > Unicode version of "es#". How about "eu#" which then passes through
> > Unicode as-is while decoding all other objects according to the
> > given encoding ?!
>
> I'd like to see the requirements, in terms of real-world problems,
> before considering any extensions.

Agreed. Jack should post some examples of what he needs for
his application.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/