On Fri, 12 Nov 1999, M.-A. Lemburg wrote:
> Fredrik Lundh wrote:
>...
> > why? I don't understand why "s" and "s#" have
> > to deal with encoding issues at all...
> >
> > > unless, of course, you want to give up Unicode object support
> > > for all APIs using these parsers.
> >
> > hmm. maybe that's exactly what I want...
>
> If we don't add that support, lots of existing APIs won't
> accept Unicode objects instead of strings. While it could be
> argued that automatic conversion to UTF-8 is not transparent
> enough for the user, the other solution of using str(u)
> everywhere would probably make writing Unicode-aware code a
> rather clumsy task and introduce other pitfalls, since str(obj)
> calls PyObject_Str() which also works on integers, floats,
> etc.

No no no...

"s" and "s#" are NOT SUPPOSED TO return a UTF-8 encoding. They are
supposed to return the raw bytes.

If a caller wants 8-bit characters, then that caller will use "t#".

If you want to argue for that separate, encoded buffer, then argue for it
for support for the "t#" format. But do NOT say that it is needed for "s#"
which simply means "give me some bytes."

-g

--
Greg Stein, http://www.lyra.org/
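
Editorial illustration (not part of the original message): the distinction between "raw bytes" and "a UTF-8 encoding" can be sketched in a few lines of modern Python rather than the 1999 C API. This assumes, purely for illustration, that a Unicode object stores its text as 2-byte little-endian code units. The helper names `raw_buffer` and `utf8_buffer` are hypothetical and only loosely stand in for the two behaviours under discussion.

```python
def raw_buffer(text: str) -> bytes:
    # Assumption: the object's internal storage is 2-byte little-endian
    # code units; this mimics handing back that storage untouched.
    return text.encode("utf-16-le")


def utf8_buffer(text: str) -> bytes:
    # A separately encoded 8-bit view of the same text.
    return text.encode("utf-8")


sample = "h\u00e9"  # "h" followed by e-acute
raw = raw_buffer(sample)
encoded = utf8_buffer(sample)

print(raw)      # two bytes per character under the stated assumption
print(encoded)  # variable-length UTF-8 bytes
```

Under that assumption the two buffers differ even for short text, so a caller asking for "some bytes" and a caller asking for 8-bit characters would see different data.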