Fredrik Lundh wrote:
>
> mal wrote:
> > > > > talking about string methods: how about providing an
> > > > > "encode" method for 8-bit strings too?
> > > >
> > > > I've tossed that idea around a few times too... it could
> > > > have the same interface as the Unicode one (without default
> > > > encoding though). The only problem is that there are currently
> > > > no codecs which could deal with strings on input.
> > >
> > > imho, a consistent interface is more important than a truly
> > > optimal implementation (strings are strings yada yada). or in
> > > other words,
> > >
> > >     def encode(self, encoding):
> > >         if encoding is close enough:
> > >             return self
> > >         return unicode(self).encode(encoding)
> > >
> > > ought to be good enough for now.
>
> /snip/
>
> > Note that 'abc'.encode('utf8') would fail because the UTF-8
> > codec expects Unicode on input to its encode method (hmm, perhaps
> > I ought to make the method use the 'u' parser marker instead
> > of 'U' -- that way, the method would auto-convert the 'abc'
> > string to Unicode using the default encoding and then proceed
> > to encode it in UTF-8).
>
> sorry, I wasn't clear: the "def encode" snippet above should be
> a string method, not a function.
>
> "abc".encode("utf8") would be "return self" if the default encoding
> is "ascii" or "utf8", and "return unicode("abc").encode("utf8")"
> otherwise.

I've just checked in modifications to the builtin codecs which
allow them to accept 8-bit strings too. They will convert the
strings to Unicode and then encode them as usual.

So given that the .encode() method gets accepted (I haven't heard
any other opinions yet), "abc".encode("utf8") will work just like
any other builtin codec (the 8-bit string will be interpreted under
the default encoding assumption).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/
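
For illustration only, the behaviour discussed above can be sketched in
modern Python, where 8-bit strings are `bytes` and have no .encode()
method of their own. The helper name `encode_8bit` and its
`default_encoding` parameter are hypothetical and are not part of the
checked-in codec changes. The sketch assumes that "close enough" means
the target codec and the default codec resolve to the same name:

```python
import codecs


def encode_8bit(data: bytes, encoding: str, default_encoding: str = "ascii") -> bytes:
    """Hypothetical stand-in for an .encode() method on 8-bit strings."""
    # Normalise codec names so that "utf8", "UTF-8" and "utf_8" compare equal.
    target = codecs.lookup(encoding).name
    default = codecs.lookup(default_encoding).name
    if target == default:
        # Shortcut from the quoted sketch: same codec, so nothing to convert.
        # Note that the bytes are returned without being validated.
        return data
    # Otherwise interpret the bytes under the default encoding assumption,
    # then encode the resulting text with the requested codec.
    return data.decode(default).encode(target)


result = encode_8bit(b"abc", "utf8")
latin = encode_8bit(b"caf\xe9", "utf8", default_encoding="latin-1")
```

With an "ascii" default, `encode_8bit(b"abc", "utf8")` round-trips
through text and gives back the same three bytes, while any non-ASCII
byte raises UnicodeDecodeError rather than being passed through
unchanged.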