> > Having just followed this thread tangentially, I do have to say it
> > seems quite cool to be able to do something like the following in
> > Python 2.2:
> >
> > >>> s = msg['from']
> > >>> parts = s.split('?')
> > >>> if parts[2].lower() == 'q':
> > ...     name = parts[3].decode('quopri')
> > ... elif parts[2].lower() == 'b':
> > ...     name = parts[3].decode('base64')
> > ...
>
> I think that the central point is that if code like the above is useful
> and supported then it needs to be the same for Unicode strings as for
> 8-bit strings.

Why is that? An encoding, by nature, is something that produces a byte
sequence from some input. So you can only decode byte sequences, not
character strings.

> If the code above is NOT useful and should NOT be supported then we
> need to undo it before 2.2 ships. This unicode.decode argument is
> just a proxy for the real argument about the above.

No, it isn't. The code is useful for byte strings, but not for Unicode
strings.

> I don't feel strongly one way or another about this (ab?)use of the
> codecs concept, myself, but I do feel strongly that Unicode strings
> should behave as much as possible like 8-bit strings.

Not at all. Byte strings and character strings are as different as are
byte strings and lists of DOM child nodes (i.e. the only common thing is
that they are sequences).

Regards,
Martin
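
A minimal sketch of the same header-decoding idea, written against
Python 3 semantics (an assumption made purely for illustration; the
thread concerns Python 2.2, where the 8-bit str type played both roles).
It shows why decoding is an operation on byte sequences: quopri and
base64 map bytes to bytes, and the final charset step maps bytes to a
character string, which then has nothing further to "decode".

    # Illustrative sketch only (Python 3 semantics, not the 2.2 API
    # under discussion). The sample header value is hypothetical.
    import base64
    import quopri

    raw = b'=?iso-8859-1?q?J=FCrgen?='   # an RFC 2047 encoded-word, as bytes
    parts = raw.split(b'?')              # [b'=', b'iso-8859-1', b'q', b'J=FCrgen', b'=']
    charset = parts[1].decode('ascii')   # declared charset, e.g. 'iso-8859-1'

    if parts[2].lower() == b'q':
        # Q-encoding: undo quoted-printable (bytes -> bytes), then apply
        # the declared charset (bytes -> character string)
        name = quopri.decodestring(parts[3].replace(b'_', b' ')).decode(charset)
    elif parts[2].lower() == b'b':
        # B-encoding: undo base64 (bytes -> bytes), then the charset decode
        name = base64.b64decode(parts[3]).decode(charset)

    print(name)   # 'Jürgen'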