Fredrik Lundh replied to himself in c.l.py:
>> as far as I can tell, it's supposed to be a feature.
>>
>> if you mix 8-bit strings with unicode strings, python 1.6a2
>> attempts to interpret the 8-bit string as an utf-8 encoded
>> unicode string.
>>
>> but yes, I also think it's a bug. but this far, my attempts
>> to get someone else to fix it has failed. might have to do
>> it myself... ;-)
>
>postscript: the powers-that-be has decided that this is not
>a bug. if you thought that strings were just sequences of
>characters, just as in Perl and Tcl, you're in for one big
>surprise in Python 1.6...

I just read the last few posts of the powers-that-be-list on this
subject (Thanks to Christian for pointing out the archives in
c.l.py ;-), and I must say I completely agree with Fredrik. The
current situation sucks.

A string should always be a sequence of characters. A utf-8-encoded
8-bit string in Python is *not* a string, but a "ByteArray". An 8-bit
string should never be assumed to be utf-8 because of that
distinction. (The default encoding for the builtin unicode() function
may be another story.)

Just