Moshe Zadka wrote:
>
> I'd much prefer Python to reflect a
> fundamental truth about Unicode, which at least makes sure binary-goop can
> pass through Unicode and remain unharmed, than to reflect a nasty problem
> with UTF-8 (not everything is legal).

Let's not make the same mistake again: Unicode objects should *not*
be used to hold binary data. Please use buffers instead.

BTW, I think that this behaviour should be changed:

>>> buffer('binary') + 'data'
'binarydata'

while:

>>> 'data' + buffer('binary')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: illegal argument type for built-in operation

IMHO, buffer objects should never coerce to strings, but instead
return a buffer object holding the combined contents. The
same applies to slicing buffer objects:

>>> buffer('binary')[2:5]
'nar'

should preferably be buffer('nar').

--

Hmm, perhaps we need something like a data string object
to get this 100% right?!

>>> d = data("...data...")
or
>>> d = d"...data..."
>>> print type(d)
<type 'data'>

>>> 'string' + d
d"string...data..."
>>> u'string' + d
d"s\000t\000r\000i\000n\000g\000...data..."

>>> d[:5]
d"...da"

etc.

Ideally, string and Unicode objects would then be subclasses
of this type in Py3K.

--
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/
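
For illustration only, the data string idea from the post can be sketched in present-day Python. This is a rough sketch under stated assumptions, not the proposed implementation: the class name `Data` and the helper `_coerce` are hypothetical, `bytes` stands in for the 8-bit string of the post, `str` stands in for the Unicode object, and text is assumed to be stored as UTF-16-LE because that matches the `s\000t\000...` layout shown in the post. Concatenation and slicing always return a `Data` object rather than coercing to a string.

```python
class Data:
    """Hypothetical 'data string' type; not an existing Python API."""

    def __init__(self, raw=b""):
        # Copy the input (bytes, bytearray or memoryview) into immutable bytes.
        self._raw = bytes(raw)

    @staticmethod
    def _coerce(other):
        # Return the raw bytes for a supported operand, or None otherwise.
        if isinstance(other, Data):
            return other._raw
        if isinstance(other, (bytes, bytearray, memoryview)):
            return bytes(other)
        if isinstance(other, str):
            # Assumption: text becomes UTF-16-LE, as in the post's example.
            return other.encode("utf-16-le")
        return None

    def __add__(self, other):
        chunk = self._coerce(other)
        if chunk is None:
            return NotImplemented
        return Data(self._raw + chunk)

    def __radd__(self, other):
        # Called for 'string' + d, so the result is still a Data object.
        chunk = self._coerce(other)
        if chunk is None:
            return NotImplemented
        return Data(chunk + self._raw)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slices stay in the Data type instead of decaying to a string.
            return Data(self._raw[index])
        # A single index yields the integer byte value, as bytes does.
        return self._raw[index]

    def __len__(self):
        return len(self._raw)

    def __eq__(self, other):
        if not isinstance(other, Data):
            return NotImplemented
        return self._raw == other._raw

    def __hash__(self):
        return hash(self._raw)

    def __repr__(self):
        # Mimic the d"..." notation by swapping the leading b for d.
        return "d" + repr(self._raw)[1:]


d = Data(b"...data...")
combined = b"string" + d   # 8-bit string on the left, result is Data
widened = "string" + d     # text on the left is encoded first
head = d[:5]               # slicing keeps the Data type
```

With this sketch, both operand orders give a `Data` result, which is the symmetry the post asks for in place of the `buffer` behaviour shown above.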