On 01/18/2014 05:48 AM, Nick Coghlan wrote:
> On 18 Jan 2014 11:52, "Ethan Furman" wrote:
>>
>> I'll admit to being somewhat on the fence about %a.
>>
>> It seems there are two possibilities with %a:
>>
>> 1) have it be ascii(repr(obj))
>>
>> 2) have it be str(obj).encode('ascii', 'strict')
>
> This gets very close to crossing the line into implicit encoding of text again. Binary interpolation is being added back
> for the specific use case of working with ASCII compatible segments in binary formats, and it's at best arguable that
> supporting %a will help with that use case.

Agreed.

> However, without it, there may be a greater temptation to inappropriately define __bytes__ just to support binary
> interpolation, rather than because a type truly has an appropriate translation directly to bytes.

True.

> By allowing %a, we avoid that temptation. This is also potentially useful specifically in the case of binary logging
> formats and as a quick way to request backslash escaping of non-ASCII characters in text.
>
> Call it +0.5 for allowing %a. I don't expect it to be used heavily, but I think it will head off a fair bit of potential
> misuse of __bytes__.

So, if %a is added it would act like:

---------
"%a" % some_obj
---------
tmp = str(some_obj)
res = b''
for ch in tmp:
    if ord(ch) < 256:
        res += bytes([ord(ch)])
    else:
        res += unicode_escape(ch)
---------

where 'unicode_escape' would yield something like "\u0440" ?

--
~Ethan~
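
For illustration only, the loop in the message above could be written as runnable Python roughly as follows. This is a sketch under stated assumptions, not the proposed implementation: the function name `percent_a_sketch` is hypothetical, and the `unicode_escape` placeholder is assumed to behave like the standard `backslashreplace` error handler of the ASCII codec.

```python
def percent_a_sketch(obj):
    """Hypothetical helper mirroring the loop sketched in the message.

    Code points below 256 are copied through as single bytes; anything
    above is replaced by a backslash escape such as \\u0440.
    """
    text = str(obj)
    out = bytearray()
    for ch in text:
        if ord(ch) < 256:
            # Code point fits in one byte, so emit it unchanged.
            out.append(ord(ch))
        else:
            # Assumption: 'backslashreplace' stands in for the
            # 'unicode_escape' placeholder used in the message.
            out += ch.encode('ascii', 'backslashreplace')
    return bytes(out)


if __name__ == '__main__':
    print(percent_a_sketch('abc'))
    print(percent_a_sketch('a\u0440'))
```

Because the threshold in the loop is 256 rather than 128, code points 128 to 255 pass through as raw bytes, so the result of this sketch is not guaranteed to be pure ASCII.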