On 1/13/2014 1:40 PM, Brett Cannon wrote:

> > So bytes formatting really needn't (and shouldn't, IMO) mirror str
> > formatting.

This was my presumption in writing byteformat().

> I think one of the things about Guido's proposal that bugs me is that it
> breaks the mental model of the .format() method from str in terms of how
> the mini-language works. For str.format() you have the conversion and
> the format spec (e.g. "{!r}" and "{:d}", respectively). You apply the
> conversion by calling the appropriate built-in, e.g. 'r' calls repr().
> The format spec semantically gets passed with the object to format()
> which calls the object's __format__() method: ``format(number, 'd')``.
>
> Now Guido's suggestion has two parts that affect the mini-language for
> .format(). One is that for bytes.format() the default conversion is
> bytes() instead of str(), which is fine (probably want to add 'b' as a
> conversion value as well to be consistent). But the other bit is that
> the format spec goes from semantically meaning ``format(thing,
> format_spec)`` to ``format(thing, format_spec).encode('ascii',
> 'strict')`` for at least numbers. That implicitness bugs me as I have
> always thought of format specs just leading to a call to format(). I
> think I can live with it, though, as long as it is **consistently**
> applied across the board for bytes.format(): every use of a format spec
> leads to calling ``format(thing, format_spec).encode('ascii',
> 'strict')`` no matter what type 'thing' would be, and it is clearly
> documented that this is done to ease porting and handle the common case.

This is how my byteformat function works, except that when no
format_spec is given, bytes and bytearray objects are left unchanged
rather than being decoded and encoded again.

> This even gives people in-place ASCII encoding for strings by always
> using '{:s}' with text which they can do when they port their code to
> run under both Python 2 and 3. So you should be able to do
> ``b'Content-Type: {:s}'.format('image/jpeg')`` and have it give ASCII.
> If you want more explicit encoding to latin-1 then you need to do it
> explicitly and not rely on the mini-language to do tricks for you.
>
> IOW I want to treat the format mini-language as a language and thus not
> have any special-casing or massive shifts in meaning between
> str.format() and bytes.format() so my mental model doesn't have to
> contort based on whether it's str or bytes. My preference is to not have
> any, but if Guido is going to say PBP here then I want absolute
> consistency across the board in how bytes.format() tweaks things.
>
> As for %s for the % operator calling ascii(), I think that will be a
> porting nightmare of finding out why your bytes suddenly stopped being
> formatted properly and then having to crawl through all of your code for
> that one use of %s which is getting bytes in. By raising a TypeError you
> will very easily detect where your screw-up occurred thanks to the
> traceback; doing otherwise feels too much like implicit type conversion,
> and ask any JavaScript developer how that can be a bad thing.

I personally would not add 'bytes % whatever'.

--
Terry Jan Reedy
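
To make the rule discussed above concrete, here is a minimal sketch of
how a single replacement field could be rendered under it. The name
format_field is hypothetical and chosen only for illustration; this is
not the actual byteformat() code, and it assumes that "left unchanged"
means the raw bytes are passed through as-is.

```python
def format_field(thing, format_spec=""):
    """Hypothetical helper: render one replacement field as ASCII bytes.

    Assumption: with no format_spec, bytes and bytearray values are passed
    through without a decode/encode round trip. Everything else goes through
    format(thing, format_spec).encode('ascii', 'strict').
    """
    if not format_spec and isinstance(thing, (bytes, bytearray)):
        # bytes() only normalizes a bytearray to bytes; the content is untouched.
        return bytes(thing)
    # The consistent rule from the discussion: format first, then strict ASCII.
    return format(thing, format_spec).encode("ascii", "strict")


# The str-side mental model: a format spec is just a call to format().
assert "{:d}".format(42) == format(42, "d")

print(format_field(42, "d"))            # b'42'
print(format_field("image/jpeg", "s"))  # b'image/jpeg'
print(format_field(b"\xff\x00"))        # b'\xff\x00'
```

Under this sketch, text that is not pure ASCII raises UnicodeEncodeError
instead of being silently converted, which matches the preference above
for a loud failure with a traceback over implicit conversion.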