On 01/12/2014 09:26 AM, Paul Moore wrote:
> On 12 January 2014 17:03, Ethan Furman <ethan at stoneleaf.us> wrote:
>> We know full well the difference between unicode and bytes, and we know full
>> well that numbers and much of the text we need has an ASCII (bytes!)
>> representation.  When we do a b'Content Length: %d' % len(binary_data) we
>> are expecting to get back a bytes object, /not/ a unicode object.
>
> What I am struggling to understand here is what room for compromise
> there is. Clearly, for whatever reason,
>
>     b'Content Length: ' + str(len(binary_data)).encode('ascii')
>
> is not acceptable for you. OK, fair enough. Also, apparently, writing a helper
>
>     def int_to_bytes(n):
>         return str(n).encode('ascii')
>
>     b'Content Length: ' + int_to_bytes(len(binary_data))
>
> is unacceptable. But I'm not clear why it's unacceptable. Maybe I
> missed the explanation - God knows, the thread is long enough :-)

True enough!  ;)

It's unacceptable in the sense that the bytes type is /almost/ there, it's /almost/ what is needed to handle the boundary conditions.  We have a __bytes__ method (how is it supposed to be used?) that could be made to fit the interpolation bill.

It seems to me the core of Nick's refusal is the rejection (and I agree!) of bytes interpolation returning unicode -- but that's not what I'm asking for!  I'm asking for it to return bytes, with the interpolated data (in the case of %d, %s, etc.) being strictly-ASCII encoded.

> On the other hand, Nick has explained why b'Content Length: %d' %
> len(binary_data) is unacceptable to him (you don't have to agree with
> his opinion, just concede that he has explained his position in a way
> that you understand).

Only because he (or Benno) finally wrote some tests and I was able to see what he thought I was wanting.  Which does seem to leave a *tiny* bit of wiggle room if bytes interpolation always returns bytes, and never unicode (yeah, I know, snowball's chance and all that).

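To make the two positions concrete, here is a minimal sketch of the helper-function approach Paul describes.  It shows the result staying bytes from start to finish, with the number rendered as strictly-ASCII digits.  The int_to_bytes helper is taken from Paul's message; content_length_header and the sample payload are hypothetical names added only for illustration.

```python
def int_to_bytes(n):
    # Render the integer as text, then encode it as strictly-ASCII bytes.
    return str(n).encode('ascii')


def content_length_header(binary_data):
    # Hypothetical wrapper: bytes literal + bytes from the helper,
    # so no unicode object is ever produced.
    return b'Content Length: ' + int_to_bytes(len(binary_data))


# Sample payload, assumed for illustration only.
binary_data = b'\x00\x01\x02\x03\x04'
header = content_length_header(binary_data)

# The result is a bytes object on Python 3 (and a byte str on Python 2).
assert isinstance(header, bytes)
assert header == b'Content Length: 5'
```

The request in this message is for the % spelling to yield the same bytes value; the helper gets there at the cost of an extra call for every interpolated number.
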
> I'm not trying to argue you're wrong - I don't know your codebase, nor
> do I know your application area. But surely somewhere between "we must
> have % formatting including %d for bytes" and the above, there's a
> middle ground that you *are* willing to accept? Can you give any
> indications of what that might be? What, specifically, about the
> helper function is the problem? I don't think it is any less space
> efficient, it doesn't double-encode, and I don't think it's more
> difficult to understand (although it is a little longer, it trades
> that off against being a bit more explicit as to what's going on).
> Surely you're not arguing that your code must work unchanged (not
> "there's a way of writing the code so it works on Python 2 and 3", but
> "the code you currently have for Python 2 must work with no changes at
> all")?

I'm arguing from three PoVs:

   1) 2 & 3 compatible code base

   2) having the bytes type /be/ the boundary type

   3) readable code

> Can you give an example of code that is *nearly* acceptable to you,
> which works in Python 2 and 3 today, and explain what improvements you
> would like to see to it in order to use it instead of waiting for a
> core change?

I'm not trying to be difficult (just naturally good at it, I guess ;) ), but I don't see a lot of room for compromises -- I would like % interpolation, I'm told I have to use a helper function.  I will if I have to, but first I have to try and make myself understood, and I'm not sure that has happened yet.

Following Nick's example I'm writing up some tests that clearly show what I would like to see.  Then at least we can debate what I'm actually asking for, and not the (understandably) unicode-what-a-mess-we-had-in-py2k-don't-want-again that some think I am asking for.

--
~Ethan~