Showing content from http://mail.python.org/pipermail/python-dev/attachments/20140113/f5fa256d/attachment.html below:

<html>
 <head>
 <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
 </head>
 <body bgcolor="#FFFFFF" text="#330033">
 <div class="moz-cite-prefix">On 1/13/2014 9:25 PM, Nick Coghlan
 wrote: 
 </div>
 <blockquote
cite="mid:CADiSq7e=kvhm5GCX-EJniezkuPw=dPBVG8jG3rSsGm2uK5TOLQ@mail.gmail.com"
 type="cite">
 <pre wrap="">since this observation makes it clear that there's *no* coherent way
to offer a pure binary interpolation API - the only general purpose
combination mechanism for segments of binary data that can avoid
making assumptions about the encodings of metacharacters is simple
concatenation.</pre>
 </blockquote>
 That's almost true, and I'm glad that you, Guido, and all of us can
 understand that the currently defined python2 and python3 formatting
 syntaxes contain an inherent ASCII assumption, just like many
 internet protocols. The bitter fight is over :) 
 
 However, your statement above isn't 100% accurate, so just for the
 pedantry of it, I'll point out why. A mechanism could be defined
 where the format string would contain only format specifications, and
 any other text would be considered an error. The format string could
 have an explicit or a defined encoding, so there would be no need to
 make an assumption about its encoding. And since it would contain no
 text except for format specifications, it would be used only
 as a rule-book on how to interpret the parameters, contributing no
 text of its own to the result. 
 
 This wouldn't solve the problem at hand, though, which is to provide
 a nice migration path from Python 2 to Python 3 for code that uses
 ASCII-based format strings that do contribute text as well as
 include parameter data. 
 
 Whether such a technique would be more useful than simple
 concatenation (or complex concatenation such as join) remains to be
 seen, and possibly discussed, if anyone is interested, but it
 probably would belong on python-ideas, since it would not address an
 immediate porting issue. 
 
 Assuming an ASCII-in-bytes format string (but with no contributed
 text to the result) one could write something like 
 
 b"%{koi7}s%{00}v%{big5}d%{00}v%{ShiftJIS}s%{0000}v%b" / ( cyrillic,
 len( blob ), japanese, blob ) 
 
 So the encodings to be applied to each of the input parameters could
 be explicitly specified. 
 
 The %{00}v specifiers would insert literal bytes into the output; in
 the format string those bytes are written in ASCII as hex, two
 characters per byte, so the example puts a null byte or two between
 items in the output. Note that the number would use Chinese digits in
 the big5 encoding, although I don't know whether the Chinese use
 their own digits or ASCII ones these days, or what base they use. (I
 guess it was the Babylonians who used base 60, from which our
 timekeeping and angular measures were derived.) 
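 
 As a rough sketch only, the idea above might look something like the
 following in Python. Everything here is hypothetical: the function
 name binary_format, the exact specifier grammar, and the choice of a
 plain function rather than an operator are assumptions made for
 illustration. It also substitutes koi8_r for koi7 (which Python's
 codec registry does not provide) and writes numbers with ASCII digits
 passed through the named codec, rather than with native digits. 

```python
import re

# Hypothetical grammar: "%{arg}c" or "%c", where c is one of s, d, v, b.
_SPEC = re.compile(rb"%(?:\{([^}]*)\})?([sdvb])")


def binary_format(spec: bytes, args: tuple) -> bytes:
    """Build bytes from args; the spec contributes no text of its own."""
    out = []
    values = iter(args)
    pos = 0
    for m in _SPEC.finditer(spec):
        if m.start() != pos:
            # Anything between specifiers is literal text, which is an error.
            raise ValueError("format may contain only format specifications")
        pos = m.end()
        arg = m.group(1).decode("ascii") if m.group(1) else None
        kind = m.group(2)
        if kind == b"v":
            # Literal bytes, written in the spec as hex (two chars per byte).
            out.append(bytes.fromhex(arg or ""))
        elif kind == b"b":
            # Raw binary data; memoryview rejects str and int arguments.
            out.append(bytes(memoryview(next(values))))
        else:
            if arg is None:
                raise ValueError("%s and %d need an explicit encoding here")
            # str() leaves a str unchanged and renders an int in decimal.
            out.append(str(next(values)).encode(arg))
    if pos != len(spec):
        raise ValueError("format may contain only format specifications")
    return b"".join(out)


cyrillic, japanese, blob = "да", "日本", b"\x01\x02\xff"
result = binary_format(
    b"%{koi8_r}s%{00}v%{big5}d%{00}v%{shift_jis}s%{0000}v%b",
    (cyrillic, len(blob), japanese, blob),
)
```

 The result is the same as encoding each piece by hand and
 concatenating, which is the point: the format string acts purely as a
 rule-book for the parameters. 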
 
 So there _could be_ a coherent way to offer an interpolation
 mechanism that is pure binary, and allows selection of encoding of
 str data, if and as needed. One specifier could even be an encoding
 to apply to any format specifiers that don't include an encoding, so
 in the typical case of dealing with a single language output, the
 appropriate encoding could be set at the beginning of the format
 specification and overridden by particular specifiers if need be.
 But while there _could be_ such an interpolation mechanism, it isn't
 compatible with Python 2, and the jury hasn't decided whether such a
 thing is sufficiently more useful than concatenation to be worth
 implementing. A different operator might be required, or the whole
 thing could be a function instead of an operator, with a similar
 format specification, or one more like the minilanguage used with
 format in Python 3. 
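 
 To make the default-encoding idea concrete, here is a minimal sketch
 written as a function rather than an operator. The %{...}e specifier
 letter and the name binary_format_with_default are invented for this
 illustration; nothing like them exists in Python. 

```python
import re

# Hypothetical grammar: "e" sets the default encoding for later s/d specifiers.
_SPEC = re.compile(rb"%(?:\{([^}]*)\})?([esdvb])")


def binary_format_with_default(spec: bytes, *args) -> bytes:
    default = None  # no encoding is assumed until the spec names one
    values = iter(args)
    out = []
    pos = 0
    for m in _SPEC.finditer(spec):
        if m.start() != pos:
            raise ValueError("format may contain only format specifications")
        pos = m.end()
        arg = m.group(1).decode("ascii") if m.group(1) else None
        kind = m.group(2)
        if kind == b"e":
            if arg is None:
                raise ValueError("%e needs an encoding name")
            default = arg  # consumes no argument, emits no output
        elif kind == b"v":
            out.append(bytes.fromhex(arg or ""))  # literal bytes given as hex
        elif kind == b"b":
            out.append(bytes(memoryview(next(values))))  # raw binary data
        else:
            encoding = arg or default  # a per-specifier encoding overrides
            if encoding is None:
                raise ValueError("no encoding given and no default set")
            out.append(str(next(values)).encode(encoding))
    if pos != len(spec):
        raise ValueError("format may contain only format specifications")
    return b"".join(out)


packet = binary_format_with_default(
    b"%{utf-8}e%s%{00}v%d%{00}v%{shift_jis}s", "héllo", 42, "日本"
)
```

 Here the first two parameters pick up the utf-8 default set at the
 start, while the last one overrides it with shift_jis. 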
 </body>
</html>
