On 08/09/2011 03:46, Stephen J. Turnbull wrote:
> Glyph Lefkowitz writes:
>  > On Sep 7, 2011, at 10:26 AM, Stephen J. Turnbull wrote:
>  >
>  > > How about "title"?
>  >
>  > >>> 'content-length'.title()
>  > 'Content-Length'
>

Does anyone *actually* use .title() for this? (And why not just use the
correct casing in the string literal...)

Michael

>  > You might say that the protocol "has" to be case-insensitive so
>  > this is a silly frill:
>
> Not me, sir.  My whole point about the "bytes should be more like str"
> controversy is the dual of that: you don't know what will be coming at
> you, so the regularities and (normally allowable) fuzziness of text
> processing are inadmissible.
>
>  > there are definitely enough case-sensitive crappy bits of network
>  > middleware out there that this function is critically important for
>  > an HTTP server.
>
> "Critically important" is surely an overstatement.  You could always
> title-case the literal strings containing field names in the source.
>
> The problem with having lots of str-like features on bytes is that you
> lose TOOWDTI, or worse, to many performance-happy coders, use of bytes
> becomes TOOWDTI "because none of the characters[sic] I'm planning to
> process myself are non-ASCII".  This is the road to Babel; it's
> workable for one-off scripts but it's asking for long-term trouble in
> multi-module applications.  The choice of decoding to str and
> processing in that form should be made as attractive as possible.
>
> On the other hand, it is undeniably useful for protocol tokens to have
> mnemonic representations even in binary protocols.  Textual
> manipulations on those tokens should be convenient.
>
> It seems to me that what might be an improvement over the current
> situation (maybe for Py4k only, though) is for bytes and
> (PEP-393-style) str to share representation, and have a "cast" method
> which would convert from one to the other, validating that the range
> constraints on the representation are satisfied.  The problem I see is
> that this either sanctions the practice of using latin-1 as "ASCII
> plus anything", which is an unpleasant hack, or you'd need to check in
> text methods that nothing is done with non-ASCII values other than
> checks for set membership (including equality comparison, of course).
>
> OTOH, AFAICS, Antoine's claim that inserting a non-latin-1 character
> in a str that happens to contain only ASCII values would convert the
> representation to multioctets (true), and therefore this doesn't give
> the desired efficiency properties, is beside the point.  Just don't do
> that!  You *can't* do that in a bytes object, anyway; use of str in
> this way is a "consenting adults" issue.  You trade off the
> convenience of the full suite of text tools vs. the possibility that
> somebody might insert such a character -- but for the algorithms
> they're going to be using, they shouldn't be doing that anyway.
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk
>

-- 
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html
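
For anyone following the thread, here is a minimal sketch of the operations under discussion: title-casing a header field name as str and as bytes, and the two ways of getting from bytes to str that the quoted text contrasts (strict ASCII versus the latin-1 "ASCII plus anything" hack). The helper names are hypothetical and are not from any message in the thread; only built-in str and bytes methods are used.

```python
# Illustrative only: helper names are made up for this sketch.

def canonical_header_name(name: str) -> str:
    """Title-case a header field name held as text."""
    return name.title()


def canonical_header_name_bytes(name: bytes) -> bytes:
    """Same operation on bytes; bytes.title() only changes ASCII letters."""
    return name.title()


def decode_token_strict(raw: bytes) -> str:
    """Decode a protocol token, refusing anything outside ASCII."""
    return raw.decode("ascii")  # raises UnicodeDecodeError on bytes >= 0x80


def decode_token_latin1(raw: bytes) -> str:
    """The 'ASCII plus anything' approach: each byte maps to one code point."""
    return raw.decode("latin-1")


if __name__ == "__main__":
    print(canonical_header_name("content-length"))
    print(canonical_header_name_bytes(b"content-length"))
    # .title() knows nothing about HTTP conventions such as "ETag".
    print(canonical_header_name("etag"))
```

Note that .title() yields 'Etag' for 'etag', so it cannot reproduce every conventional header spelling, which is one argument for writing the correct casing in the literal. The latin-1 decode never fails and round-trips every byte value, which is exactly why it works as a hack and why it hides non-ASCII input that the strict decode would reject.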