Tres Seaver <tseaver at palladion.com> wrote:

> Stephen J. Turnbull wrote:
>
> > We do need str-based implementations of modules like urllib.
>
> Why would that be?  URLs aren't text, and never will be.  The fact that
> to the eye they may seem to be text-ish doesn't make them text.  This

URLs are exactly text (strings, representable as Unicode strings in
Py3K), and were designed as such from the start.  The fact that some of
the things tunneled or carried in URLs are string representations of
non-string data shouldn't obscure that point.  They're not "text-ish",
they're text.  They're not opaque, either; they break down in
well-specified ways, mainly into strings.

The trouble comes in when we try to go beyond the spec, or handle things
that don't conform to the spec.  Sure, a path component of a URI might
actually be a %-escaped sequence of arbitrary bytes, even bytes that
don't represent a string in any known encoding, but that's only *after*
reversing the %-escapes, which should happen in a scheme-specific piece
of code, not in generic URL parsing or manipulation.

Bill
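
A rough sketch of the split described above, assuming Python 3's
urllib.parse: the generic step works purely on strings, and only a
separate, scheme-specific step reverses the %-escapes into bytes.  The
helper names split_url and path_bytes and the example URL are made up
for illustration; they are not part of any library.

```python
from urllib.parse import urlsplit, unquote_to_bytes


def split_url(url: str) -> tuple[str, str, str]:
    """Generic step: break the URL text into string components.

    No %-escapes are reversed here; every part stays a str.
    """
    parts = urlsplit(url)
    return parts.scheme, parts.netloc, parts.path


def path_bytes(path: str) -> bytes:
    """Scheme-specific step (hypothetical helper): reverse the %-escapes.

    The result is bytes, because the escaped data need not be text.
    """
    return unquote_to_bytes(path)


scheme, host, path = split_url("http://example.com/files/%FF%FEdata")
raw = path_bytes(path)  # contains bytes 0xFF 0xFE, which are not valid UTF-8
```

Everything up to the call to path_bytes is ordinary string handling;
whether the resulting bytes mean anything as text is left to the code
that knows the scheme.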