Showing content from https://mail.python.org/pipermail/python-dev/2011-March/109363.html below:

[Python-Dev] Finally switch urllib.parse to RFC3986 semantics?

Nick Coghlan ncoghlan at gmail.com
Wed Mar 16 13:02:21 CET 2011
On Tue, Mar 15, 2011 at 11:34 PM, Guido van Rossum <guido at python.org> wrote:
>
> Can you be specific? What is different between those RFCs?

I finally got around to trying to backport some of the additional
urljoin tests from http://bugs.python.org/issue1500504 (specifically,
the additional ones Mike Brown provided), but got tripped up by the
behavioural changes between the earlier RFCs and RFC 3986 regarding
the way ".." is handled.
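
To make the ".." difference concrete, here is a rough sketch of the
remove_dot_segments step from RFC 3986 section 5.2.4. The function name and
layout are mine, and this is not how urllib.parse implements it. The point is
that surplus ".." segments are dropped at the root, so "../../../g" against
"http://a/b/c/d;p?q" resolves to "http://a/g", whereas the RFC 1808 examples
keep the excess and give "http://a/../g".

```python
def remove_dot_segments(path):
    """Sketch of RFC 3986 section 5.2.4; illustrative, not urllib.parse code."""
    output = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]            # "/./x" becomes "/x"
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]            # "/../x" becomes "/x"
            if output:
                output.pop()           # drop the previous segment, if any
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/", if present)
            # from the input to the output.
            start = 1 if path.startswith("/") else 0
            idx = path.find("/", start)
            if idx == -1:
                segment, path = path, ""
            else:
                segment, path = path[:idx], path[idx:]
            output.append(segment)
    return "".join(output)


# Base path "/b/c/d;p" merged with the reference "../../../g":
print(remove_dot_segments("/b/c/../../../g"))   # /g
```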

Even in test_urlparse, a bunch of the normative tests from RFC 3986
are commented out because they fail (by design) when run through
urllib.parse.urljoin. Some of the additional tests also fail because
our urljoin implementation has a whitelist of schemes that support
relative references, whereas RFC 3986 expects relative references to work
for unknown schemes as well.
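
As a small illustration of the whitelist, the sketch below checks a scheme
against urllib.parse.uses_relative. That list is an undocumented
implementation detail of the module, so treat its use here as an assumption
rather than a supported API. The helper name and the "x-unknown-scheme" URL
are made up for the example.

```python
import urllib.parse


def supports_relative(scheme):
    """Report whether urljoin's whitelist includes the given scheme.

    Relies on urllib.parse.uses_relative, an undocumented module attribute.
    """
    return scheme in urllib.parse.uses_relative


print(supports_relative("http"))
print(supports_relative("x-unknown-scheme"))

# With a scheme outside the whitelist, urljoin does not apply the generic
# RFC 3986 resolution; compare this against the "http" case by hand.
print(urllib.parse.urljoin("x-unknown-scheme://a/b/c/d", "g"))
print(urllib.parse.urljoin("http://a/b/c/d", "g"))
```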

There are actually quite a few terminology changes as well (as
Senthil pointed out in his email), but it was specifically the failing
test cases for urljoin semantics that bit me again yesterday.

The problem is that it is quite a lot of work to get fully general URI
parsing to work correctly, but the overlap with legacy URL parsing is
large enough that many (most?) use cases in practice work just fine
with the older RFC semantics.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia