Showing content from https://mail.python.org/pipermail/python-dev/2006-July/067285.html below:

[Python-Dev] urllib.quote and unicode bug resuscitation attempt

"Martin v. Löwis" martin at v.loewis.de
Tue Jul 11 23:16:21 CEST 2006
Stefan Rank wrote:
> I suggest to add (after 2.5 I assume) one of the following to the 
> beginning of urllib.quote to either fail early and consistently on 
> unicode arguments and improve the error message::
> 
>    if isinstance(s, unicode):
>        raise TypeError("quote needs a byte string argument, not unicode,"
>                        " use `argument.encode('utf-8')` first.")
> 
> or to do The Right Thing (tm), which is utf-8 encoding::

The right thing to do is IRIs. This is more complicated than encoding
the Unicode string as UTF-8, though: for the host part of the URL, you
have to encode it with IDNA (and there are additional complicated rules
in place, e.g. when the Unicode string already contains %).
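
To make the split concrete, here is a rough sketch in present-day Python 3 (not the 2.5-era urllib under discussion). The helper name `iri_to_uri` is hypothetical and not part of the standard library. It assumes the stdlib `idna` codec (IDNA 2003) is good enough for the host, drops any userinfo, and simply treats `%` as safe so that existing escapes are not encoded twice; that last shortcut is precisely the kind of rule a complete fix would have to handle more carefully.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def iri_to_uri(iri):
    """Rough IRI-to-URI conversion (hypothetical helper, illustration only)."""
    parts = urlsplit(iri)

    # Host part: IDNA, not UTF-8 percent-encoding.
    host = parts.hostname or ""
    netloc = host.encode("idna").decode("ascii") if host else ""
    if parts.port is not None:
        netloc += ":%d" % parts.port

    # Remaining parts: encode as UTF-8, then percent-encode.
    # "%" is listed as safe so already-escaped octets pass through unchanged;
    # a stray "%" that is not a valid escape is NOT repaired here.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    fragment = quote(parts.fragment, safe="%")

    return urlunsplit((parts.scheme, netloc, path, query, fragment))


print(iri_to_uri("http://bücher.example/caf%C3%A9/ü?q=ä"))
```

The host and the path go through different encodings, which is why a blanket `s.encode('utf-8')` inside quote does not solve the whole problem.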

Contributions are welcome, as long as they fix this entire issue "for
good" (i.e. in all URL-processing code, and considering all relevant
RFCs).

Regards,
Martin
