Showing content from https://mail.python.org/pipermail/python-dev/2008-May/079227.html below:

[Python-Dev] urllib unicode handling

"Martin v. Löwis" martin at v.loewis.de
Wed May 7 21:06:00 CEST 2008
> Maybe I didn't understand the RFC quite right, but it seemed like how to
> handle hostnames was left as a choice between IDNA encoding the hostname
> or replacing the non-ascii characters with dashes? I guess in practice
> IDNA is the right decision.

I haven't fully understood it, either, but I think that's the right
conclusion. People want to fetch the resource, then, and encoding the
host name in UTF-8 won't do much good.
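To illustrate the difference between the two encodings, here is a minimal sketch using Python's built-in "idna" codec (IDNA 2003). The helper name idna_host and the sample host name are made up for illustration, and this is not what urllib itself does.

```python
def idna_host(host):
    """Return the ASCII (IDNA / Punycode) form of a Unicode host name.

    Hypothetical helper for illustration; relies on the stdlib "idna" codec.
    """
    return host.encode("idna").decode("ascii")


host = "bücher.example"  # made-up host name

# The IDNA form is plain ASCII, so a resolver can look it up.
ascii_host = idna_host(host)

# The UTF-8 form still contains non-ASCII octets.
utf8_host = host.encode("utf-8")

print(ascii_host)
print(utf8_host)
```

The IDNA result contains only ASCII labels, while the UTF-8 bytes keep the high octets, which is why the latter is unlikely to help with fetching the resource.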

> Seems like the other somewhat under-specified part of all of this is how
> urllib.unquote() should work. If after percent decoding it sees
> non-ascii octets, should it try to decode them as utf-8 and if that
> fails then leave them as is?

That's why I think that using IRIs should be a separate feature,
perhaps a separate module entirely.
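The fallback described in the quoted question could be sketched roughly as follows. This assumes Python 3's urllib.parse.unquote_to_bytes, which postdates this thread, and the function name unquote_utf8_or_raw is hypothetical. It only shows one possible reading of "leave them as is" (returning the undecoded octets), not a settled behaviour for urllib.unquote().

```python
from urllib.parse import unquote_to_bytes


def unquote_utf8_or_raw(s):
    """Percent-decode s, then try to interpret the octets as UTF-8.

    Hypothetical helper: returns str when the octets are valid UTF-8,
    otherwise returns the raw bytes unchanged.
    """
    raw = unquote_to_bytes(s)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: leave the octets as they are.
        return raw


print(repr(unquote_utf8_or_raw("caf%C3%A9")))  # valid UTF-8 sequence
print(repr(unquote_utf8_or_raw("caf%E9")))     # Latin-1 octet, invalid as UTF-8
```

Note that the return type depends on the input data, which is one reason such guessing is awkward to put into a general-purpose function.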

Regards,
Martin