Showing content from http://mail.python.org/pipermail/python-dev/2008-May/079383.html below:

[Python-Dev] Copying cgi.parse_qs() to the urllib.parse module

Tom Pinckney thomaspinckney3 at gmail.com
Mon May 12 22:58:47 CEST 2008
Is there any thought of extending escape / unescape to handle, by
default, characters other than <, >, and &? At a minimum it
should handle arbitrary &xxx; values. Ideally, it would also handle
other common symbolic names besides &lt; &gt; etc.

HTML from common web sites such as nytimes.com frequently has a  
variety of characters escaped.

Consider the page at http://travel.nytimes.com/travel/guides/europe/france/provence-and-the-french-riviera/overview.html

It lists its content type as:
content="text/html; charset=UTF-8"
And contains text like:
There&#146;s the C&ocirc;te d&#146;
Ideally, we would decode &#146; into ’ and &ocirc; into ô.
Unfortunately, &#146; is really an error -- 146 is not the code point
of a printable Unicode character but the MS codepage 1252 value for
the apostrophe (apparently many HTML editing systems intermingle
Unicode and codepage 1252 content for apostrophes and a few other
common characters).
I'm happy to contribute some additional code for these other cases if  
people agree it's useful.
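
To make the proposal concrete, here is a rough sketch of the kind of
lenient unescape I have in mind. The name unescape_lenient and the
regular expression are only illustrative assumptions, not an existing
API; the entity table is the standard name2codepoint mapping (from
html.entities in Python 3). Numeric references in the 128-159 range are
treated as codepage 1252 bytes, which is a guess about what the page
author meant rather than anything the HTML says.

```python
import re
from html.entities import name2codepoint

# Matches &#146;, &#x92; and named references such as &ocirc;
_REF = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def unescape_lenient(text):
    """Hypothetical helper: decode named and numeric character references."""

    def replace(match):
        body = match.group(1)
        if body.startswith("#"):
            # Numeric reference, decimal or hexadecimal.
            code = int(body[2:], 16) if body[1] in "xX" else int(body[1:])
            if 0x80 <= code <= 0x9F:
                # Assumption: values in this range are really cp1252 bytes.
                try:
                    return bytes([code]).decode("cp1252")
                except UnicodeDecodeError:
                    return match.group(0)  # undefined in cp1252, leave as-is
            try:
                return chr(code)
            except (ValueError, OverflowError):
                return match.group(0)  # out of range, leave as-is
        if body in name2codepoint:
            return chr(name2codepoint[body])
        return match.group(0)  # unknown name, leave as-is

    return _REF.sub(replace, text)


sample = unescape_lenient("There&#146;s the C&ocirc;te d&#146;")
print(sample)
```

Unknown names are left untouched so that a stray & in running text is
not mangled.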



On May 12, 2008, at 10:36 AM, Tony Nelson wrote:

> At 11:56 PM -0400 5/10/08, Fred Drake wrote:
>> On May 10, 2008, at 11:49 PM, Guido van Rossum wrote:
>>> Works for me. The other thing I always use from cgi is escape() --
>>> will that be available somewhere else too?
>>
>>
>> xml.sax.saxutils.escape() would be an appropriate replacement, though
>> the location is a little funky.
>
> At least it's right next to the valuable quoteattr().
> -- 
> ____________________________________________________________________
> TonyN.:'                       <mailto:tonynelson at georgeanelson.com>
>      '                              <http://www.georgeanelson.com/>
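
For reference, a small sketch of the two xml.sax.saxutils functions
mentioned in the quoted messages. The sample string is made up for
illustration. escape() only touches &, < and > unless given an extra
mapping, while quoteattr() also picks a quoting style and returns the
value already wrapped in quotes.

```python
from xml.sax.saxutils import escape, quoteattr

text = 'Tom & "Jerry" <cartoon>'  # made-up sample input

# escape() replaces &, < and > only; double quotes are left alone.
escaped = escape(text)

# Extra replacements can be passed as a mapping.
escaped_quotes = escape(text, {'"': "&quot;"})

# quoteattr() returns a complete, quoted attribute value.
attr = quoteattr(text)
simple_attr = quoteattr("a & b")

print(escaped)
print(escaped_quotes)
print(attr)
print(simple_attr)
```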
