Showing content from http://mail.python.org/pipermail/python-dev/2002-August/027336.html below:

[Python-Dev] Re: [ python-Patches-590682 ] New codecs: html, asciihtml

Fredrik Lundh fredrik@pythonware.com
Mon, 5 Aug 2002 15:57:10 +0200

Oren Tirosh wrote:

> In its current form I find htmlentitydefs.py pretty useless.

I use it a lot, and find it reasonably useful.  sure beats typing in
the HTML character tables myself, or writing a DTD parser.

> Names in the input in arbitrary case will not match the MixedCase
> keys in the entitydefs dictionary

people who use oddball characters may prefer to keep uppercase
letters separate from lowercase letters.  if I type "Linköping" using
a named entity, I don't want it to come out as "LinkÖping".

if you don't care, nothing stops you from using the "lower" string
method.
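
To illustrate the case-sensitivity point, here is a minimal sketch in modern Python 3. It assumes the `html.entities` module (the Python 3 name for `htmlentitydefs`) and its `name2codepoint` table, which did not exist when this message was written; the `lookup` helper and its `ignore_case` flag are hypothetical names used only for this example.

```python
from html.entities import name2codepoint


def lookup(name, ignore_case=False):
    """Return the character for an entity name, or None if it is unknown.

    Exact-case matches always win; the lowercase fallback is opt-in,
    so "Ouml" and "ouml" stay distinct by default.
    """
    if name in name2codepoint:
        return chr(name2codepoint[name])
    if ignore_case:
        lowered = name.lower()
        if lowered in name2codepoint:
            return chr(name2codepoint[lowered])
    return None


# "ouml" and "Ouml" are separate entries, so case carries meaning.
print(lookup("ouml"), lookup("Ouml"))
# An all-caps name only resolves when the caller opts in.
print(lookup("OUML"), lookup("OUML", ignore_case=True))
```

Keeping the exact-case lookup first means a caller who opts into the fallback still gets "Ö" for "Ouml"; only names with no exact match are lowercased.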

> and the decimal character reference isn't really more useful than
> the named entity reference.

really?  converting a decimal character reference to a unicode character
is trivial, but how do you convert a named entity reference to a unicode
character?  (look it up in the htmlentitydefs?)

here's a trivial piece of code that converts the entitydefs dictionary to
an entity->unicode mapping:

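    # Python 2: entitydefs maps names to Latin-1 bytes or "&#NNN;" strings
    from htmlentitydefs import entitydefs

    entitydefs_unicode = {}
    for entity, char in entitydefs.items():
        if char[:2] == "&#":
            # decimal character reference, e.g. "&#945;"
            char = unichr(int(char[2:-1]))
        else:
            # single Latin-1 byte
            char = unicode(char, "iso-8859-1")
        entitydefs_unicode[entity] = char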
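
For readers on Python 3, where `unichr` and `unicode` no longer exist, the same conversion might look roughly like the sketch below. The `to_unicode_map` function and the small `legacy_entitydefs` sample are hypothetical stand-ins for the 2002-era `entitydefs` table, whose values were either a single Latin-1 byte or a decimal character reference; they are not part of any library.

```python
def to_unicode_map(entitydefs):
    """Convert a legacy-style entity table to an entity -> str mapping.

    Values may be Latin-1 encoded bytes or "&#NNN;" decimal references.
    """
    result = {}
    for entity, value in entitydefs.items():
        if isinstance(value, bytes):
            # Single Latin-1 byte: decode it to the matching character.
            result[entity] = value.decode("iso-8859-1")
        elif value.startswith("&#"):
            # Decimal character reference: strip "&#" and ";" and convert.
            result[entity] = chr(int(value[2:-1]))
        else:
            # Already a plain string: keep it unchanged.
            result[entity] = value
    return result


# Hypothetical sample in the old format, for illustration only.
legacy_entitydefs = {
    "ouml": b"\xf6",
    "Ouml": b"\xd6",
    "alpha": "&#945;",
}

entitydefs_unicode = to_unicode_map(legacy_entitydefs)
print(entitydefs_unicode)
```

In current Python 3 the standard `html.entities.entitydefs` table already holds `str` values, so a conversion like this is only needed for data kept in the older format.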

</F>
