Showing content from http://mail.python.org/pipermail/python-dev/2002-June/025436.html below:

[Python-Dev] unicode() and its error argument

Tim Peters tim.one@comcast.net
Sat, 15 Jun 2002 12:21:03 -0400


[Skip Montanaro]
> The unicode() builtin accepts an optional third argument, errors, which
> defaults to "strict".  According to the docs if errors is set to "ignore",
> decoding errors are silently ignored.  I seem to still get the occasional
> UnicodeError exception, however.  I'm still trying to track down an actual
> example (it doesn't happen often, and I hadn't wrapped unicode() in a
> try/except statement, so all I saw was the error raised, not the input
> string value).

Play with this:

"""
def generrors(encoding, errors, maxlen, maxtries):
    from random import choice, randint
    bytes = [chr(i) for i in range(256)]
    paste = ''.join
    for dummy in xrange(maxtries):
        n = randint(1, maxlen)
        raw = paste([choice(bytes) for dummy in range(n)])
        try:
            u = unicode(raw, encoding, errors)
        except UnicodeError, detail:
            print 'fail w/ errors', errors, '- raw data', repr(raw)
            print '    UnicodeError', str(detail)

errors = ('strict', 'replace', 'ignore')

generrors('mac-turkish', errors[2], 10, 1000)
"""

Plug in your favorite encoding and let it do the work of finding examples.
It generates plenty of errors with 'strict', but so far I haven't seen it
generate one with 'replace' or 'ignore'.
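For anyone trying the same experiment on Python 3, here is a rough sketch of a port. It is not part of the original script and rests on some assumptions: bytes.decode() stands in for the unicode() builtin, failures are collected in a list instead of printed, and the seed parameter is added here only to make runs repeatable. The 'utf-8' encoding is just an example; plug in your own.

```python
import random

def generrors(encoding, errors, maxlen, maxtries, seed=None):
    """Decode random byte strings; return the (raw, message) pairs that failed."""
    rng = random.Random(seed)  # seed is an addition of this sketch, for repeatability
    failures = []
    for _ in range(maxtries):
        n = rng.randint(1, maxlen)
        # Build a random byte string of length n from all 256 byte values.
        raw = bytes(rng.randrange(256) for _ in range(n))
        try:
            raw.decode(encoding, errors)
        except UnicodeError as detail:
            failures.append((raw, str(detail)))
    return failures

if __name__ == '__main__':
    for mode in ('strict', 'replace', 'ignore'):
        failures = generrors('utf-8', mode, 10, 1000, seed=0)
        print('errors =', mode, '- failures:', len(failures))
        for raw, detail in failures[:3]:
            print('    raw data', repr(raw), '- UnicodeError', detail)
```

Returning the failing inputs rather than printing them makes it easier to keep the offending raw data around, which is the piece that was missing from the original report.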
