On Tuesday, 13 September 2005 at 17:56 +0900, Hye-Shik Chang wrote:
> On 9/11/05, Victor STINNER <victor.stinner-linux at haypocalc.com> wrote:
> > Hi,
> >
> > I found a bug in Python interactive command line (program python alone:
> > looks to be code.interact() function in code.py). With UTF-8 locale, the
> > command << u"é" >> returns << u'\xc3\xa9' >> and not << u'\xE9' >>.
> > Remember: the french e with acute is Unicode 233 (0xE9), encoded \xC3
> > \xA9 in UTF-8.
>
> Which version of python do you use? From 2.4, the interactive mode
> respects locale as a source code encoding and it falls back to latin-1
> when decoding fails.
>
> Python 2.4.1 (#2, Jul 31 2005, 04:45:53)
> [GCC 3.4.2 [FreeBSD] 20040728] on freebsd5
> Type "help", "copyright", "credits" or "license" for more information.
> >>> u"é"
> u'\xe9'

I installed my own Python 2.4 in /opt/python/. I don't know if the right
code.py is loaded, but here is the output:

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
$ ./python2.4
Python 2.4.1 (#1, Sep 11 2005, 01:37:26)
[GCC 4.0.2 20050821 (prerelease) (Debian 4.0.1-6)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> u"é"
u'\xe9'
>>> import code
>>> code.interact()
Python 2.4.1 (#1, Sep 11 2005, 01:37:26)
[GCC 4.0.2 20050821 (prerelease) (Debian 4.0.1-6)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> u"é"
u'\xc3\xa9'
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Well, that works better :-) So the plain interactive prompt is fine in 2.4,
but code.interact() still returns the wrong result. For code.interact(),
you can read my attached patch. I don't know if it is the best way to fix
the bug.
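
To make the difference between the two results concrete: u'\xe9' is what you get when the two UTF-8 bytes are decoded as UTF-8, and u'\xc3\xa9' is what you get when each byte is taken as one character (which is what a latin-1 decode does). The snippet below is only a sketch of that mechanism, written for Python 3 rather than the Python 2.4 used in the session, and the variable names are made up for illustration. It does not show what code.py does internally.

```python
# Python 3 sketch; the session above was recorded with Python 2.4.
raw = "\u00e9".encode("utf-8")      # b'\xc3\xa9', the bytes a UTF-8 terminal sends

as_utf8 = raw.decode("utf-8")       # right charset: one code point, U+00E9
as_latin1 = raw.decode("latin-1")   # wrong charset: every byte becomes a character

print(ascii(as_utf8), len(as_utf8))
print(ascii(as_latin1), len(as_latin1))
```

The second value has two characters, U+00C3 and U+00A9, which matches the u'\xc3\xa9' shown by code.interact().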
However, the following code still misbehaves in Python 2.4:

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
$ cat python_unicode_eval_bug.py
#*- coding: UTF-8 -*-
print "One Unicode character: %u" % len(u"é")
print "One Unicode character (using eval) : %u" % eval('len(u"é")')
$ python2.4 python_unicode_eval_bug.py
One Unicode character: 1
One Unicode character (using eval) : 2
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

RexFi explained to me that Python can't guess the charset of the string
passed to eval('len(u"é")'). Yep, that's difficult: should it use the
locale, or the encoding declared in the source file? So this test doesn't
really matter.

Bye,
Haypo
-------------- next part --------------
A non-text attachment was scrubbed...
Name: code-interact.patch
Type: text/x-patch
Size: 407 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20050913/c620d813/code-interact.bin
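
A note on the eval() case above: the expression reaches eval() as bytes with no charset attached, so the caller has to supply one. The sketch below is Python 3, not the Python 2.4 from the report, and eval_with_encoding is a hypothetical helper invented for this illustration, not an existing Python function. It shows that the same bytes give 2 or 1 depending on the charset used to decode them.

```python
def eval_with_encoding(source_bytes, encoding):
    """Hypothetical helper: decode the expression explicitly, then evaluate it."""
    return eval(source_bytes.decode(encoding))


# The expression len("é") as it is stored in a UTF-8 encoded file.
source_bytes = 'len("\u00e9")'.encode("utf-8")

wrong = eval_with_encoding(source_bytes, "latin-1")  # each byte read as a character
right = eval_with_encoding(source_bytes, "utf-8")    # charset the file declared

print(wrong, right)
```

With latin-1 the literal contains two characters, which reproduces the "2" in the output above. With UTF-8 it contains one.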