Michael Hudson wrote:
>
> Well, we have the first 2.2 bugfix that isn't a no-brainer to port to
> 2.2.1.  This is to do with the
>
> [ #495401 ] Build troubles: --with-pymalloc
>
> bug.
>
> As far as I understand it, there were two problems.
>
> 1) with wide unicode characters, some function in unicodeobject.c to
>    do with interpreting escape codes could write into memory it didn't
>    own.
>
> 2) something to do with the handling of "unpaired high surrogates" in
>    the utf-8 codec.
>
> Were these problems related?  I think they got fixed at the same time,
> but I may have gotten confused.

Right. 1) was caused by 2). Both are fixed now.

> 1) shouldn't be too much of an issue to get into 2.2.1 (there was some
>    contention about which fix performed better, but for 2.2.1 I don't
>    care too much).
>
> 2) is more troublesome, because to fix it properly breaks .pycs, in
>    turn because marshal uses the utf-8 codec to store unicode string
>    constants, and this is a no-no according to PEP 6.
>
> Is it possible to worm around 2) by reconstructing valid strings from
> the bad marshal data, or has information been lost?  How severe is the
> bug?  Maybe it would be best to leave it unfixed in 2.2.1.

Well, I posted a message to python-dev or the checkins list
about this (I don't remember which). The situation is basically like this:

In Python <= 2.2.0, you could write

    u = u"\uD800"

in a .py file. The first time you import this file, Python
will create a .pyc file for it using the broken UTF-8 encoding.
The import will succeed.

The second time you import the module, Python will try to
use the .pyc file. Now reading that file in fails with a
UnicodeError and Python also does not revert to the .py
file.

As a result, modules using unpaired surrogates in Unicode
literals are simply broken in Python <= 2.2.0.

The problem with backporting this patch is that in order for
Python to properly recompile any broken module, the magic number
will have to be changed. The question is whether this is a reasonable
thing to do in a patch level release...

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
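
The write-succeeds, read-fails asymmetry described in this message can be sketched in present-day Python 3. This is only an analogy, not the 2.2 code path: it assumes that the modern `surrogatepass` error handler stands in for the old lenient encoder and that the strict `utf-8` codec stands in for the decoder that rejected the stored bytes. The helper names are made up for illustration.

```python
import marshal

# The unpaired high surrogate from the u"\uD800" example in the message.
LONE_SURROGATE = "\ud800"


def strict_encode_fails(text):
    """Return True if the strict UTF-8 codec refuses to encode `text`."""
    try:
        text.encode("utf-8")
    except UnicodeEncodeError:
        return True
    return False


def lenient_encode(text):
    """Encode `text`, letting lone surrogates through (assumed stand-in
    for the lenient encoder that wrote the .pyc)."""
    return text.encode("utf-8", "surrogatepass")


def strict_decode_fails(data):
    """Return True if the strict UTF-8 codec refuses to decode `data`."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


if __name__ == "__main__":
    stored = lenient_encode(LONE_SURROGATE)
    print("bytes written:", stored)
    print("strict decode fails:", strict_decode_fails(stored))
    # Current marshal round-trips the same string without error.
    print("marshal round trip ok:",
          marshal.loads(marshal.dumps(LONE_SURROGATE)) == LONE_SURROGATE)
```

A lenient writer paired with a strict reader reproduces the shape of the bug: the bytes can be produced but not read back. Current CPython avoids this because marshal encodes and decodes with the same lenient handling.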