I'd like to query for the common opinion on an issue which I've run into when trying to resynchronize unicode() and str() in terms of what happens when you pass arbitrary objects to these constructors which happen to implement tp_str (or __str__ for instances).

Currently, str() will accept any object which supports the tp_str interface and reverts to tp_repr in case that slot is not available.

Before yesterday's checkins, unicode() supported strings, character buffers and instances having a __str__ method.

Now the goal of the checkins was to make str() and unicode() behave in a more compatible fashion. Both should accept the same kinds of objects and raise exceptions for all others.

The path I chose was to fix PyUnicode_FromEncodedObject() to also accept tp_str compatible objects. This API is used by the unicode_new() constructor (which is exposed as unicode() in Python) to create a Unicode object from the input object.

str() OTOH uses PyObject_Str() via string_new().

Now there also is a PyObject_Unicode() API which tries to mimic PyObject_Str(). However, it does not support the additional encoding and errors arguments which the unicode() constructor has.

The problem which Guido raised about my checkins was that the changes to PyUnicode_FromEncodedObject() are seen not only in unicode(), but also in all other places where this API is used.

OTOH, PyUnicode_FromEncodedObject() is the most generic constructor for Unicode objects there currently is in Python.

So the questions are:

- should I revert the change in PyUnicode_FromEncodedObject() and instead extend PyObject_Unicode() to support encodings ?

- should we make PyObject_Unicode() use PyUnicode_FromEncodedObject() instead of providing its own implementation ?

The overall picture of all this auto-conversion stuff going on in str() and unicode() is very confusing.
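
For readers who want to see the two conversion paths side by side, here is a rough sketch. It is written in present-day Python rather than against the C API, so it only models the behaviour described above and is not the actual implementation. The names ReprOnly, WithStr and unicode_like() are made up for illustration, and the final fallback to str() stands in for the tp_str extension under discussion.

```python
class WithStr:
    """Object that provides its own string conversion (the tp_str case)."""

    def __str__(self):
        return "from __str__"

    def __repr__(self):
        return "from __repr__"


class ReprOnly:
    """Object without its own __str__; str() falls back to the repr."""

    def __repr__(self):
        return "from __repr__"


def unicode_like(obj, encoding="ascii", errors="strict"):
    """Hypothetical model of a unicode()-style constructor.

    Assumption: text passes through, character buffers are decoded with
    the given encoding/errors, and anything else goes through the
    object's string conversion (the extension being discussed).
    """
    if isinstance(obj, str):
        return obj
    if isinstance(obj, (bytes, bytearray, memoryview)):
        return bytes(obj).decode(encoding, errors)
    return str(obj)


# str() prefers __str__ and reverts to __repr__ when it is missing.
print(str(WithStr()))
print(str(ReprOnly()))

# The unicode()-style path additionally honours encoding and errors.
print(unicode_like(b"caf\xc3\xa9", "utf-8"))
print(unicode_like(ReprOnly()))
```

Note that in this model the encoding and errors arguments only matter for buffer input; whether they should also apply to the result of tp_str is part of what needs to be agreed on.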
Perhaps what we really need is first to agree on a common understanding of which auto-conversion should take place and then make str() and unicode() support exactly the same interface ?!

PS: Also see patch #446754 by Walter Dörwald:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=446754&group_id=5470

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company: http://www.egenix.com/
Python Software: http://www.lemburg.com/python/