> I hope you don't mind that i'm taking this over to python-dev,
> because it led me to discover a more general issue (see below).

No -- in fact I wanted to see this here! (My mail backlog seems to be
clearing -- or maybe it was only a temporary unclogging... :-)

> For the others on python-dev, here's the background: MAL was
> about to check in the unistr() function, described as follows:
>
> > This patch adds a utility function unistr() which works just like
> > the standard builtin str() -- only that the return value will
> > always be a Unicode object.
> >
> > The patch also adds a new object level C API PyObject_Unicode()
> > which complements PyObject_Str().
>
> I responded:
> > Why are unistr() and unicode() two separate functions?
> >
> > str() performs one task: convert to string. It can convert anything,
> > including strings or Unicode strings, numbers, instances, etc.
> >
> > The other type-named functions e.g. int(), long(), float(), list(),
> > tuple() are similar in intent.
> >
> > Why have unicode() just for converting strings to Unicode strings,
> > and unistr() for converting everything else to a Unicode string?
> > What does unistr(x) do differently from unicode(x) if x is a string?
>
> MAL responded:
> > unistr() is meant to complement str() very closely. unicode()
> > works as constructor for Unicode objects which can also take
> > care of decoding encoded data. str() and unistr() don't provide
> > this capability but instead always assume the default encoding.
> >
> > There's also a subtle difference in that str() and unistr()
> > try the tp_str slot which unicode() doesn't. unicode()
> > supports any character buffer which str() and unistr() don't.
>
> Okay, given this explanation, i still feel fairly confident
> that unicode() should subsume unistr().
> Many of the other
> type-named functions try various slots:
>
>     int() looks for __int__
>     float() looks for __float__
>     long() looks for __long__
>     str() looks for __str__
>
> In testing this i also discovered the following:
>
>     >>> class Foo:
>     ...     def __int__(self):
>     ...         return 3
>     ...
>     >>> f = Foo()
>     >>> int(f)
>     3
>     >>> long(f)
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     AttributeError: Foo instance has no attribute '__long__'
>     >>> float(f)
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     AttributeError: Foo instance has no attribute '__float__'
>
> This is kind of surprising. How about:
>
>     int() looks for __int__
>     float() looks for __float__, then tries __int__
>     long() looks for __long__, then tries __int__
>     str() looks for __str__
>     unicode() looks for __unicode__, then tries __str__

For the numeric types this could perhaps be done by calling
PyNumber_Long() from PyNumber_Float(), calling PyNumber_Int() from
PyNumber_Long(). Complex is a bit of an exception -- there's no
PyNumber_Complex(), just because I felt that nobody would need it. :-)

> The extra parameter to unicode() is very similar to the extra
> parameter to int(), so i think there is a natural parallel here.

Makes sense.

> Hmm... what about the other types?
>
> Wow!! __complex__ can produce a segfault!
>
>     >>> complex
>     <built-in function complex>
>     >>> class Foo:
>     ...     def __complex__(self): return 3
>     ...
>     >>> Foo()
>     <__main__.Foo instance at 0x81e8684>
>     >>> f = _
>     >>> complex(f)
>     Segmentation fault (core dumped)
>
> This happens because builtin_complex first retrieves and saves
> the PyNumberMethods of the argument (in this case, from the
> instance), then tries to call __complex__ (in this case, returning 3),
> and THEN coerces the result using nbr->nb_float if the result is
> not complex! (This calls the instance's nb_float method on the
> integer object 3!!)

Thanks! Fixed now in CVS.
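
The lookup order proposed earlier in this message (float() trying
__float__ and then __int__, unicode() trying __unicode__ and then
__str__) can be sketched in present-day Python 3. This is only an
illustration, not the CPython change under discussion: first_slot,
to_float, and to_text are hypothetical helper names, and because
Python 3 has no long() or unicode() builtins, __unicode__ is treated
here as an ordinary method name.

```python
def first_slot(obj, *names):
    """Call the first method in `names` that obj's type defines."""
    for name in names:
        method = getattr(type(obj), name, None)
        if method is not None:
            return method(obj)
    raise TypeError(
        "%s defines none of: %s" % (type(obj).__name__, ", ".join(names))
    )


def to_float(obj):
    # Proposed order for float(): __float__ first, then __int__.
    return float(first_slot(obj, "__float__", "__int__"))


def to_text(obj):
    # Proposed order for unicode(): __unicode__ first, then __str__.
    return str(first_slot(obj, "__unicode__", "__str__"))


class Foo:
    # Same shape as the Foo class in the quoted session.
    def __int__(self):
        return 3


class Named:
    # Hypothetical class defining both text conversions.
    def __unicode__(self):
        return "from __unicode__"

    def __str__(self):
        return "from __str__"


print(to_float(Foo()))   # no __float__, so __int__ is used
print(to_text(Named()))  # __unicode__ wins over __str__
print(to_text(Foo()))    # falls back to the inherited __str__
```

With this ordering, a class that only defines __int__ still converts
through to_float() instead of raising AttributeError as in the quoted
session.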
> I think __complex__ should probably look for __complex__, then
> __float__, then __int__.

I make it call PyNumber_Float(), which could be made smarter as
explained above.

> One could argue for __list__, __tuple__, or __dict__, but that
> seems much weaker; the Pythonic way has always been to implement
> __getitem__ instead.

Yes -- since __list__ etc. aren't used, let's not add them.

> There is no built-in dict(); if it existed
> i suppose it would do the opposite of x.items(); again a weak
> argument, though i might have found such a function useful once
> or twice.

Yeah, it's not very common. Dict comprehensions anyone?

    d = {k:v for k,v in zip(range(10), range(10))} # :-)

> And that about covers the built-in types for data.

Thanks!

--Guido van Rossum (home page: http://www.python.org/~guido/)
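
As a rough illustration of the dict() idea quoted in this message, a
function that does the opposite of x.items() might look like the
sketch below. dict_from_items is a hypothetical name, not a builtin
from the Python version discussed in this thread. The comprehension on
the last line is the one written in the message; current Python 3
accepts that syntax.

```python
def dict_from_items(pairs):
    """Build a dict from (key, value) pairs, the inverse of x.items()."""
    result = {}
    for key, value in pairs:
        result[key] = value
    return result


original = {"a": 1, "b": 2}

# Going through items() and back should reproduce the original mapping.
round_trip = dict_from_items(original.items())

# The comprehension from the message, spelled with modern spacing.
d = {k: v for k, v in zip(range(10), range(10))}
```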