> Ka-Ping Yee wrote:
> >
> > On Thu, 18 Jan 2001, Ka-Ping Yee wrote:
> > > str() looks for __str__
> >
> > Oops.  I forgot that
> >
> >     str() looks for __str__, then tries __repr__
> >
> > So, presumably,
> >
> >     unicode() should look for __unicode__, then __str__, then __repr__
>
> Not quite... str() does this:
>
> 1. strings are passed back as-is
> 2. the type slot tp_str is tried
> 3. the method __str__ is tried
> 4. Unicode returns are converted to strings
> 5. anything other than a string return value is rejected
>
> unistr() does the same, but makes sure that the return
> value is a Unicode object.
>
> unicode() does the following:
>
> 1. for instances, __str__ is called
> 2. Unicode objects are returned as-is
> 3. string objects or character buffers are used as basis for decoding
> 4. decoding is applied to the character buffer and the results
>    are returned
>
> I think we should perhaps merge the two approaches into one
> which then applies all of the above in unicode() (and then
> forget about unistr()). This might hide some type errors,
> but since all other generic constructors behave more or less
> in the same way, I think unicode() should too.

Yes, I would like to see these merged.  I noticed that e.g. there is
special code to compare Unicode strings in the comparison code (I
think I *could* get rid of this now we have rich comparisons, but I
decided to put that off), and when I looked at it, it uses the same set
of conversions as unicode().  Some of these seem questionable to me --
why do you try so many ways to get a string out of an object?  (On the
other hand the merge of unicode() and unistr() might have this effect
anyway...)

--Guido van Rossum (home page: http://www.python.org/~guido/)
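
For illustration only, the merged lookup order discussed above could be sketched roughly as follows. This is written in modern Python 3, so `str` stands in for the Unicode type and `bytes` for the 8-bit string type. The function name `to_text`, its `encoding` parameter, the ASCII default, and the example classes are all assumptions made for the sketch; `__unicode__` is the hook name suggested in the thread. It is not the actual implementation of unicode() or unistr().

```python
def to_text(obj, encoding="ascii"):
    """Hypothetical merged conversion, modelled on the steps listed in the thread."""
    # Text objects are returned as-is.
    if isinstance(obj, str):
        return obj
    # Byte strings and character buffers are used as the basis for decoding.
    if isinstance(obj, (bytes, bytearray, memoryview)):
        return bytes(obj).decode(encoding)
    # Try the proposed __unicode__ hook first, looked up on the type.
    hook = getattr(type(obj), "__unicode__", None)
    if hook is not None:
        result = hook(obj)
    else:
        # str() tries __str__ and, by default, falls back to __repr__.
        result = str(obj)
    # A byte-string return value is decoded rather than rejected.
    if isinstance(result, bytes):
        result = result.decode(encoding)
    # Anything other than a string return value is rejected.
    if not isinstance(result, str):
        raise TypeError("conversion returned non-string: %r" % type(result))
    return result


# Hypothetical example classes, one per lookup step.
class WithUnicode:
    def __unicode__(self):
        return "from __unicode__"

    def __str__(self):
        return "from __str__"


class WithStr:
    def __str__(self):
        return "from __str__"


class WithRepr:
    def __repr__(self):
        return "from __repr__"


class ReturnsBytes:
    def __unicode__(self):
        return b"from bytes"


class ReturnsInt:
    def __unicode__(self):
        return 42
```

With these classes, `to_text(WithUnicode())` prefers the `__unicode__` hook, `to_text(WithStr())` uses `__str__`, `to_text(WithRepr())` ends up at `__repr__`, and `to_text(ReturnsInt())` raises `TypeError`. Accepting a byte-string return and decoding it is the kind of leniency that, as noted above, might hide some type errors.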