[Moshe discovers that u"a" in "bbba" raises TypeError]

[Marc-Andre]
> > Hmm, this must have been introduced by your contains code...
> > it did work before.
>
> Nope: the string "in" semantics were forever special-cased. Guido beat me
> soundly for trying to change the semantics...

But I believe that Marc-Andre added a special case for Unicode in
PySequence_Contains. I looked for evidence, but the last snapshot that
I actually saved and built before Moshe's code was checked in is from
2/18 and it isn't in there. Yet I believe Marc-Andre. The special
case needs to be added back to string_contains in stringobject.c.

> > The normal action taken by the Unicode and the string
> > code in these mixed type situations is to first
> > convert everything to Unicode and then retry the operation.
> > Strings are interpreted as UTF-8 during this conversion.
>
> Hmmm....PySequence_Contains doesn't do any conversion of the arguments.
> Should it? (Again, it didn't before). If it does, then the order of
> testing for seq_contains and seq_getitem and conversions

Or it could be done this way.

> > Perhaps I should also add a tp_contains slot to the
> > Unicode object which then uses the above API as well.

Yes.

> But that wouldn't help at all for
>
> u"a" in "abbbb"

It could if PySequence_Contains would first look for a string and a
unicode argument (in either order) and in that case convert the string
to unicode.

> PySequence_Contains only dispatches on the container argument :-(
>
> (BTW: I discovered it while contemplating adding a seq_contains (not
> tp_contains) to unicode objects to optimize the searching for a bit.)

You may beat Marc-Andre to it, but I'll have to let him look at the
code anyway -- I'm not sufficiently familiar with the Unicode stuff
myself yet.

BTW, I added a tag "pre-unicode" to the CVS tree to the revisions
before the Unicode changes were made.

--Guido van Rossum (home page: http://www.python.org/~guido/)
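
For illustration only, the coercion rule discussed above (look for a string and a
Unicode argument in either order, convert the string to Unicode as UTF-8, then
retry) might be sketched in present-day Python, where bytes stands in for the old
8-bit string and str for the Unicode object. The name mixed_contains is
hypothetical and is not a CPython API; this is a sketch of the idea under those
assumptions, not the code that went into stringobject.c.

```python
def mixed_contains(container, item):
    """Hypothetical helper: containment test that coerces mixed operands.

    Assumption: bytes plays the role of the old 8-bit string and str the
    role of the Unicode object; bytes are interpreted as UTF-8.
    """
    # Check for the mixed pair in either order, as suggested in the thread,
    # and convert the 8-bit side to Unicode before retrying the operation.
    if isinstance(container, bytes) and isinstance(item, str):
        container = container.decode("utf-8")
    elif isinstance(container, str) and isinstance(item, bytes):
        item = item.decode("utf-8")
    # All other combinations fall through to the ordinary "in" semantics.
    return item in container


if __name__ == "__main__":
    print(mixed_contains(b"bbba", "a"))   # Unicode item in 8-bit container
    print(mixed_contains("abbbb", b"a"))  # 8-bit item in Unicode container
```

Without the coercion step, a plain "a" in b"bbba" raises TypeError in Python 3,
which is the same kind of mixed-type failure reported at the top of this message.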