[Samuele]
> If
>
> "thon" in "python"
>
> then why not
>
> [1,2] in [0,1,2,3]
>
> (it's a purely rhetorical question)
>
> in general I don't think it is a good idea
> to have "in" be a membership vs subset/subseq
> operator depending on non ambiguity, convenience
> or simply implementer taste,
> because truly there are data types (ex. sets)
> that would need both and disambiguated.
>
> Either python grows a new subset/subseq operator
> but probably this is overkill (keyword issue, new
> __magic__ method, not meaningful, convenient
> for a lot of types)
>
> or strings (etc) should simply grow a new
> method with an appropriate name.

I recognize this as related to the argument that Ping was (still is?)
making against "for x in <iterator>"; but not because the same operator
"in" is involved. It has to do with polymorphism (functions that accept
different types of arguments; it's somewhat different from operator
overloading).

Suppose we have an operator @. (Take operator in a wide enough sense,
including other bits of grammar, like "for".) If there's only one type
(or one narrow set of related types) for which @ makes sense, human
readers of a program will use @ as a clue about the type of the
arguments, and (if correct) that will help reasoning about the
expression in which it occurs. ABC uses this property of operators to
do type inference: if an ABC expression contains "a+b", a and b must be
numbers; and so on.

Python chose to allow operators to be overloaded by different types
with different meanings, and the language gives a+b a very different
meaning for numbers than for sequences, for example. (And an important
invariant is lost in this example: for numbers, a+b == b+a, but not so
for sequences!) Is this a problem? The ease with which we get used to
"key in dict" makes me think it is not.
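
The lost invariant for a+b is easy to see in a few lines. This is only an illustrative sketch; the helper name `add` is hypothetical and does not come from the original message.

```python
def add(a, b):
    # One operator, two meanings: arithmetic addition for numbers,
    # concatenation for sequences (lists, tuples, strings).
    return a + b

# For numbers, a+b == b+a holds.
print(add(1, 2) == add(2, 1))

# For sequences, + concatenates, so the order of operands matters.
print(add([1], [2]) == add([2], [1]))
print(add("ab", "cd"), add("cd", "ab"))
```

The first comparison is true and the second is false, which is the invariant that overloading + for sequences gives up.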

While Python doesn't require you to declare the types of your
arguments, the type (or set of allowed types) for arguments is usually
strongly known in the mind of the programmer, and most often strong
hints are given either by the choice of argument name or by
documentation. While it's possible in theory, in practice nobody writes
polymorphic code that uses + and * on its arguments and yet accepts
both numbers and strings.

The reality is that some types are more related than others, and the
substitutability property only makes sense for types that are
sufficiently related. We *do* write code that accepts any kind of
sequence, including strings. We do *not* write code that accepts any
kind of container (sequence or mapping), even though some operations
apply to both kinds of container (len, a[b], and since 2.2, x in a).

In code that applies to all (or even just some) kinds of sequences, the
'in' operator will continue to stand for membership. This won't cause a
problem with strings: correct code using 'in' for membership will never
use seq1 in seq2, it will use item in seq, where the type of item is
"whatever the type of seq[0] is, if it exists." When the seq is a
string, item will be a one-char string -- not a "type" in Python's type
system, but certainly a useful concept.

But there's also lots of code that deals only with strings. This is
normally completely clear to the casual reader: either because string
literals are used, compared, etc., or because values are obtained from
functions known to return strings (such as file.readline()), or because
methods unique to strings (e.g. s.lower()) are used, and so on. Strings
are very important in lots of programs, and we want our notations for
string operations to be readable and expressive. (Regular expressions
are extreme in expressiveness, but lack readability, which is why
they're relegated to an imported module in Python.)
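
As a rough illustration of the "item in seq" point: generic sequence code only ever tests single items for membership, so it behaves the same on lists and on strings. The function name `all_members` is made up for this sketch and is not part of any library.

```python
def all_members(candidates, seq):
    """Return True if every item of candidates is a member of seq.

    Each test is 'item in seq', never 'seq1 in seq2'. When seq is a
    string, iterating over candidates yields one-char strings.
    """
    for item in candidates:
        if item not in seq:
            return False
    return True

print(all_members([1, 2], [0, 1, 2, 3]))
print(all_members("no", "python"))
print(all_members("xy", "python"))
```

Because each item taken from a string is a one-char string, the membership reading and the substring reading of 'in' agree for this kind of code.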

Substring containment testing is a common operation on strings, so
being able to write it as 's1 in s2' rather than 's2.find(s1) >= 0' is
a big win, IMO.

PS. Sets are a different case again. They are containers but neither
sequences nor mappings (though depending on what you want to do they
can resemble either). We will have to think about which operators make
sense for them. I'd say that 'elem in set' is an appropriate way to
spell set membership; how to spell subset is a matter of discussion
(maybe 'set1 <= set2' is a good idea; maybe not).

--Guido van Rossum (home page: http://www.python.org/~guido/)