I've been thinking a bit about this myself, and I think it's a good idea to show the bytecodes generated for the various cases, just to make sure that we understand the semantics. I'll just use +=, but the same list applies to all 11 binary operators (**, *, /, %, +, -, |, &, ^, <<, >>).

I'm making up opcodes -- the different variants of LOAD and STORE don't matter. On the right I'm displaying the stack contents after execution of the opcode (push appends to the end). I'm writing 'result' to indicate the result of the += operator.

  a += b

    LOAD a              [a]
    LOAD b              [a, b]
    AUGADD              [result]
    STORE a             []

  a.attr += b

    LOAD a              [a]
    DUP                 [a, a]
    GETATTR 'attr'      [a, a.attr]
    LOAD b              [a, a.attr, b]
    AUGADD              [a, result]
    SETATTR 'attr'      []

  a[i] += b

    LOAD a              [a]
    DUP                 [a, a]
    LOAD i              [a, a, i]
    DUP                 [a, a, i, i]
    ROT3                [a, i, a, i]
    GETITEM             [a, i, a[i]]
    LOAD b              [a, i, a[i], b]
    AUGADD              [a, i, result]
    SETITEM             []

I'm leaving the slice variant out; I'll get to that in a minute.

If the right hand side is more complicated than in the example, the line 'LOAD b' is simply replaced by code that calculates the value of the expression; this always ends up eventually pushing a single value onto the stack, leaving anything below it alone, just like the 'LOAD b' opcode. Ditto for the index expression ('i' in the example).

Similarly, for the cases a.attr and a[i], if instead of a there's a more complicated expression (e.g. sys.modules[name].foo().bar += 1) the initial 'LOAD a' is replaced by code that loads the object on the stack -- in this example, sys.modules[name].foo(). Only the final selector (".attr", "[i]") is special. (Terminology: a selector is something that addresses a possibly writable component of a container object, e.g. a[i] or a.attr; a[i:j] is also a selector. f() could be seen as a selector but cannot be used on the left hand side of an assignment.)

There are two more forms of potential interest. First, what should happen to a tuple assignment?

  a, b, c += x

(Which is exactly the same as "(a, b, c) += x".)
I think this should be a compile-time error. If t and u are tuples, "t += u" means the same as "t = t+u"; but if we apply this rule we would get "(a, b, c) = (a, b, c) + u", which is only valid if u is an empty tuple (or a class instance with unusual coercion behavior). But when u is empty, it's not useful (nothing changes), so it's unlikely that someone would have this intention. More likely, the programmer was hoping that this would be the same as "a+=x; b+=x; c+=x" -- but that's the same misconception as expecting "a, b, c = 0" to mean "a = b = c = 0", so we don't need to cater to it.

Second, what should happen to a slice assignment? The basic slice form is:

  a[i:j] += b

but there are others: Python's slice syntax allows an arbitrary comma-separated sequence of single indexes, regular slices (lo:hi), extended slices (lo:hi:step), and "ellipsis" tokens ('...') between the square brackets. Here's an extreme example:

  a[:, ..., ::, 0:10:2, :10:, 1, 2:, ::-1] += 1

First, let me indicate what code is generated for such a form when it's used in an ordinary expression or assignment. Any such form *except* basic slices (a[i:j], a[:j], a[i:], and a[:]) is translated into code that uses GETITEM or SETITEM with an index that is formed from a simple translation of the actual expressions.

- If there are two or more comma-separated values, the index is a tuple of the translations of the individual values.

- An ellipsis ("...") is translated into the builtin object Ellipsis.

- A non-slice is translated into itself.

- A slice is translated into a "slice object"; this is a built-in object representing the lower and upper bounds and step. There is also a built-in function, slice(), taking 1-3 arguments in the same way as range(). Thus:

  - "lo:hi" is equivalent to slice(lo, hi);

  - "lo:hi:step" is equivalent to slice(lo, hi, step);

  - omitted values are replaced with None, so e.g. ":hi" is equivalent to slice(None, hi).
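
The translation rules listed above can be checked with a small probe class. This is only a sketch: it assumes a present-day Python 3 interpreter, where every subscript form (basic slices included) reaches __getitem__, which was not yet the case when this message was written. IndexProbe is a made-up name used for illustration.

```python
class IndexProbe:
    """Hypothetical helper: returns whatever index object it is subscripted with."""

    def __getitem__(self, index):
        return index


probe = IndexProbe()

# "lo:hi" arrives as slice(lo, hi).
assert probe[1:5] == slice(1, 5)
# "lo:hi:step" arrives as slice(lo, hi, step).
assert probe[0:10:2] == slice(0, 10, 2)
# Omitted values are replaced with None.
assert probe[:5] == slice(None, 5)
# An ellipsis token arrives as the built-in Ellipsis object.
assert probe[...] is Ellipsis
# A non-slice is passed through unchanged.
assert probe[7] == 7
# Two or more comma-separated values arrive as a tuple of the translations.
assert probe[1, 2:, ...] == (1, slice(2, None), Ellipsis)
```

Because the probe simply hands back its argument, each assertion compares the index object the interpreter built against the slice() spelling given in the list.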
So, the extreme example above means exactly the same as a[x], where x is a tuple with the following items:

  slice(None, None)
  Ellipsis
  slice(None, None, None)
  slice(0, 10, 2)
  slice(None, 10, None)
  1
  slice(2, None)
  slice(None, None, -1)

Why all this elaboration? Because I want to use this to give a standardized semantics even to basic slices. If a[lo:hi:step] is translated the same as a[slice(lo, hi, step)], then we can give a[lo:hi] the same translation as a[slice(lo, hi)], and thus the slice case for augmented assignment can generate the same code (apart from the slice-building operations) as the index case. Thus (writing 'slice' to indicate the slice object built from i and j):

  a[i:j] += b

    LOAD a              [a]
    DUP                 [a, a]
    LOAD i              [a, a, i]                       **
    LOAD j              [a, a, i, j]                    **
    BUILD_SLICE 2       [a, a, slice]                   **
    DUP                 [a, a, slice, slice]
    ROT3                [a, slice, a, slice]
    GETITEM             [a, slice, a[slice]]
    LOAD b              [a, slice, a[slice], b]
    AUGADD              [a, slice, result]
    SETITEM             []

Comparing this to the code for "a[i] += b", only the three lines marked with ** are really different, and all that these do is to push a single object representing the slice onto the stack.

I won't show the code for "a[i:j:k] += b" or for "a[i:j, k:l]", but it's clear how these should be done.

Postscript (unrelated to augmented assignment)

It would be nice if the SLICE bytecodes were removed altogether and instead slice() objects would be created for all slices, even basic ones. (I believe this was proposed on this list at some point.) The original SLICE opcodes were introduced in ancient times, when basic slices were the only accepted slice syntax.

This would mean that all objects supporting slices would have to support the *mapping* interface instead of (or in addition to) the sequence interface; the mapping interface would have to determine whether a getitem / setitem call was really a slice call and do the right thing.
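
As an illustration of a getitem / setitem pair that decides for itself whether a call is really a slice call, here is a minimal sketch. It assumes present-day Python 3, where a slice object is passed for every slice form; the class name Window and its _items attribute are hypothetical, and the code is not part of the proposal itself.

```python
class Window:
    """Hypothetical sequence wrapper with one entry point for items and slices."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slice call: slice.indices() resolves None and clamps the bounds.
            start, stop, step = index.indices(len(self._items))
            return [self._items[k] for k in range(start, stop, step)]
        # Plain item call.
        return self._items[index]

    def __setitem__(self, index, value):
        # The underlying list accepts both integers and slice objects.
        self._items[index] = value


w = Window([1, 2, 3, 4])
w[1:3] += [9]   # getitem with slice(1, 3), then setitem with the same slice
print(w[:])     # [1, 2, 3, 9, 4]
```

The augmented assignment in the usage lines follows the same get-then-set pattern as the a[i:j] += b listing: the slice object is built once and used for both calls.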
In particular, for backward compatibility, class instances could have a mapping interface whose internal getitem function checks if the argument is a slice object whose step is None and whose lo and hi are None or integers; then if a __getslice__ method exists, it could call that; in all other cases it could call __getitem__. None of the other built-in objects that support slices would have to be changed; the GETITEM opcode could notice that an object supports the sequence interface but not the mapping interface, and then look for a basic slice or an integer and do the right thing. Problems with this are mostly related to the existing C API for slices, like PySequence_GetSlice(), which propagates the various restrictions.

Too-much-rambling-reduces-the-chance-of-useful-responses-ly,

--Guido van Rossum (home page: http://www.pythonlabs.com/~guido/)