All,

Below is the first draft of PEP 276 "Simple Iterator for ints".
(Available at http://python.sourceforge.net/peps/pep-0276.html)

Feel free to comment (positive, negative, bipolar, or in between ;-).
Please copy me on any comments that you think should be incorporated
into the next revision (so that I don't unintentionally miss
something on python-list).

Thanks,

Jim

=====================================

PEP: 276
Title: Simple Iterator for ints
Version: $Revision: 1.1 $
Last-Modified: $Date: 2001/11/13 20:52:37 $
Author: james_althoff@i2.com (Jim Althoff)
Status: Draft
Type: Standards Track
Created: 12-Nov-2001
Python-Version: 2.3
Post-History:


Abstract

    Python 2.2 added new functionality to support iterators[1].
    Iterators have proven to be useful and convenient in many coding
    situations.  The implementation of Python's for-loop control
    structure uses the iterator protocol as of release 2.2, and
    Python provides iterators for the following builtin types:
    lists, tuples, dictionaries, strings, and files.  This PEP
    proposes the addition of an iterator for the builtin type int
    (types.IntType).  Such an iterator would simplify the coding of
    certain for-loops in Python.


Specification

    Define an iterator for types.IntType (i.e., the builtin type
    "int") that is returned from the builtin function "iter" when
    called with an instance of types.IntType as the argument.

    The returned iterator has the following behavior:

    - Assume that object i is an instance of types.IntType (the
      builtin type int) and that i > 0

    - iter(i) returns an iterator object

    - said iterator object iterates through the sequence of ints
      0,1,2,...,i-1

      Example:

      iter(5) returns an iterator object that iterates through the
      sequence of ints 0,1,2,3,4

    - if i <= 0, iter(i) returns an "empty" iterator, i.e., one that
      raises StopIteration upon the first call of its "next" method

    In other words, the conditions and semantics of said iterator
    are consistent with the conditions and semantics of the range()
    and xrange() functions when called with a single argument.

    Note that the sequence 0,1,2,...,i-1 associated with the int i is
    considered "natural" in the context of Python programming because
    it is consistent with the builtin indexing protocol of sequences
    in Python.  Python lists and tuples, for example, are indexed
    starting at 0 and ending at len(object)-1 (when using positive
    indices).  In other words, such objects are indexed with the
    sequence 0,1,2,...,len(object)-1


Rationale

    A common programming idiom is to take a collection of objects and
    apply some operation to each item in the collection in some
    established sequential order.  Python provides the "for in"
    looping control structure for handling this common idiom.  Cases
    arise, however, where it is necessary (or more convenient) to
    access each item in an "indexed" collection by iterating through
    each index and accessing each item in the collection using the
    corresponding index.

    For example, one might have a two-dimensional "table" object
    where one requires the application of some operation to the first
    column of each row in the table.  Depending on the implementation
    of the table it might not be possible to access first each row
    and then each column as individual objects.  It might, rather, be
    possible to access a cell in the table using a row index and a
    column index.
    In such a case it is necessary to use an idiom where one iterates
    through a sequence of indices (indexes) in order to access the
    desired items in the table.  (Note that the commonly used
    DefaultTableModel class in Java-Swing-Jython has this very
    protocol).

    Another common example is where one needs to process two or more
    collections in parallel.  Another example is where one needs to
    access, say, every second item in a collection.

    There are many other examples where access to items in a
    collection is facilitated by a computation on an index, thus
    necessitating access to the indices rather than direct access to
    the items themselves.

    Let's call this idiom the "indexed for-loop" idiom.  Some
    programming languages provide builtin syntax for handling this
    idiom.  In Python the common convention for implementing the
    indexed for-loop idiom is to use the builtin range() or xrange()
    function to generate a sequence of indices as in, for example:

        for rowcount in range(table.getRowCount()):
            print table.getValueAt(rowcount, 0)

    or

        for rowcount in xrange(table.getRowCount()):
            print table.getValueAt(rowcount, 0)

    From time to time there are discussions in the Python community
    about the indexed for-loop idiom.  It is sometimes argued that
    the need for using the range() or xrange() function for this
    design idiom is:

    - Not obvious (to new-to-Python programmers),

    - Error prone (easy to forget, even for experienced Python
      programmers)

    - Confusing and distracting for those who feel compelled to
      understand the differences and recommended usage of xrange()
      vis-a-vis range()

    - Unwieldy, especially when combined with the len() function,
      i.e., xrange(len(sequence))

    - Not as convenient as equivalent mechanisms in other languages,

    - Annoying, a "wart", etc.

    And from time to time proposals are put forth for ways in which
    Python could provide a better mechanism for this idiom.  Recent
    examples include PEP 204, "Range Literals", and PEP 212, "Loop
    Counter Iteration".
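
To make the indexed for-loop idiom described above concrete, here is a minimal sketch of two of the cases mentioned (parallel collections and every-second-item access). It is written for present-day Python 3 so that it can be run as is; the sample lists `names` and `scores` are made up for illustration and do not come from the PEP.

```python
names = ["ann", "bob", "cy", "dee"]   # made-up sample data
scores = [90, 85, 70, 60]             # made-up sample data

# Processing two collections in parallel through a shared index.
pairs = []
for i in range(len(names)):
    pairs.append((names[i], scores[i]))

# Accessing every second item by a computation on the index.
every_second = []
for i in range(len(names)):
    if i % 2 == 0:
        every_second.append(names[i])

print(pairs)
print(every_second)
```

In both loops the index, not the item, drives the access, which is why range(len(...)) appears in the loop header.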

    Most often, such proposals include changes to Python's syntax and
    other "heavyweight" changes.

    Part of the difficulty here is that advocating new syntax implies
    a comprehensive solution for "general indexing" that has to
    include aspects like:

    - starting index value

    - ending index value

    - step value

    - open intervals versus closed intervals versus half-open
      intervals

    Finding a new syntax that is comprehensive, simple, general,
    Pythonic, appealing to many, easy to implement, not in conflict
    with existing structures, not excessively overloading of existing
    structures, etc. has proven to be more difficult than one might
    anticipate.

    The proposal outlined in this PEP tries to address the problem by
    suggesting a simple "lightweight" solution that helps the most
    common case by using a proven mechanism that is already available
    (as of Python 2.2): namely, iterators.

    Because for-loops already use the iterator protocol as of Python
    2.2, adding an iterator for types.IntType as proposed in this PEP
    would enable by default the following shortcut for the indexed
    for-loop idiom:

        for rowcount in table.getRowCount():
            print table.getValueAt(rowcount, 0)

    The following benefits are claimed for this approach vis-a-vis
    the current mechanism of using the range() or xrange() functions:

    - Simpler,

    - Less cluttered,

    - Focuses on the problem at hand without the need to resort to
      secondary implementation-oriented functions (range() and
      xrange())

    And compared to other proposals for change:

    - Requires no new syntax

    - Requires no new keywords

    - Takes advantage of the new and well-established iterator
      mechanism

    And generally:

    - Is consistent with iterator-based "convenience" changes already
      included (as of Python 2.2) for other builtin types such as:
      lists, tuples, dictionaries, strings, and files.

    Preliminary discussion on the Python interest mailing list
    suggests a reasonable amount of initial support for this PEP
    (along with some dissents/issues noted below).
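
As an illustration only, the proposed behavior can be approximated in present-day Python 3 with an int subclass. The class name `IterableInt` is an assumption of this sketch and not part of the PEP, and the builtin int itself is left unchanged. The sketch relies on range() already being empty for values <= 0, which matches the semantics given in the Specification.

```python
class IterableInt(int):
    """Hypothetical int subclass that mimics the iterator proposed here."""

    def __iter__(self):
        # Iterate through 0, 1, ..., self-1; range() is empty when self <= 0.
        return iter(range(self))


# Stand-in for the value returned by table.getRowCount() in the examples.
row_count = IterableInt(3)

# The shortcut form of the indexed for-loop: no range() or xrange() call.
visited = []
for rowcount in row_count:
    visited.append(rowcount)

print(visited)                   # [0, 1, 2]
print(list(IterableInt(-2)))     # []
```

Under the actual proposal the same loop would work with a plain int, since iter() would accept it directly.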


Backwards Compatibility

    The proposed mechanism is generally backwards compatible as it
    calls for neither new syntax nor new keywords.  All existing,
    valid Python programs should continue to work unmodified.

    However, this proposal is not perfectly backwards compatible in
    the sense that certain statements that are currently invalid
    would, under the current proposal, become valid.

    Tim Peters has pointed out two such examples:

    1) The common case where one forgets to include range() or
       xrange(), for example:

           for rowcount in table.getRowCount():
               print table.getValueAt(rowcount, 0)

       in Python 2.2 raises a TypeError exception.

       Under the current proposal, the above statement would be valid
       and would work as (presumably) intended.  Presumably, this is
       a good thing.

       As noted by Tim, this is the common case of the "forgotten
       range" mistake (which one currently corrects by adding a call
       to range() or xrange()).

    2) The (hopefully) very uncommon case where one makes a typing
       mistake when using tuple unpacking.  For example:

           x, = 1

       in Python 2.2 raises a TypeError exception.

       Under the current proposal, the above statement would be valid
       and would set x to 0.  The PEP author has no data as to how
       common this typing error is nor how difficult it would be to
       catch such an error under the current proposal.  He imagines
       that it does not occur frequently and that it would be
       relatively easy to correct should it happen.


Issues

    Based on some preliminary discussion on the Python interest
    mailing list, the following concerns have been voiced:

    - Is it obvious that iter(5) maps to the sequence 0,1,2,3,4?

      Response: Given, as noted above, that Python has a strong
      convention for indexing sequences starting at 0 and stopping at
      (inclusively) the index whose value is one less than the length
      of the sequence, it is argued that the proposed sequence is
      reasonably intuitive to a Python programmer while being useful
      and practical.

    - "in" (as in "for i in x") does not match standard English usage
      in this case.  "up to" or something similar might be better.

      Response: Not everyone felt that matching standard English
      perfectly is a requirement.  It is noted that "for:else:"
      doesn't match standard English very well either.  And few are
      excited about adding a new keyword, especially just to get a
      somewhat better match to standard English usage.

    - Possible ambiguity

          for i in 10: print i

      might be mistaken for

          for i in (10,): print i

      Response: The predicted ambiguity was not readily apparent to
      several of the posters.

    - It would be better to add special new syntax such as:

          for i in 0..10: print i

      Response: There are other PEPs that take this approach[2][3].

    - It would be better to reuse the ellipsis literal syntax (...)

      Response: Shares disadvantages of other proposals that require
      changes to the syntax.  Needs more design to determine how it
      would handle the general case of start,stop,step,
      open/closed/half-closed intervals, etc.  Needs a PEP.

    - It would be better to reuse the slicing literal syntax attached
      to the int class, e.g., int[0:10]

      Response: Same as previous response.  In addition, design
      consideration needs to be given to what it would mean if one
      uses slicing syntax after some arbitrary class other than class
      int.  Needs a PEP.

    - Might dissuade newbies from using the indexed for-loop idiom
      when the standard "for item in collection:" idiom is clearly
      better.

      Response: The standard idiom is so nice when "it fits" that it
      needs neither extra "carrot" nor "stick".  On the other hand,
      one does notice cases of overuse/misuse of the standard idiom
      (due, most likely, to the awkwardness of the indexed for-loop
      idiom), as in:

          for item in sequence:
              print sequence.index(item)

    - Doesn't handle the general case of start,stop,step

      Response: use the existing range() or xrange() mechanisms.  Or,
      see below.
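
The misuse of sequence.index(item) mentioned in the list above can be made concrete. The sketch below is written for present-day Python 3, and the sample list is made up for illustration. It shows that index() reports the first matching position, so it gives misleading results when the sequence contains duplicates, whereas iterating over indices visits each position exactly once.

```python
sequence = ["a", "b", "a"]  # made-up sample data containing a duplicate

# Misuse: index() searches from the start, so a duplicate maps to its
# first occurrence (and each lookup rescans the sequence).
via_index = [sequence.index(item) for item in sequence]

# Indexed for-loop idiom: every position is produced exactly once.
via_range = [i for i in range(len(sequence))]

print(via_index)  # [0, 1, 0]
print(via_range)  # [0, 1, 2]
```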


Extension

    If one wants to handle general indexing (start,stop,step) without
    having to resort to using the range() or xrange() functions then
    the following could be incorporated into the current proposal.

    Add an "iter" method (or use some other preferred name) to
    types.IntType with the following signature:

        def iter(start=0, step=1):

    This method would have the (hopefully) obvious semantics.

    Then one could do, for example:

        x = 100
        for i in x.iter(start=1, step=2):
            print i

    Under this extension (for x bound to an int),

        for i in x:

    would be equivalent to

        for i in x.iter():

    and to

        for i in x.iter(start=0, step=1):

    This extension is consistent with the generalization provided by
    the current mechanism for dictionaries whereby one can use:

        for k in d.iterkeys():
        for v in d.itervalues():
        for k,v in d.iteritems():

    depending on one's needs, given that

        for i in d:

    has a meaning aimed at the most common and useful case
    (d.iterkeys()).


Implementation

    An implementation is not available at this time.  Although the
    author is not qualified to comment on such, he will nonetheless
    speculate that this might be straightforward and, hopefully,
    might consist of little more than setting the tp_iter slot in
    types.IntType to point to a simple iterator function that would
    be similar to -- or perhaps even a wrapper around -- the xrange()
    function.


References

    [1] PEP 234, Iterators
        http://python.sourceforge.net/peps/pep-0234.html

    [2] PEP 204, Range Literals
        http://python.sourceforge.net/peps/pep-0204.html

    [3] PEP 212, Loop Counter Iteration
        http://python.sourceforge.net/peps/pep-0212.html


Copyright

    This document has been placed in the public domain.