There has been some lively discussion about the iteration protocols lately.
My impression of the opinions on the list so far is this:

It could have been semantically cleaner. There is a blurred boundary between
the iterable-container and iterator protocols. Perhaps next should have been
called __next__. Perhaps iterators should not have been required to implement
an __iter__ method returning self.

With the benefit of hindsight the protocols could have been designed better.
But there is nothing fundamentally broken about iteration. Nothing that
justifies any serious change that would break backward compatibility and
require a transition plan.

A remaining sore spot is re-iterability. Iterators being their own iterators
is ok by itself. StopIteration being a sink state is ok by itself. When they
are combined they result in hard-to-trace silent errors because an exhausted
iterator is indistinguishable from an empty container. This happens in real
code, not in some contrived examples.

It is clear to me that this issue needs to be addressed in some way, but
without a complete redesign of the iteration protocols. My proposal of raising
an exception on calling .next() after StopIteration has been rejected by
Guido. Here's another approach:

Proposal: new built-in function reiter()

    def reiter(obj):
        """reiter(obj) -> iterator

        Get an iterator from an object.  If the object is already an iterator
        a TypeError exception will be raised.  For all Python built-in types
        it is guaranteed that if this function succeeds the next call to
        reiter() will return a new iterator that produces the same items
        unless the object is modified.  Non-builtin iterable objects which are
        not iterators SHOULD support multiple iteration returning the same
        items."""
        it = iter(obj)
        if it is obj:
            raise TypeError('Object is not re-iterable')
        return it

Example:

    def cartprod(a, b):
        """ Generate the cartesian product of two sources. """
        for x in a:
            for y in reiter(b):
                yield x, y

This function should raise an exception if object b is a generator or some
other non re-iterable object. List comprehensions should use the C API
equivalent of reiter for sources other than the first.

This solution is less than perfect. It requires explicit attention by the
programmer and is less comprehensive than the other solutions proposed, but I
think it's better than nothing.

A related issue is iteration of files. It is an exception to the guarantee
made in the docstring above. My impression is that people generally agree that
file objects are more iterator-like than container-like because they are
stateful cursors. However, making files into iterators is not as simple as
adding a next method that calls readline and raises StopIteration on EOF. This
implementation would lose the performance benefit from the readahead buffering
done in the xreadlines object.

The way I see file object iteration is that the file object and xreadlines
object abuse the iterable-container<->iterator relationship to produce a
cursor-without-readahead-buffer<->cursor-with-readahead-buffer relationship.
I don't like objects pretending to be something they're not.

I can finish my xreadlines caching patch that makes a file into an iterator
with an embedded xreadlines object. Perhaps it's not the most elegant solution
but I don't see any real problems with it. I am also thinking about
implementing line buffering inside the file object that can finally get rid of
the whole fgets/getc_unlocked multiplatform mess and make xreadlines
unnecessary. The problem here is that readahead is not exactly a transparent
operation. More on this later.

Oren
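The silent failure described in the message, and the way the proposed reiter() turns it into a loud one, can be sketched as follows. This is only an illustration under stated assumptions: it is written for a current Python 3 interpreter rather than the interpreter the message was written against, reiter() is defined locally because it is a proposal and not an existing built-in, and the names cartprod_unchecked and letters are hypothetical helpers introduced here.

```python
def reiter(obj):
    """Return a fresh iterator, refusing objects that are their own iterator."""
    it = iter(obj)
    if it is obj:
        raise TypeError('Object is not re-iterable')
    return it


def cartprod_unchecked(a, b):
    # Hypothetical variant without the check: plain nested loops.
    for x in a:
        for y in b:
            yield x, y


def cartprod(a, b):
    # The version from the message: the inner source must be re-iterable.
    for x in a:
        for y in reiter(b):
            yield x, y


def letters():
    # Hypothetical one-shot source used only for this illustration.
    yield 'p'
    yield 'q'


# A list is re-iterable, so the full product comes out.
full = list(cartprod([1, 2], ['p', 'q']))

# A generator is exhausted after the first outer pass, so the pairs
# for the second outer item are silently missing.
truncated = list(cartprod_unchecked([1, 2], letters()))

# With reiter() the same call raises instead of returning partial data.
try:
    list(cartprod([1, 2], letters()))
    error = None
except TypeError as exc:
    error = str(exc)
```

In the unchecked variant nothing signals that the second pass saw an exhausted iterator rather than an empty container, which is the confusion the proposal is meant to expose.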