> I also note that the current tee() doesn't let you use __copy__ easily > (it would be quite messy I think). The linked-list version supports > __copy__ trivially. This may be important if we execute (as I think > we should) on the idea of making selected iterators __copy__-able > (especially all the standard container iterators and xrange). The current tee() was written to support only a two way split, but it can easily be cast as a multi-way splitter with no problem. The only real difference in the ideas presented so far are whether the underlying queue should be implemented as a singly linked list or as a double stack. As a proof-of-concept, here is GvR's code re-cast with the queue changed to a double stack implementation. The interface is completely unchanged. The memory consumed is double that the current tee() but much less than the linked list version. The speed is half that of the current tee() and roughly comparable to or slightly better than the linked list version. Raymond Hettinger ------------------------------------------------------------------------ -- """ Guido's demo program re-cast with a different underlying data structure Replaces the linked list based queue with a two stack based queue. Advantages: The double stack method consumes only two pointers per data element while the linked list method consumes space for a link object (8 to 10 words). The double stack method uses contiguous memory while the link objects are more fragmented. The stack method uses append() and pop() which are optimized to minimize memory management calls. For the link method, every link costs a malloc and free. Todo: Handle Wrappers that are GC'd before termination. Add support for holding an exception. """ class TeeMaster(object): """Holder for information common to wrapped iterators """ def __init__(self, it): self.inbasket = [] self.inrefcnt = [] self.outbasket = [] self.outrefcnt = [] self.numseen = 0 self.it = it self.numsplits = 0 class Wrapper(object): """Copyable wrapper around an iterator. Any number of Wrappers around the same iterator share the TeeMaster object. The Wrapper that is most behind will drop the refcnt to zero, which causes the reference to be popped off of the queue. The newest Wrapper gets a brand new TeeMaster object. Later wrappers share an existing TeeMaster object. Since they may have arrived late in the game, they need to know how many objects have already been seen by the wrapper. When they call next(), they ask for the next numseen. If a Wrapper is garbage-collected before it finishes, the refcount floor needs to be raised. That has not yet been implemented. """ __slots__ = ["master", "numseen"] def __init__(self, it, master=None): """Constructor. The master argument is used by __copy__ below.""" if master is None: master = TeeMaster(it) self.master = master self.numseen = master.numseen self.master.numsplits += 1 def __copy__(self): """Copy the iterator. This returns a new iterator that will return the same series of results as the original. """ return Wrapper(None, self.master) def __iter__(self): """All iterators should support __iter__() returning self.""" return self def next(self): """Get the next value of the iterator, or raise StopIteration.""" master = self.master inbasket, inrefcnt = master.inbasket, master.inrefcnt if master.numseen == self.numseen: # This is the lead dog so get a value through the iterator value = master.it.next() master.numseen += 1 # Save it for the other dogs inbasket.append(value) inrefcnt.append(master.numsplits-1) self.numseen += 1 return value # Not a lead dog -- the view never changes :-( location = len(inbasket) - (master.numseen - self.numseen) if location >= 0: # Our food is in the inbasket value = inbasket[location] inrefcnt[location] -= 1 rc = inrefcnt[location] else: # Our food is in the outbasket location = -(location + 1) value = master.outbasket[location] master.outrefcnt[location] -= 1 rc = master.outrefcnt[location] # Purge doggie bowl when no food is left if rc == 0: if len(master.outbasket) == 0: master.outbasket, master.inbasket = master.inbasket, master.outbasket master.outrefcnt, master.inrefcnt = master.inrefcnt, master.outrefcnt master.outbasket.reverse() master.outrefcnt.reverse() master.outbasket.pop() master.outrefcnt.pop() self.numseen += 1 return value def tee(it): """Replacement for Raymond's tee(); see examples in itertools docs.""" if not hasattr(it, "__copy__"): it = Wrapper(it) return (it, it.__copy__()) def test(): """A simple demonstration of the Wrapper class.""" import random def gen(): for i in range(10): yield i it = gen() a, b = tee(it) b, c = tee(b) c, d = tee(c) iterators = [a, b, c, d] while iterators != [None, None, None, None]: i = random.randrange(4) it = iterators[i] if it is None: next = "----" else: try: next = it.next() except StopIteration: next = "****" iterators[i] = None print "%4d%s%4s%s" % (i, " ."*i, next, " ."*(3-i)) if __name__ == "__main__": test()
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4