Just for fun, although someone may find it useful <shudder>. The new heapq module implements a classic min-heap, a binary tree where the value at each node is <= the values of its children. I mentioned weak heaps before (in the context of sorting), where that condition is loosened to cover just the right child (and the root node of the whole weak heap isn't allowed to have a left child). This is more efficient, but the code is substantially trickier. A different kind of weakening is the so-called "pairing heap" (PH). This is more like the classic (strong) heap, except that it's a general tree, with no constraint on how many children each node can have (0, 1, 2, ..., thousands). The parent value simply has to be <= the values of all its children (if any). Leaving aside storage efficiency, the code for this is substantially simpler than even for classic heaps: two PHs can be merged in constant time, with a single compare (whichever PH has the larger root value simply becomes another child of the other PH's root node). The code below uses a funky representation where a PH is just a list L, where L[0] is the value associated with the PH's root node (which always has the smallest value in the tree), and the rest of the list consists of 0 or more child PHs (which are again lists of the same form). All the usual heap operations build on this simple "pairing" operation, called _link below. Pushing an element x on the heap consists of viewing x as the 0-child PH [x], and one link step completes merging it with the existing PH. Any collection of N values can thus be turned into a PH using exactly N-1 compares. A pop seems scary at first, since we may have one root node with N-1 children, and then it will take at least N-2 pairing steps to turn the remaining forest of PHs back into a single PH. Indeed, this happens if you feed the numbers 1..N into an empty PH in order (each of 2 thru N becomes a direct child of 1). There are many ways the forest-merge step can done; the code below implements a common way, with the remarkable property that, despite the possibility for an O(N) pop step, the amortized cost for N pops is worst-case O(log N). In the "bad example" of inserting 1 thru N in order, it actually turns out to be amortized constant time (it doesn't matter how big N is, there's an independent (and small) constant c such that the N pops take no more than c*N compares). You have see that to believe it, though <wink>. PHs are an active area of current research. They appear to have many remarkable "adaptive" properties, but it seems difficult to prove or disprove interesting general conjectures. Playing around with the code and a class that counts __cmp__ invocations, it's not hard to find cases of partially ordered data where "pairing heap sort" does fewer compares than our new mergesort. OTOH, PHs do substantially worse than classic heaps on # of compares when data is fed in randomly, and classic heaps in turn do substantially worse on random data than our mergesort. Still, if there are bursts of order in your data, and you can afford the space, using a PH priority queue can be much faster than using a classic heap. Indeed, if you feed the numbers from 1 through N in *reverse* order, then pop them off one at a time, it turns out that the PH queue doesn't need to compare after any of the pops -- the N-1 compares at the start are the whole banana. Have fun! def _link(x, y): if x[0] <= y[0]: x.append(y) return x else: y.append(x) return y def _merge(x): n = len(x) if n == 1: return [] pairs = [_link(x[i], x[i+1]) for i in xrange(1, n-1, 2)] if n & 1 == 0: pairs.append(x[-1]) x = pairs[-1] for i in xrange(len(pairs)-2, -1, -1): x = _link(pairs[i], x) return x class Heap(object): __slots__ = 'x' def __init__(self): self.x = [] def __nonzero__(self): return bool(self.x) def push(self, value): if self.x: self.x = _link(self.x, [value]) else: self.x.append(value) def pop(self): result = self.x[0] # raises IndexError if empty self.x = _merge(self.x) return result
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4