In article <17342817.1052002018@[10.0.1.2]>, David Eppstein <eppstein@ics.uci.edu> wrote:

> On the other hand, if you really want to find the n best items in a data
> stream large enough that you care about using only space O(n), it might
> also be preferable to take constant amortized time per item rather than
> the O(log n) that heapq would use, and it's not very difficult nor does
> it require any fancy data structures.  Some time back I needed some Java
> code for this, haven't had an excuse to port it to Python.  In case
> anyone's interested, it's online at
> <http://www.ics.uci.edu/~eppstein/161/KBest.java>.

BTW, the central idea here is to use a random quicksort pivot to shrink
the list when it grows too large.  In Python, this could be done without
randomization as simply as

    def addToNBest(L, x, N):
        L.append(x)
        if len(L) > 2*N:
            L.sort()
            del L[N:]

It's not constant amortized time, because of the sort, but that is
probably more than made up for by the speed of the compiled sort versus
an interpreted randomized pivot.
-- 
David Eppstein                      http://www.ics.uci.edu/~eppstein/
Univ. of California, Irvine, School of Information & Computer Science
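A quick sanity check of the sketch above, comparing its result against
heapq.nsmallest (the driver code and variable names are mine, not from
the original post; note the list can hold up to 2*N items, so one final
sort-and-truncate is needed at the end):

```python
import heapq
import random

def addToNBest(L, x, N):
    """Keep roughly the N smallest items seen so far, amortized O(1) per item."""
    L.append(x)
    if len(L) > 2 * N:
        L.sort()
        del L[N:]  # discard everything past the N smallest

random.seed(0)
stream = [random.randrange(10000) for _ in range(500)]

best = []
for x in stream:
    addToNBest(best, x, 10)

# The list may still hold up to 2*N items; finish with one last prune.
best.sort()
del best[10:]

assert best == heapq.nsmallest(10, stream)
```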