An enormous amount of research has been done on sorting since the last time I wrote a sort for Python.  Major developments have been in two areas:

1. Adaptive sorting.  Sorting algorithms are usually tested on random data, but in real life you almost never see random data.  Python's sort tries to catch some common cases of near-order via special-casing.  The literature has since defined more than 15 formal measures of disorder, and developed algorithms provably optimal in the face of one or more of them.  But this is O() optimality, and theoreticians aren't much concerned about how big the constant factor is.  Some researchers are up front about this, and toward the end of one paper with "practical" in its title, the author was overjoyed to report that an implementation was only twice as slow as a naive quicksort <wink>.

2. Pushing the worst-case number of comparisons closer to the information-theoretic limit (ceiling(log2(N!))).

I don't care much about #2 -- in experiments conducted when it was new, I measured the # of comparisons our samplesort hybrid did on random inputs, and it was never more than 2% over the theoretical lower bound, and typically closer.  As N grows large, the expected case provably converges to the theoretical lower bound.  There remains a vanishingly small chance for a bad case, but nobody has reported one, and at the time I gave up trying to construct one.

Back on Earth, among Python users the most frequent complaint I've heard is that list.sort() isn't stable.  Alex is always quick to trot out the appropriate DSU (Decorate Sort Undecorate) pattern then, but the extra memory burden for that can be major (a new 2-tuple per list element costs about 32 bytes, then 4 more bytes for a pointer to it in a list, and 12 more bytes that don't go away to hold each non-small index).

After reading all those papers, I couldn't resist taking a crack at a new algorithm that might be practical, and now have something you might call a non-recursive adaptive stable natural mergesort / binary insertion sort hybrid.  Playing with it so far, I've found two bad aspects compared to our samplesort hybrid:

+ It may require temp memory, up to 2*N bytes worst case (one pointer each for no more than half the array elements).

+ It gets *some* benefit for arrays with many equal elements, but not nearly as much as I was able to hack samplesort to get.  In effect, partitioning is very good at moving equal elements close to each other quickly, but merging leaves them spread across any number of runs.  This is especially irksome because we're sticking to Py_LT for comparisons, so we can't even detect a == b without comparing a and b twice; it's a deduction from not (a < b) and not (b < a), spelled out in a small sketch after these lists.  Given the relatively huge cost of comparisons, it's a timing disaster to compare twice unless that falls out naturally.  It was fairly natural to do so in samplesort, but not at all in this sort.

It also has good aspects:

+ It's stable (items that compare equal retain their relative order, so, e.g., if you sort first on zip code, and a second time on name, people with the same name still appear in order of increasing zip code; this is important in apps that, e.g., refine the results of queries based on user input).

+ The code is much simpler than samplesort's (but I think I can fix that <wink>).

+ It gets benefit out of more kinds of patterns, and without lumpy special-casing (a natural mergesort has to identify ascending and descending runs regardless, and then the algorithm builds on just that; a toy run-finder appears after this list).

+ Even though I haven't micro-optimized it, in the random case it's almost as fast as the samplesort hybrid.  In fact, it might have been a bit faster had I run tests yesterday (the samplesort hybrid got sped up by 1-2% last night).  This one surprised me the most, because at the time I wrote the samplesort hybrid, I tried several ways of coding mergesorts and couldn't make any of them as fast.

+ It has no bad cases (O(N log N) is the worst case; N-1 compares is the best).
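For concreteness, here's a minimal sketch of the DSU pattern (nobody's production code, just the idea): the index appended to each element is what buys stability, and it's also exactly where the 2-tuple, pointer and int overhead counted above comes from.

    # Decorate-Sort-Undecorate: wrap each element in a (value, index)
    # 2-tuple.  Ties on the values are then broken by original position,
    # so the overall effect is a stable sort by value.
    def dsu_sort(data):
        decorated = [(x, i) for i, x in enumerate(data)]   # decorate
        decorated.sort()                                   # sort
        data[:] = [x for x, i in decorated]                # undecorate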
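The equality deduction mentioned in the second bad point above, spelled out.  Each of the two comparisons can be an arbitrarily expensive user-defined __lt__ call, which is why doing this for every pair would be a timing disaster:

    # With only "<" available, a == b can only be inferred from two
    # comparisons: neither a < b nor b < a holds.
    def equal_via_lt(a, b):
        return not (a < b) and not (b < a)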
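Here's a toy version of the run-finding step a natural mergesort begins with.  It's not the code I have, just a sketch to show that "<" is all it needs; descending runs are taken strictly (each element less than the one before it) so reversing them in place can't swap equal elements:

    # Return the length of the natural run starting at a[lo] (lo < hi),
    # reversing it in place first if it starts out descending, so that
    # every run handed to the merge phase is ascending.
    def count_run(a, lo, hi):
        i = lo + 1
        if i == hi:
            return 1
        if a[i] < a[lo]:                        # strictly descending run
            while i + 1 < hi and a[i + 1] < a[i]:
                i += 1
            a[lo:i + 1] = a[lo:i + 1][::-1]     # make it ascending
        else:                                   # non-decreasing run
            while i + 1 < hi and not (a[i + 1] < a[i]):
                i += 1
        return i + 1 - lo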
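And the other half of the hybrid's name.  Again a sketch rather than the real thing, but it shows the shape: binary search cuts the number of comparisons needed to insert into an already-sorted prefix, and searching for the rightmost slot among equal elements keeps it stable:

    # Sort a[lo:hi] in place, using binary search to find each insertion
    # point and only "<" to compare.
    def binary_insertion_sort(a, lo, hi):
        for i in range(lo + 1, hi):
            x = a[i]
            left, right = lo, i
            while left < right:
                mid = (left + right) // 2
                if x < a[mid]:                  # x belongs before a[mid]
                    right = mid
                else:                           # x belongs at mid+1 or later
                    left = mid + 1
            a[left + 1:i + 1] = a[left:i]       # shift the tail right by one
            a[left] = x                         # drop x into its slot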
Here are some typical timings, taken from Python's sortperf.py, over identical lists of floats:

Key:
    *sort: random data
    \sort: descending data
    /sort: ascending data
    3sort: ascending data but with 3 random exchanges
    ~sort: many duplicates
    =sort: all equal
    !sort: worst case scenario

That last one was a worst case for the last quicksort Python had before it grew the samplesort, and it was a very bad case for that.  By sheer coincidence, turns out it's an exceptionally good case for the experimental sort:

samplesort
 i    2**i  *sort  \sort  /sort  3sort  ~sort  =sort  !sort
15   32768   0.13   0.01   0.01   0.10   0.04   0.01   0.11
16   65536   0.24   0.02   0.02   0.23   0.08   0.02   0.24
17  131072   0.54   0.05   0.04   0.49   0.18   0.04   0.53
18  262144   1.18   0.09   0.09   1.08   0.37   0.09   1.16
19  524288   2.58   0.19   0.18   2.34   0.76   0.17   2.52
20 1048576   5.58   0.37   0.36   5.12   1.54   0.35   5.46

timsort
15   32768   0.16   0.01   0.02   0.05   0.14   0.01   0.02
16   65536   0.24   0.02   0.02   0.06   0.19   0.02   0.04
17  131072   0.55   0.04   0.04   0.13   0.42   0.04   0.09
18  262144   1.19   0.09   0.09   0.25   0.91   0.09   0.18
19  524288   2.60   0.18   0.18   0.46   1.97   0.18   0.37
20 1048576   5.61   0.37   0.35   1.00   4.26   0.35   0.74

If it weren't for the ~sort column, I'd seriously suggest replacing the samplesort with this.  2*N extra bytes isn't as bad as it might sound, given that, in the absence of massive object duplication, each list element consumes at least 12 bytes (type pointer, refcount and value) + 4 bytes for the list pointer.  Add 'em all up and that's a 13% worst-case temp memory overhead.
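In case the 13% looks mysterious, here's the arithmetic behind it, using the same 32-bit sizes as above:

    # At most 2 temp bytes per element, against at least 12 + 4 = 16
    # bytes per element that the list occupies anyway.
    overhead = 2.0 / (12 + 4)    # 0.125, i.e. about 13% worst case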