[Barry Warsaw] > Except that in > > http://mail.python.org/pipermail/python-dev/2002-July/026837.html > > Tim says: > > "Back on Earth, among Python users the most frequent complaint > I've heard is that list.sort() isn't stable." Yes, and because the current samplesort falls back to a stable sort when lists are small, almost everyone who cares about this and tries to guess about stability via trying small examples comes to a wrong conclusion. > and here > > http://mail.python.org/pipermail/python-dev/2002-July/026854.html > > Tim seems <wink> to be arguing against stable sort as being the > default due to code bloat. I'm arguing there against having two highly complex and long-winded sorting algorithms in the core. Pick one. In favor of samplesort: + It can be much faster in very-many-equal-elements cases (note that ~sort lists have only 4 distinct values, each repeated N/4 times and spread uniformaly across the whole list). + While it requires some extra memory, that lives on the stack and is O(log N). As a result, it can never raise MemoryError unless a comparison function does. + It's never had a bug reported against it (so is stable in a different sense <wink>). In favor of timsort: + It's stable. + The code is more uniform and so potentially easier to grok, and because it has no random component is easier to predict (e.g., it's certain that it has no quadratic-time cases). + It's incredibly faster in the face of many more kinds of mild disorder, which I believe are very common in the real world. As obvious examples, you add an increment of new data to an already- sorted file, or paste together several sorted files. timsort screams in those cases, but they may as well be random to samplesort, and the difference in runtime can easily exceed a factor of 10. A factor of 10 is a rare and wonderful thing in algorithm development. Against timsort: + It can require O(N) temp storage, although the constant is small compared to object sizes. That means it can raise MemoryError even if a comparison function never does. + Very-many-equal-elements cases can be much slower, but that's partly because it *is* stable, and preserving the order of equal elements is exactly what makes stability hard to achieve in a fast sort (samplesort can't be made stable efficiently). > As Tim's Official Sysadmin, I'm only good at channeling him on one > subject, albeit probably one he'd deem most important to his life: > lunch. So I'm not sure if he's arguing for or against stable sort > being the default. ;) All else being equal, a stable sort is a better choice. Alas, all else isn't equal. If Python had no sort method now, I'd pick timsort with scant hesitation. Speaking of which, is it time for lunch yet <wink>?
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4