On 3/12/2011 3:44 PM, Guido van Rossum wrote:
> I was just reminded that in Python 3, list.sort() and sorted() no
> longer support the cmp (comparator) function argument. The reason is
> that the key function argument is always better. But now I have a
> nagging doubt about this:
>
> I recently advised a Googler who was sorting a large dataset and
> running out of memory. My analysis of the situation was that he was
> sorting a huge list of short lines of the form "shortstring,integer"
> with a key function that returned a tuple of the form ("shortstring",
> integer).

I believe that if the integer field were padded with leading blanks as
needed so that all are the same length, then no key would be needed.

    ll = ['a,11111', 'ab,    3', 'a,    1', 'a,  111']
    ll.sort()
    print(ll)
    >>>
    ['a,    1', 'a,  111', 'a,11111', 'ab,    3']

If most ints are near the max len, this would add little space, and be
even faster than with the key.

> Using the key function argument, in addition to N short
> string objects, this creates N tuples of length 2, N more slightly
> shorter string objects, and N integer objects. (Not to count a
> parallel array of N more pointers.) Given the object overhead, this
> dramatically increased the memory usage. It so happens that in this
> particular Googler's situation, memory is constrained but CPU time is
> not, and it would be better to parse the strings over and over again
> in a comparator function.

Was 3.2 used? It has a patch that reduces the extra memory that might
not be in the last 3.1 release.

> But in Python 3 this solution is no longer available. How bad is that?
> I'm not sure. But I'd like to at least get the issue out in the open.

This removal has been one of the more contentious issues about (not)
using 3.x. I believe Raymond had been more involved in the defense of
the decision than I. However, the discussion/complaint has mostly been
about the relative difficulty of writing a key function versus a compare
function.

--
Terry Jan Reedy
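
P.S. A minimal sketch of the padding idea above, in case it helps. The helper names (pad_record, sort_padded) are hypothetical, and it assumes every line has the form "name,integer" with a non-negative integer. It also assumes the names contain no characters that sort below the comma (such as a space), since otherwise the plain string order can differ from the tuple-key order.

```python
def pad_record(line, width):
    """Right-justify the integer field of a 'name,integer' line to `width`."""
    name, _, number = line.partition(",")
    return name + "," + number.strip().rjust(width)


def sort_padded(lines):
    """Pad every integer field to a common width, then sort with no key."""
    # Width of the longest integer field in the data set.
    width = max(len(line.partition(",")[2].strip()) for line in lines)
    padded = [pad_record(line, width) for line in lines]
    # Plain string comparison now orders by name first, then by integer,
    # because equal-width, blank-padded digits compare like numbers.
    padded.sort()
    return padded


records = ["a,11111", "ab,3", "a,1", "a,111"]
result = sort_padded(records)
print(result)
```

The sort itself allocates no per-item tuples or integer objects; the cost is the extra blanks stored in each string, which is small when most integers are already near the maximum length.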