On 3/12/2011 8:47 PM, Glenn Linderman wrote:
> On 3/12/2011 2:09 PM, Terry Reedy wrote:
>> I believe that if the integer field were padded with leading blanks as
>> needed so that all are the same length, then no key would be needed.
>
> Did you mean that "if the integer field were" converted to string and
> "padded with leading blanks..."?

Guido presented a use case of a list of strings, each of form '%s,%d', where the %s part is a 'word'. 'Integer field' refers to the part of each string after the comma.

> Otherwise I'm not sure how to pad an integer with leading blanks.

The integers are already in string form. The *existing* key function his colleague used converted that part to an int as the second part of a tuple. I presume the integer field was separated by split(','), so the code was something like

def sikey(s):
    s,i = s.split(',')
    return s,int(i)

longlist.sort(key=sikey)

It does not matter if the splitting method is more complicated, because it is already part of the problem spec.

I proposed instead

def sirep(s):
    s,i = s.split(',') # or whatever current key func does
    return '%s,%#s' % (s,i)
    # where appropriate value of # is known from application

longlist = map(sirep, longlist)
longlist.sort()

# or assuming that a simple split is correct
longlist = ['%s,%#s' % tuple(s.split(',')) for s in longlist]
longlist.sort()

> Also, what appears to be your revised data structure, strval + ',' +
> '%5.5d' % intval , assumes the strval is fixed length, also.

No it does not, and need not. ',' precedes all letters in ascii order. (Ok, I assumed that the 'word' field does not include any of !"#$%&'()*+. If that is not true, replace comma with space or even a control char such as '\a' which even precedes \t and \n.) Given the context of Google, I assumed that 'word' meant word, as in a web document, while the int might be a position or doc number (or both).
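
As a rough check of the claim above, here is a minimal sketch comparing the key-function sort with a plain sort of blank-padded strings. The sample data and the names pad_int_field, records and width are made up for illustration. It assumes non-negative integers, words without blanks or any character that sorts before ',', and a width taken from the data rather than "known from application".

```python
def sikey(s):
    # Key-function approach: compare (word, int) tuples.
    word, num = s.split(',')
    return word, int(num)

def pad_int_field(s, width):
    # Padded-string approach: right-justify the integer field with
    # leading blanks so that all integer fields have the same length.
    word, num = s.split(',')
    return '%s,%*s' % (word, width, num)

# Hypothetical sample: 'a' and 'ab' share a prefix on purpose.
records = ['ab,7', 'a,1000', 'a,997', 'ab,12', 'a,5']

by_key = sorted(records, key=sikey)

# Assumption: the needed width is the longest integer field present.
width = max(len(s.split(',')[1]) for s in records)
padded = sorted(pad_int_field(s, width) for s in records)

# Remove the padding again to compare the two orderings.
unpadded = [p.replace(' ', '') for p in padded]
print(by_key == unpadded)
```

In this sample every 'a,...' entry sorts before every 'ab,...' entry because ',' compares lower than 'b', and within one word a blank compares lower than any digit, so shorter numbers come first.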
The important point is that the separator causes all word-int pairs with the same word to string-sort before all word-int pairs with the same word + a suffix. My example intentionally tested that.

> Consider the following strval, intval pairs, using your syntax:
>
> ['a,997, 1','a, 1000']
>
> Nothing says the strval wouldn't contain data that look like your
> structure...

The problem as presented. 'a,997' is not a word. In any case, as I said before, the method of correctly parsing the strings into two fields is already specified. I am only suggesting a change in how to proceed thereafter.

--
Terry Jan Reedy