Showing content from http://mail.python.org/pipermail/python-list/2010-March/720576.html below:

Sort Big File Help

mk mrkafk at gmail.com
Wed Mar 3 16:52:35 EST 2010
MRAB wrote:

> [snip]
> Simpler would be:
> 
>     lines = f.readlines()
>     lines.sort(key=lambda line: line[ : 3])
> 
> or even:
> 
>     lines = sorted(f.readlines(), key=lambda line: line[ : 3])

Sure, but a complete newbie (which is my impression of the OP) may not
know about lambda yet.

I expected my solution to be slower, but it's not (on a file with 
100,000 random string lines):

# time ./sort1.py

real    0m0.386s
user    0m0.372s
sys     0m0.014s

# time ./sort2.py

real    0m0.303s
user    0m0.286s
sys     0m0.017s


sort1.py:

#!/usr/bin/python

def sortit(fname):
     lines = open(fname).readlines()
     lines.sort(key = lambda x: x[:3])

if __name__ == '__main__':
     sortit('testfile.txt')



sort2.py:

#!/usr/bin/python

def sortit(fname):
     fo = open(fname)
     linedict = {}
     for line in fo:
         key = line[:3]
         linedict[key] = line
     sortedlines = []
     keys = linedict.keys()
     keys.sort()
     for key in keys:
         sortedlines.append(linedict[key])
     return sortedlines

if __name__ == '__main__':
     sortit('testfile.txt')


Any idea why? After all, I'm "manually" doing quite a lot: storing each
key in a dict, then sorting the dict's keys, then iterating over the keys
and looking up each dict value.
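
One sanity check that may be worth running before reading too much into the timings is whether both approaches return the same number of lines. The sketch below reproduces the two strategies on a small in-memory list. The function names and the sample data are made up for illustration and are not taken from testfile.txt. It uses sorted() rather than keys.sort() so that it runs on both Python 2 and Python 3.

```python
def sort_with_key(lines):
    # Same idea as sort1.py: sort all lines on their first three characters.
    # list.sort/sorted is stable, so lines with equal keys keep their order.
    return sorted(lines, key=lambda line: line[:3])


def sort_with_dict(lines):
    # Same idea as sort2.py: one dict slot per three-character key.
    # A later line with the same key replaces the earlier one.
    linedict = {}
    for line in lines:
        linedict[line[:3]] = line
    return [linedict[key] for key in sorted(linedict)]


# Hypothetical sample data: two of the three lines share the key "abc".
sample = ["abc first\n", "xyz other\n", "abc second\n"]

by_key = sort_with_key(sample)
by_dict = sort_with_dict(sample)

print(len(by_key))   # number of lines kept by the key-based sort
print(len(by_dict))  # number of lines kept by the dict-based version
```

If the two counts differ on the real file, the dict version is sorting fewer items than the key-based version, and the two timings are not measuring the same amount of work.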

Regards,
mk

