On Wed, Sep 08, 1999 at 05:37:38PM +0200, Hrvoje Niksic wrote:
> As a Python exercise, I wrote a simple program to "scratch an itch",
> i.e. do something useful. However, I found that Python's lack of
> speed really bytes me here, so I'd like to hear suggestions for
> speedup. People who don't like that kind of topic, please skip to the
> following article. Others, read on.
...
> The program was quite easy to write, and easy to read afterwards. The
> problem is that it is also quite slow. On my system, it takes about
> 27 CPU seconds (as reported by `time' shell builtin) to do the work,
> which can extend to more than a minute of real time, depending on the
> system load.
>
> As a comparison, the equivalent Perl program does the same thing in 9
> CPU seconds. I tried everything I knew to make the Python version
> fast. I tried to use `re' to avoid returning headers other than the
> ones we're interested in. I tried changing self.__current to just
> current to avoid a dictionary lookup. I tried to make self.__current
> a list, to avoid the expensive `current = current + line' operation.
> All of these things made the program measure slower.
>
> I would really appreciate some suggestions. The code is not large,
> and is (I hope) rather elegant. I am a Python beginner, so I'd also
> appreciate tips on Python style and OO technique. I'll post/mail the
> Perl equivalent on demand.

Hi,

the following code is about five times faster here. That means it's
faster than Perl, I suppose :-). It is of course not quite as general
as yours, but it seems to fit the job nicely. For this kind of
problem, I usually don't bother with OO.

The main speed advantage seems to be that each file is read in one go
rather than processed line by line, which of course uses a lot more
memory.

HTH,
Robert

#! /usr/bin/env python
import string

def get_installed():
    # Read the whole status file at once; entries are separated by blank lines.
    fstatus = open('/var/lib/dpkg/status', 'r')
    status = fstatus.read()
    fstatus.close()
    status = string.split(status, '\n\n')
    installed = []
    for package in status:
        fields = string.split(package, '\n')
        name = fields[0][9:]    # skip the leading 'Package: '
        for line in fields:
            if line[:7] == 'Status:':
                # The last word of the Status field is the package state.
                if string.split(line[8:])[-1] == 'installed':
                    installed.append(name)
                break
    return installed

def get_sizes(packages):
    favailable = open('/var/lib/dpkg/available', 'r')
    available = favailable.read()
    favailable.close()
    available = string.split(available, '\n\n')
    results = []
    for package in available:
        fields = string.split(package, '\n')
        name = fields[0][9:]    # skip the leading 'Package: '
        if name in packages:
            for line in fields:
                if line[:15] == 'Installed-Size:':
                    # Append a single (name, size) tuple.
                    results.append((name, int(line[16:])))
                    break
    return results

def main():
    results = get_sizes(get_installed())
    # Sort by size, largest first.
    results.sort(lambda a, b: cmp(b[1], a[1]))
    for r in results:
        print '%s: %d' % r

if __name__ == '__main__':
    main()

--
Robert Vollmert                                  rvollmert at gmx.net
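
For readers who want to try the same idea without touching a real dpkg database, here is a minimal sketch in present-day Python 3 rather than the Python of the original post. It follows the approach of the script above (take the whole text, split it on blank lines, then pick out the fields of interest), but the helper names parse_stanzas and installed_sizes and the sample data are made up for illustration and are not part of the original program. Continuation lines are simply skipped, which is an assumption that holds for the three fields used here.

```python
def parse_stanzas(text):
    """Split dpkg-style control text into a list of {field: value} dicts."""
    stanzas = []
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.split("\n"):
            # Continuation lines start with whitespace; this sketch skips them.
            if not line or line[0].isspace() or ":" not in line:
                continue
            key, _, value = line.partition(":")
            fields[key] = value.strip()
        if fields:
            stanzas.append(fields)
    return stanzas


def installed_sizes(status_text, available_text):
    """Return (name, size) pairs for installed packages, largest first."""
    # A set gives constant-time membership tests on average.
    installed = {
        s["Package"]
        for s in parse_stanzas(status_text)
        if "Package" in s and s.get("Status", "").split()[-1:] == ["installed"]
    }
    sizes = [
        (s["Package"], int(s["Installed-Size"]))
        for s in parse_stanzas(available_text)
        if s.get("Package") in installed and "Installed-Size" in s
    ]
    sizes.sort(key=lambda pair: pair[1], reverse=True)
    return sizes


# Made-up sample data standing in for the two dpkg files.
STATUS = """Package: foo
Status: install ok installed

Package: bar
Status: deinstall ok config-files

Package: baz
Status: install ok installed
"""

AVAILABLE = """Package: foo
Installed-Size: 120

Package: bar
Installed-Size: 999

Package: baz
Installed-Size: 4500
"""

for name, size in installed_sizes(STATUS, AVAILABLE):
    print("%s: %d" % (name, size))
```

One design difference from the script above: the installed names are kept in a set, so the "is this package installed?" test does not scan a list for every entry in the available file. Whether that matters in practice depends on how many packages are installed.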