[Guido]
> But is everyone's first thought to time the speed of Python vs. Perl?

It's few people's first thought.  It's impossible for bilingual programmers (or dabblers, or evaluators) not to notice *soon*, though, because:

> Why does it hurt so much that this is a bit slow?

Factors of 2 to 5 aren't "a bit" -- they're obvious when they happen, but the *cause* is not.  To judge from a decade of c.l.py gripes, most people write it off to "huh -- guess Python is just slow"; the rest eventually figure out that their text input is the bottleneck (Tom Christiansen never got this far <0.5 wink>), but then don't know what to do about it.

At this point I'm going to insert two anonymized pvt emails from last year:

-----Original Message #1-----
From: TTT
Sent: Monday, March 13, 2000 2:29 AM
To: GGG
Subject: RE: [Python-Help] C, C++, Java, Perl, Python, Rexx, Tcl comparison

GGG, note especially figure 4 in Lutz Prechelt's report:

> http://wwwipd.ira.uka.de/~prechelt/Biblio/#jccpprtTR

The submitted Python programs had by far the largest variability in how long it took to load the dictionary.  My input loop is probably typical of the "fast" Python programs, which indeed beat most (but not all) of the fastest Perl ones here:

class Dictionary:
    ...
    def fill_from_file(self, f, BUFFERSIZE=500000):
        """f, BUFFERSIZE=500000 -> fill dictionary from file f.

        f must be an open file, or other object with a readlines()
        method.  It must contain one word per line.  Optional arg
        BUFFERSIZE is used to chunk up input for efficiency, and is
        roughly the # of bytes read at a time.
        """
        addword = self.addword
        while 1:
            lines = f.readlines(BUFFERSIZE)
            if not lines:
                break
            for line in lines:
                addword(line[:-1])   # chop trailing newline

Comparable Perl may have been the one-liner:

    grep(&addword, chomp(<>));

which may account for why Perl's memory use was uniformly higher than Python's.  Whatever, you really need to be a Python expert to dream up "the fast way" to do Python input!  Hire me, and I'll fix that <wink>.

nothing-like-blackmail-before-going-to-bed-ly y'rs  - TTT

-----Original Message #2-----
From: GGG
Sent: Monday, March 13, 2000 7:08 AM
To: TTT
Subject: Re: [Python-Help] C, C++, Java, Perl, Python, Rexx, Tcl comparison

Agreed.  readlines(BUFFERSIZE) is a crock.  In fact, ``for i in f.readlines()'' should use lazy evaluation -- but that will have to wait for Py3K unless we add hints so that readlines knows it is being called from a for loop.

--GGG

-----Back to 2001-----

I took TTT's advice and read Lutz's report <wink>.  I agree with GGG that hiding this in .readlines() would be maximally elegant.  xreadlines supplies most of the lazy machinery GGG favored.  I don't know how hard it would be to supply the rest of it, but it's such a frequent bitching point that I would prefer pointing people to an explicit .xreadlines() hack over either (a) trying to convince them that they "shouldn't" care about the speed as much as they claim to, or (b) trying to explain the double-loop buffering method.  I'd personally rather use an explicit .xreadlines() hack than code the double-loop buffering too, and don't see an obvious way to do better than that right now.

>> reading-text-files-is-very-common-ly y'rs  - tim

> So is worrying about performance without a good reason...

Indeed it is.
I'm persuaded that many people making this specific complaint have a legitimate need for more speed, though, and that many don't persist with Python long enough to find out how to address this complaint (because the double-loop method is too obscure for a newbie to dream up).  That makes this hack score extraordinarily high on my benefit/harm ratio scale (in P3K xreadlines can be deprecated in favor of readlines <0.9 wink>).

heck-it-doesn't-even-require-a-new-keyword-ly y'rs  - tim
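
A minimal sketch of the explicit .xreadlines() hack being weighed above, written against the fill_from_file() example from TTT's mail; the standalone fill_from_file_lazy name and its dictionary parameter are illustrative only, and file.xreadlines() is the Python 2.1 spelling (the xreadlines module provides the same thing for older file-like objects):

    def fill_from_file_lazy(dictionary, f):
        """Hypothetical variant of Dictionary.fill_from_file that uses
        the explicit .xreadlines() hack instead of double-loop buffering.

        xreadlines reads the file in large chunks behind the scenes but
        hands back one line at a time, so the caller keeps a single,
        obvious loop.  Requires Python 2.1 or later.
        """
        addword = dictionary.addword      # same per-word callback as above
        for line in f.xreadlines():
            addword(line[:-1])            # chop trailing newline

The trade-off is the one the thread describes: the laziness GGG wanted hidden inside readlines() is exposed under a separate name instead, so the common case stays a single loop without waiting for readlines() itself to become lazy.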