On Thu, 25 May 2006 15:01:36 +0000, Runar Petursson <runar at runar.net> wrote:
>We've been talking this week about ideas for speeding up the parsing of
>Longs coming out of files or network.  The use case is having a large string
>with embedded Longs and parsing them to real longs.  One approach would be
>to use a simple slice:
>
>long(mystring[x:y])
>
>an expensive operation in a tight loop.  The proposed solution is to add
>further keyword arguments to Long (such as):
>
>long(mystring, base=10, start=x, end=y)
>
>The start/end would allow for negative indexes, as slices do, but otherwise
>simply limit the scope of the parsing.  There are other solutions, using
>buffer-like objects and such, but this seems like a simple win for anyone
>parsing a lot of text.  I implemented it in a branch runar-longslice-branch,
>but it would need to be updated with Tim's latest improvements to long.
>Then you may ask, why not do it for everything else parsing from string--to
>which I say it should.  Thoughts?

This really seems like a poor option.  Why fix the problem with a hundred
special cases instead of a single general solution?

Hmm, one reason could be that the general solution doesn't work:

exarkun at kunai:~$ python
Python 2.4.3 (#2, Apr 27 2006, 14:43:58)
[GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> long(buffer('1234', 0, 3))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: null byte in argument for long()
>>> long(buffer('123a', 0, 3))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: invalid literal for long(): 123a
>>>

Still, fixing that seems like a better idea. ;)

Jean-Paul
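
For concreteness, a minimal sketch of the two approaches under discussion.
Note that the start/end keyword arguments shown second are the *proposed*
extension from the quoted message, not something any released Python accepts;
the names and values here are illustrative only:

    # Today: slice first, then parse.  The slice allocates a new string
    # object, which is the per-call overhead the proposal wants to avoid
    # in a tight loop over a large buffer.
    data = "id=1234;next"
    x, y = 3, 7
    value = long(data[x:y])      # works now, but copies data[x:y] first

    # Proposed (hypothetical): let long() parse a substring in place,
    # bounded by start/end, with negative indexes behaving like slices.
    # value = long(data, base=10, start=x, end=y)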