On Fri, Jul 13 2001 "Fredrik Lundh" wrote:

> sjoerd wrote:
>
> > This is not for the faint of heart.
> >
> > My validating XML parser doesn't work anymore, even though I didn't
> > change a thing (except update Python from CVS).
>
> when did you last update without problems?

I have no idea.  I update regularly (only on the main branch), but I
don't run the program very often.

> the likely cause for this is MvL's "big char set" patch, which
> I checked in on July 6.
>
> here's a workaround: tweak sre_compile.py so it doesn't generate
> BIGCHARSET op codes.  in _optimize_charset, change this:
>
>     except IndexError:
>         # character set contains unicode characters
>         return _optimize_unicode(charset, fixup)
>     # compress character map
>
> to
>
>     except IndexError:
>         # character set contains unicode characters
>         return charset # WORKAROUND: no compression
>     # compress character map
>
> I'll look into this over the weekend.

Yes, this works.

While you're looking at this, maybe you can also look at speeding up
stuff? :-)

Importing the module with my XML parser takes an inordinate amount of
time.  This is entirely due to compiling all the regular expressions.
There are a lot of them, and since many of them use the _Name pattern
that I included in my previous message, they tend to be big.
Unfortunately, I can't use any abbreviations that re might provide for
Unicode character sets, since then I don't know for sure that my
expressions are compatible with the XML definition.

Maybe it's possible to add a way of saving precompiled expressions in
the Python file?

-- 
Sjoerd Mullender <sjoerd.mullender@oratrix.com>
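
One way to reduce the import-time cost described above, short of storing precompiled expressions, is to defer each compilation until the pattern is first used. The following is a minimal sketch and not something proposed in the thread: `LazyPattern`, `starttag`, and the simplified `_Name` string are hypothetical names used only for illustration, and the real `_Name` pattern is far larger.

```python
import re


class LazyPattern:
    """Compile a regular expression on first use instead of at import time."""

    def __init__(self, pattern, flags=0):
        self.pattern = pattern
        self.flags = flags
        self._compiled = None  # filled in by the first call that needs it

    def _get(self):
        # Compile once, then reuse the compiled object on later calls.
        if self._compiled is None:
            self._compiled = re.compile(self.pattern, self.flags)
        return self._compiled

    def match(self, string, pos=0):
        return self._get().match(string, pos)

    def search(self, string, pos=0):
        return self._get().search(string, pos)


# Hypothetical, heavily simplified stand-in for the XML _Name pattern.
_Name = "[a-zA-Z_:][-a-zA-Z0-9._:]*"

# Creating the object costs almost nothing; no regex is compiled yet.
starttag = LazyPattern("<(?P<tagname>" + _Name + ")")
```

With this arrangement the module imports quickly, and only the expressions a given document actually exercises are compiled. The total compile cost is unchanged if every pattern ends up being used, so this moves the delay rather than removing it.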