[Gordon McMillan]
> mxTextTools lets (encourages?) you to break all
> the rules about lex -> parse. If you can (& want to)
> put a good deal of the "parse" stuff into the scanning
> rules, you can get a speed advantage. You're also
> not constrained by the rules of BNF, if you choose
> to see that as an advantage :-).
>
> My one successful use of mxTextTools came after
> using SPARK to figure out what I actually needed
> in my AST, and realizing that the ambiguities in the
> grammar didn't matter in practice, so I could produce
> an almost-AST directly.

I don't expect anyone will have much luck writing a fast lexer using mxTextTools *or* Python's regexp package unless they know quite a bit about how each works under the covers, and about how fast lexing is accomplished by DFAs. If you know both, you can build a DFA by hand and painfully instruct mxTextTools in the details of its construction, and get a very fast tokenizer (compared to what's possible with re), regardless of the number of token classes or the complexity of their definitions. Writing to mxTextTools directly is a lot like writing in an assembly language for a character-matching machine, with all the pains and potential joys that implies.

If I were Eric, I'd use Flex <wink>.
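
For readers who have not seen one, a hand-built DFA tokenizer can be sketched as below. This is only an illustration of the table-driven idea in plain Python. It does not use mxTextTools or its tag tables, and the state names, token classes, and the `char_class` and `tokenize` helpers are all made up for the example. Pure Python like this will not be fast; the point is the shape of the machine you would have to spell out by hand.

```python
# Hypothetical hand-built DFA; states and token names are illustrative only.
START, IN_IDENT, IN_INT, IN_SPACE = range(4)

# Token class emitted when the DFA stops in a given accepting state.
# None means "recognized, but do not emit" (whitespace is skipped).
ACCEPTING = {IN_IDENT: "IDENT", IN_INT: "INT", IN_SPACE: None}


def char_class(ch):
    """Collapse characters into the few classes the DFA distinguishes."""
    if ch.isalpha() or ch == "_":
        return "letter"
    if ch.isdigit():
        return "digit"
    if ch.isspace():
        return "space"
    return "other"


# Transition table: (state, character class) -> next state.
# A missing entry means "no move", which ends the current token.
TRANSITIONS = {
    (START, "letter"): IN_IDENT,
    (START, "digit"): IN_INT,
    (START, "space"): IN_SPACE,
    (IN_IDENT, "letter"): IN_IDENT,
    (IN_IDENT, "digit"): IN_IDENT,
    (IN_INT, "digit"): IN_INT,
    (IN_SPACE, "space"): IN_SPACE,
}


def tokenize(text):
    """Run the DFA repeatedly, taking the longest match at each position."""
    tokens = []
    pos = 0
    while pos < len(text):
        state = START
        end = pos
        # Follow transitions until the table has no move for the next char.
        while end < len(text):
            nxt = TRANSITIONS.get((state, char_class(text[end])))
            if nxt is None:
                break
            state = nxt
            end += 1
        if state == START:
            # Not even one character was consumed: no token starts here.
            raise ValueError(f"unexpected character {text[end]!r} at offset {end}")
        kind = ACCEPTING[state]
        if kind is not None:
            tokens.append((kind, text[pos:end]))
        pos = end
    return tokens
```

Each input character costs one table lookup no matter how many token classes exist, which is the property that makes DFA-based lexers fast. Adding a token class means adding states and transitions rather than another alternative to try in sequence.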