A short discussion and evaluation of different parsers.
Please keep wiki links as wiki links; use external links only if there is no existing page for the tool.
Name
Grammar
Module
Python
Comment
C
included in the main Python distribution
Python
2.7+, 3.3+, PyPy
Tool that takes grammars in an EBNF variant and outputs memoizing (Packrat) PEG parsers in Python. Grako differs from other PEG parser generators in that the generated parsers use Python's very efficient exception-handling system to backtrack.
Python/Regex
2.x, 3.x
Combines Regular Expressions
C
Lexical analysis module for Python, foundation for Pyrex and Cython. Plex 2.0.0 is Python 2 only; the version embedded in Cython works in Python 3.x. There is also an experimental port to Python 3 (tested on Python 3.3).
Earley Parser
Python
2.3+, 3.2+, PyPy
PyPI package; GitHub project. This parser is notably used in decompilers like uncompyle6, where using an ambiguous grammar is desirable.
LL(1)
Python
1-any, 2-1.5+
LR(1) LALR(1)
C
C
Bison grammar with Python code actions
LR
1.5.1+
SLR LALR(1)
Python
Python Lex-Yacc
2.2+
GLR
C
2.2+
grammar in doc strings
GLR
Python
2.2.1
PEG
Python
2.5+
PEG
Python
2.5+
-
2.0+
requires mxTextTools
Python
2.0+
requires mxTextTools
-
C
Not a parser in the usual sense, but a fast text-processing engine
Python
2.2+
Python
2.6+
Parser combinator library, similar to pyparsing
LL1+
Python
stand-alone tool in Java. Latest version can produce Python code
LR(0) LR(1) SLR LALR(1)
Python
2.2+
Python
Object-oriented, Pythonic parsing
LR(1)
Python
2.5+
LL(1)
Python
uses separate grammar files
Python
inspired by pyparsing and boost::spirit
LR(1)
Python
2.4+
has separate parser input file, parser output is a parse tree
na
Python
2.6+
Simple parser using rules defined in BNF format
Any
Python
2.6+,3+
Recursive descent with full backtracking and optional memoisation (which can handle left-recursive grammars). Equivalent in power to GLR, but based on an LL(k) core.
GLR
Python
3.1+
Recursive descent parser with full backtracking. Grammar elements and results are defined as Python classes, so are fully customizable. Supports ambiguous grammars.
LL(*)
Python
2.4+
Recursive descent parsing library for Python based on functional combinators
-
Python
2.7+ 3+
LR(1)
Python
2.6+
A fast parser and lexer combination with a concise Pythonic interface. Lots of documentation, including example parsers for SQL and Lua.
PEG
Python
2.7+, 3.2+
Packrat parser. Works as interpreter. Multiple syntaxes for grammar definition. Lots of docs, examples and tutorials.
Python
2.7+, 3.2+
A high-level meta-language/parser for Domain-Specific Language implementation. Built on top of Arpeggio parser. Inspired by XText. Documentation, examples and tutorials available.
LR
Python
3.2+
A fast, stand-alone parser which can export a grammar to JavaScript (jsleri), Go (goleri), C (libcleri) or Java (jleri).
LR/GLR
Python
2.7+, 3.3+
A pure Python LR/GLR parser with integrated scanner (scannerless). Grammar in BNF format. Automata/GLR trace visualization. Full documentation and examples available.
LALR(1), CFG
Python
2.7, 3.4+
LALR(1) for speed or Earley parser for any context-free grammar.
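Several comments in the table mention packrat (memoizing) PEG parsing and backtracking through Python exceptions. The following is a minimal hand-written sketch of that idea, not the API or generated output of any tool listed above. The grammar, the class name `ArithmeticParser`, and the helpers `apply`, `choice` and `literal` are all hypothetical names chosen for illustration.

```python
import re


class ParseFailure(Exception):
    """Raised when a rule does not match at the current position."""


class ArithmeticParser:
    """Illustrative packrat parser for the PEG:

        expr     <- addition / term
        addition <- term '+' expr
        term     <- number / '(' expr ')'
        number   <- [0-9]+
    """

    NUMBER = re.compile(r"\d+")

    def __init__(self, text):
        self.text = text
        self.pos = 0
        # (rule name, start position) -> ("ok", value, end) or ("fail", message)
        self.memo = {}

    def apply(self, rule):
        """Run a rule at the current position, caching success or failure."""
        key = (rule.__name__, self.pos)
        if key in self.memo:
            cached = self.memo[key]
            if cached[0] == "fail":
                raise ParseFailure(cached[1])
            _, value, self.pos = cached
            return value
        start = self.pos
        try:
            value = rule()
        except ParseFailure as failure:
            self.pos = start  # backtrack to where the rule started
            self.memo[key] = ("fail", str(failure))
            raise
        self.memo[key] = ("ok", value, self.pos)
        return value

    def choice(self, *rules):
        """PEG ordered choice: the first alternative that matches wins."""
        start = self.pos
        for rule in rules:
            try:
                return self.apply(rule)
            except ParseFailure:
                self.pos = start
        raise ParseFailure(f"no alternative matched at position {start}")

    def literal(self, token):
        if not self.text.startswith(token, self.pos):
            raise ParseFailure(f"expected {token!r} at position {self.pos}")
        self.pos += len(token)

    def expr(self):
        return self.choice(self.addition, self.term)

    def addition(self):
        left = self.apply(self.term)
        self.literal("+")
        return left + self.apply(self.expr)

    def term(self):
        return self.choice(self.number, self.group)

    def number(self):
        match = self.NUMBER.match(self.text, self.pos)
        if match is None:
            raise ParseFailure(f"expected a number at position {self.pos}")
        self.pos = match.end()
        return int(match.group())

    def group(self):
        self.literal("(")
        value = self.apply(self.expr)
        self.literal(")")
        return value


def parse(text):
    """Parse the whole input and return the evaluated sum."""
    parser = ArithmeticParser(text)
    value = parser.apply(parser.expr)
    if parser.pos != len(text):
        raise ParseFailure(f"unexpected input at position {parser.pos}")
    return value
```

When `addition` fails after having matched a `term`, the exception unwinds to `choice`, which retries `term` at the same position; that second attempt is answered from the memo table instead of being re-parsed. Real parser generators add error reporting, left-recursion handling and richer grammar syntax on top of this basic scheme.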
For faster performance, one may use other parser generator systems and plug them in as modules.
For example:
Spirit (http://spirit.sourceforge.net/) framework for writing EBNF as C++ code
FlexBisonModule (http://www.crsr.net/Software/FBModule.html)
cocktail compiler tools approach
An example of such usage is SeeGramWrap, available from Edward C. Jones' Python page, which is a heavily revised and upgraded version of the ANTLR C parser that is in cgram (broken link). The latest version has been refactored to move some of the complexity from ANTLR to Python.
Martin von Loewis presented a paper at Python10, titled "Towards a Standard Parser Generator" that surveyed the available parser generators for Python.
Additional information on these and other parsers is available at Python Parsing Tools.
Books: Complete online textbook, titled "Parsing: A Practical Guide".