While fuzz-testing the Python-Markdown library (version 3.8) with the extra extension enabled, I found that certain malformed inputs containing <![ sequences cause the parser to raise uncaught exceptions and crash. These inputs appear to break the parser's handling of XML-style marked sections, leading to errors like "expected name token" or "unknown status keyword".
Reproduction script (with the extra extension):

    import markdown

    print(markdown.__version__)

    crash_inputs = [
        "<![",
        "<![>og))/uw_ f{tv+pAr$Ss+[6;^{=<:>g2oV|.pdTMu(Q-E#",
        "<![ g'\"7z5r7cojSO;2LAo0(1Vv5G>,-P",
    ]

    for i, crash_input in enumerate(crash_inputs, 1):
        print(f"\nTesting crash input #{i}:\n{repr(crash_input)}")
        try:
            output = markdown.markdown(crash_input, extensions=["extra"])
            print("No crash, output:")
            print(output)
        except Exception as e:
            print(f"Crash confirmed! Exception:\n{e}")
The parser should gracefully handle or sanitize malformed <![ marked sections without raising uncaught exceptions.
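Until the parser handles these inputs itself, a caller can guard against the crash on their side. The following is only a minimal sketch of such a stopgap, not a proposed fix for the library: `render_or_escape` is a hypothetical helper name, and it takes the renderer as a callable so it does not depend on any particular Markdown version. It catches `Exception` broadly, which is an assumption that may be too coarse for some applications.

```python
import html


def render_or_escape(text, render):
    """Return render(text); if the renderer raises, fall back to escaped text.

    `render` is any callable that maps a string to HTML. This helper is
    hypothetical and is not part of Python-Markdown.
    """
    try:
        return render(text)
    except Exception:
        # Fallback: show the raw input, HTML-escaped, instead of crashing.
        return "<pre>" + html.escape(text) + "</pre>"


# Possible usage with Python-Markdown (requires the `markdown` package):
#   import markdown
#   safe_html = render_or_escape(
#       "<![", lambda s: markdown.markdown(s, extensions=["extra"])
#   )
```

The fallback output is not equivalent to what a fixed parser would produce; it only keeps a single bad document from taking down the caller.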
Uncaught exceptions are raised, such as:
    3.8

    Testing crash input #1:
    '<!['
    Crash confirmed! Exception:
    expected name token at '<![\n\n'

    Testing crash input #2:
    '<![>og))/uw_ f{tv+pAr$Ss+[6;^{=<:>g2oV|.pdTMu(Q-E#'
    Crash confirmed! Exception:
    expected name token at '<![>og))/uw_ f{t'

    Testing crash input #3:
    '<![ g\'"7z5r7cojSO;2LAo0(1Vv5G>,-P'
    Crash confirmed! Exception:
    expected name token at '<![ g\'"7z5r7cojSO;2L'
expected name token at '<![\n\n'
expected name token at '<![>og))/uw_ f{t'
expected name token at '<![ g\'"7z5r7cojSO;2L'
I performed fuzz testing and isolated these inputs causing crashes. I'm happy to provide more test cases or logs if needed.
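For anyone who wants to search for further inputs of this kind, a simple random fuzz loop might look like the sketch below. This is not the harness that produced the inputs in this report; `find_crashes`, its parameters, and the choice of printable ASCII as the alphabet are all illustrative assumptions. The renderer is passed in as a callable so the loop itself only needs the standard library.

```python
import random


def find_crashes(render, prefix="<![", trials=200, max_len=40, seed=0):
    """Call `render` on random strings starting with `prefix`.

    Returns a list of (input, "ExceptionType: message") pairs for every
    input that made `render` raise. Hypothetical helper for illustration.
    """
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    alphabet = [chr(c) for c in range(32, 127)]  # printable ASCII
    crashes = []
    for _ in range(trials):
        length = rng.randint(0, max_len)
        tail = "".join(rng.choice(alphabet) for _ in range(length))
        candidate = prefix + tail
        try:
            render(candidate)
        except Exception as exc:
            crashes.append((candidate, f"{type(exc).__name__}: {exc}"))
    return crashes


# Possible usage with Python-Markdown (requires the `markdown` package):
#   import markdown
#   found = find_crashes(lambda s: markdown.markdown(s, extensions=["extra"]))
#   for text, error in found:
#       print(repr(text), "->", error)
```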
Thank you for your attention to this issue!