Showing content from https://github.com/Python-Markdown/markdown/issues/1493 below:

Unusual characters in heading ids not well supported · Issue #1493 · Python-Markdown/markdown · GitHub

I noticed that toc encodes characters like * as \x0242\x03, where 42 is the decimal ASCII code of * and \x02 and \x03 are the control characters wrapped around it. This causes a discrepancy between the permalink of a heading and the link in the table of contents.
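To make the placeholder format concrete, here is a minimal stdlib-only sketch. It assumes the encoding is exactly what is described above: the control character \x02, the decimal code point, then \x03. The helpers `encode_placeholder` and `decode_placeholders` are hypothetical names for illustration, not part of Python-Markdown's API.

```python
import re

# Assumption: placeholders look like "\x02<decimal code point>\x03".
STX = "\x02"  # start-of-text control character
ETX = "\x03"  # end-of-text control character

PLACEHOLDER_RE = re.compile(f"{STX}(\\d+){ETX}")


def encode_placeholder(char: str) -> str:
    """Hypothetical helper: wrap a character's code point in STX/ETX."""
    return f"{STX}{ord(char)}{ETX}"


def decode_placeholders(text: str) -> str:
    """Hypothetical helper: turn every placeholder back into its character."""
    return PLACEHOLDER_RE.sub(lambda m: chr(int(m.group(1))), text)


# Build the id as it appears in the table of contents, then decode it.
raw_id = encode_placeholder("*") + "Foo" + encode_placeholder("*")
print(repr(raw_id))
print(decode_placeholders(raw_id))
```

Something like `decode_placeholders` could serve as a workaround on ids read from the table of contents, although whether that is appropriate depends on how toc is expected to behave.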

mkdir /tmp/toc
cd /tmp/toc
python -m venv .venv
. .venv/bin/activate
python -m pip install mkdocs
python -m mkdocs new .

index file:

# Welcome

Demonstrating an issue with HTML ids and `toc`.

## `*Foo*` { id="\*Foo\*" }

- Click on `*Foo*`'s permalink: `#*Foo*` in the URL.
- Click on `*Foo*` in the table of contents: `#%0242%03Foo%0242%03` in the URL.

mkdocs config:

site_name: My Docs

markdown_extensions:
- attr_list
- toc:
    permalink: true

Serve and observe the behavior described in the index page.
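The URL fragment seen when clicking the table of contents entry is consistent with the placeholder string being percent-encoded. The sketch below reproduces it with `urllib.parse.quote`, assuming the browser encodes the control characters the same way; `toc_id` is a hand-written stand-in for the id that toc reports, not a value read from Python-Markdown.

```python
from urllib.parse import quote, unquote

# Assumption: this is the id toc holds for the heading, placeholders included.
toc_id = "\x0242\x03Foo\x0242\x03"

# Percent-encode it as it would appear in a URL fragment.
fragment = quote(toc_id)
print("#" + fragment)

# Decoding the fragment gives back the placeholder string, not "*Foo*".
print(repr(unquote(fragment)))
```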

I'm not saying this is a bug. I'm just curious whether this is expected, and whether there would be a way to improve support for headings with such "unusual" ids. This would help with the work I'm doing on mkdocstrings, where we are trying to expand our language support, and some languages might use uncommon characters in object identifiers. Not only would toc have to work, but also mkdocs-autorefs, which picks up ids from the table of contents when registering URLs and anchors to objects.

I believe HTML5 supports any kind of character in ids. Some of them just cause a bit of pain, like . or #, because they then need to be escaped in CSS selectors.
