Showing content from https://phabricator.wikimedia.org/T37746 below:

⚓ T37746 {{PAGENAME}} must not escape special chars, otherwise it makes {{#ifeq:}} unusable

The safest way to compare page names is to pass them BOTH through {{PAGENAME|pagename}}, or BOTH through {{PAGENAMEE|pagename}}. If you also want to compare their namespaces, pass both page names as the parameter of {{FULLPAGENAME|pagename}}, so that the given page name won't have its namespace parsed and removed.

Note that these functions will also resolve relative paths in subpages and FULLPAGENAME(E) will also resolve the namespace.

So:

{{#ifeq: {{PAGENAME}}|Q & A|true|false}}

will always be false on every page, but the following will work:

{{#ifeq: {{PAGENAME}}|{{PAGENAME|Q & A}}|true|false}}

as it will return "true" on the expected page.
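The difference between the two comparisons can be modelled outside MediaWiki. The following is only a minimal Python sketch, not MediaWiki code: `pagename_output` and `ifeq` are hypothetical stand-ins, and the sketch assumes that {{PAGENAME}} HTML-encodes the ampersand as a numeric entity and that {{#ifeq:}} does a plain string comparison after trimming.

```python
def pagename_output(title: str) -> str:
    """Hypothetical stand-in for {{PAGENAME}}.

    Assumption: only the ampersand escaping is modelled here.
    """
    return title.replace("&", "&#38;")


def ifeq(left: str, right: str, then: str, otherwise: str) -> str:
    """Simplified stand-in for {{#ifeq:}}: trimmed string comparison."""
    return then if left.strip() == right.strip() else otherwise


current_page = "Q & A"

# Comparing the escaped page name with a literal never matches.
literal_result = ifeq(pagename_output(current_page), "Q & A", "true", "false")

# Passing BOTH sides through the same function makes them comparable.
both_result = ifeq(
    pagename_output(current_page), pagename_output("Q & A"), "true", "false"
)

print(literal_result, both_result)
```

In this model the literal comparison fails because one side has been escaped and the other has not, which is why both operands should go through the same function.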

With full page names where you also check the namespace:

{{#ifeq: {{FULLPAGENAME}}|{{FULLPAGENAME|Q & A}}|true|false}}

will also return true, but only in the main namespace. It will be false on a Category page named "Category:Q & A", because the second parameter of "#ifeq" gets the full page name of the page "Q & A" in the main namespace.

In summary:

There's NO function in MediaWiki that returns the raw pagename.

But note:

{{(FULL|BASE|SUB)PAGENAMEE|...}}

is also different from

{{URLENCODE:{{(FULL|BASE|SUB)PAGENAME|...}}}}

Because in the latter case, URLENCODE takes an HTML-encoded name as its parameter, so the result is double-encoded: HTML entities (containing the characters & # ;) and SPACEs are URL-encoded using %nn and +.

In the first case, however, the MediaWiki-specific URL-encoding performed by PAGENAMEE differs from standard URL-encoding (it does not generate "+" for spaces, but generates underscores).

So:

  1. "{{PAGENAMEE|Q & A}}" returns in fact "Q_%26_A"
  2. "{{PAGENAME|Q & A}}" returns in fact "Q &#38; A"
  3. "{{URLENCODE:{{PAGENAME|Q & A}}}}" returns in fact at least this: "Q+%26%2338;+A". I don't know whether URLENCODE also recodes the semicolon; if so, the result will instead be "Q+%26%2338%3B+A". In either case this differs from the result of case 1!
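The three cases can be reproduced approximately with Python's standard library. This is a sketch under stated assumptions, not MediaWiki's implementation: `pagename`, `pagenamee` and `urlencode` are hypothetical helpers, only the ampersand escaping is modelled, and `urllib.parse.quote_plus` is used as a stand-in for standard URL-encoding (it does encode the semicolon, as %3B).

```python
from urllib.parse import quote, quote_plus


def pagename(title: str) -> str:
    # Assumption: {{PAGENAME}} HTML-encodes "&" as a numeric entity.
    return title.replace("&", "&#38;")


def pagenamee(title: str) -> str:
    # Assumption: spaces become underscores, then everything that is not
    # an unreserved character is percent-encoded.
    return quote(title.replace(" ", "_"), safe="")


def urlencode(text: str) -> str:
    # Standard form-style encoding: "+" for spaces, %nn for the rest.
    return quote_plus(text)


case1 = pagenamee("Q & A")
case2 = pagename("Q & A")
case3 = urlencode(pagename("Q & A"))  # double-encoded: entity, then URL

for label, value in (("case 1", case1), ("case 2", case2), ("case 3", case3)):
    print(label, value)
```

The point of the sketch is only that case 3 encodes the characters of the HTML entity itself, so it can never equal case 1.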

This strange behavior means that some characters "permitted" in URLs to MediaWiki sites are transformed in a very strange way, such as:

  1. http://www.mediawiki.org/wiki/Q & A

    not directly a valid URL, but the browser transforms it to URL-encoding of UTF-8 and requests:

    http://www.mediawiki.org/wiki/Q%20&%20A

    the server will accept it and load the page named "Q & A"

  2. http://www.mediawiki.org/wiki/Q+%26%2338%3B+A

    the server parses this URL as containing a URL-encoded page name, so it first URL-decodes it as:

    Q &#38; A

    the server will then parse the result and think it contains an anchor: it will attempt to load a page named only "Q &", with the anchor "38; A" dropped!

  3. Page names may contain isolated ampersands as valid characters (internally they are HTML-encoded if you query their {{PAGENAME}}), but some sequences will generate errors, such as "&amp;", whereas "a amp;" will be accepted...
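The decoding steps described for the double-encoded URL can be traced with a short Python sketch. It is only an illustration of the behavior reported above, not MediaWiki's actual request handling: `resolve` is a hypothetical helper that URL-decodes the path segment and then treats the first "#" as the start of an anchor.

```python
from urllib.parse import unquote_plus


def resolve(path_segment: str) -> tuple[str, str, str]:
    """Return (decoded text, page name, dropped anchor).

    Assumption: the server URL-decodes first and only then looks
    for an anchor separator in the decoded text.
    """
    decoded = unquote_plus(path_segment)
    page, _, anchor = decoded.partition("#")
    return decoded, page, anchor


decoded, page, anchor = resolve("Q+%26%2338%3B+A")
print(repr(decoded), repr(page), repr(anchor))
```

Under these assumptions the "#" of the HTML entity is mistaken for an anchor separator, so the requested page name is cut off after the ampersand.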

All this is completely inconsistent. This time, however, it does not occur in parser functions but at the server API level, when handling incoming HTTP(S) requests that may or may not be HTML-encoded, even though the HTTP standard says that URLs should be ONLY URL-encoded! The server also performs such double-decoding when resolving requests.

