Showing content from https://phabricator.wikimedia.org/T16779 below:

⚓ T16779 {{#ifexist}} does not recognise URL encoded titles

Comment Actions

rene.kijewski wrote:

Maybe the underlying Sanitizer::decodeCharReferences could be changed?
{{titleparts}} is affected even worse by the bug: it only returns its input.
I tried to provide a patch, but I got confused by the mathematics of converting URL encoding to UTF-8. ;-)

I don't know if someone might need it, but this is the regex for properly URL-encoded characters:
/(%[0-7][0-9A-Za-z])
|(%[CDcd][0-9A-Za-z]%[89ABab][0-9A-Za-z])
|(%[Ee][0-9A-Za-z](?:%[89ABab][0-9A-Za-z]){2})
|(%[Ff][0-7](?:%[89ABab][0-9A-Za-z]){3})/x
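
As a rough sketch of how such a pattern could be applied, the following Python snippet joins the four alternatives with `|` and checks whether a string consists only of structurally well-formed percent-encoded UTF-8 sequences. It is an illustration, not MediaWiki code: the names `PCT_UTF8` and `is_pct_utf8` are hypothetical, and the hex classes are tightened to `[0-9A-Fa-f]`, which is an assumption about the intent of the regex above. Like the regex above, it checks only the byte layout and does not reject overlong encodings.

```python
import re

# Hypothetical building blocks; hex digits are restricted to [0-9A-Fa-f]
# (an assumption; the regex in the comment uses the looser [0-9A-Za-z]).
HEX = r"[0-9A-Fa-f]"
CONT = rf"%[89ABab]{HEX}"  # one continuation byte, 0x80-0xBF

# One percent-encoded UTF-8 character, written in verbose mode.
_ONE_CHAR = rf"""
      %[0-7]{HEX}                  # 1 byte:  0x00-0x7F
    | %[CDcd]{HEX}{CONT}           # 2 bytes: lead byte 0xC0-0xDF
    | %[Ee]{HEX}(?:{CONT}){{2}}    # 3 bytes: lead byte 0xE0-0xEF
    | %[Ff][0-7](?:{CONT}){{3}}    # 4 bytes: lead byte 0xF0-0xF7
"""

PCT_UTF8 = re.compile(_ONE_CHAR, re.VERBOSE)
_ONLY_PCT_UTF8 = re.compile(rf"(?:{_ONE_CHAR})+", re.VERBOSE)


def is_pct_utf8(text: str) -> bool:
    """Return True if text is a non-empty run of well-formed sequences."""
    return _ONLY_PCT_UTF8.fullmatch(text) is not None


if __name__ == "__main__":
    print(is_pct_utf8("%C3%A4"))           # two-byte sequence
    print(is_pct_utf8("%C3"))              # lead byte without continuation
    print(PCT_UTF8.findall("Caf%C3%A9"))   # encoded characters inside text
```

A truncated sequence such as `%C3` on its own is rejected because the two-byte alternative requires a continuation byte after the lead byte.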
Comment Actions

Sanitizer::decodeCharReferences *must not* attempt to deal with URL percent-encoding, as that would cause corruption of totally unrelated HTML output.
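
The corruption risk can be illustrated with a small Python sketch. It uses `html.unescape` as a stand-in for character-reference decoding and `urllib.parse.unquote` as a stand-in for percent-decoding; both substitutions are assumptions made for illustration and do not reproduce the exact behavior of `Sanitizer::decodeCharReferences` or PHP's `urldecode()`. The function names are hypothetical.

```python
import html
from urllib.parse import unquote


def decode_refs_only(text: str) -> str:
    """Decode HTML character references and nothing else."""
    return html.unescape(text)


def decode_refs_and_percent(text: str) -> str:
    """Hypothetical variant that also percent-decodes (the risky change)."""
    return unquote(html.unescape(text))


if __name__ == "__main__":
    # Ordinary prose that happens to mention percent sequences literally.
    sample = "Use %20 to encode a space &amp; %26 for an ampersand"
    print(decode_refs_only(sample))
    print(decode_refs_and_percent(sample))
```

With the second function, the literal `%20` and `%26` in the prose are silently replaced by a space and an ampersand, even though the text has nothing to do with page titles.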

Probably the Sanitizer::decodeCharReferences() call and the %-check & urldecode() both belong in either Title::newFromText() or directly in Title::secureAndSplit(), to ensure that titles are handled consistently at the low level; this means the various checks at higher levels should be reviewed and mostly pulled out.

There are probably a number of related bugs still open on this issue; be good to make sure they're all tied together.

Comment Actions

fullurl has no problem with URL-encoded page names:

{{fullurl:{{FULLPAGENAME}}}} and {{fullurl:{{FULLPAGENAMEE}}}} give the same result.

CoreParserFunctions.php has the explanation (since r15276):
$title = Title::newFromText( $s );
# Due to order of execution of a lot of bits, the values might be encoded
# before arriving here; if that's true, then the title can't be created
# and the variable will fail. If we can't get a decent title from the first
# attempt, url-decode and try for a second.
if( is_null( $title ) )
	$title = Title::newFromUrl( urldecode( $s ) );

#ifexist lacks this fallback: it only tries to get the title from the text and makes no second attempt from the URL-decoded form.

I do not know whether this is a possible solution. I am not familiar enough with MediaWiki or PHP to say whether it is good or not. I can only point you to a place where this works. Maybe you can adapt it from there.
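
The difference between the two code paths can be modeled with a minimal Python sketch. Everything in it is a simplified assumption: `title_from_text` is a toy stand-in for `Title::newFromText` that only rejects input containing percent-encoded sequences, the second attempt reuses that same function on the decoded text instead of calling `Title::newFromUrl`, and `if_exist` with its `existing` set is a hypothetical substitute for the real page-existence check.

```python
import re
from typing import Optional, Set
from urllib.parse import unquote

# Percent-encoded byte, e.g. "%C3"; the toy parser treats it as invalid.
_PCT = re.compile(r"%[0-9A-Fa-f]{2}")


def title_from_text(text: str) -> Optional[str]:
    """Toy title parser: normalize, reject percent-encoded input."""
    text = text.strip().replace("_", " ")
    if not text or _PCT.search(text):
        return None
    return text


def title_with_fallback(text: str) -> Optional[str]:
    """First try the raw text; if that fails, URL-decode and try again."""
    title = title_from_text(text)
    if title is None:
        title = title_from_text(unquote(text))
    return title


def if_exist(text: str, existing: Set[str], fallback: bool) -> bool:
    """Hypothetical existence check, with or without the decode fallback."""
    parse = title_with_fallback if fallback else title_from_text
    title = parse(text)
    return title is not None and title in existing


if __name__ == "__main__":
    pages = {"Käse", "Main Page"}
    print(if_exist("K%C3%A4se", pages, fallback=False))
    print(if_exist("K%C3%A4se", pages, fallback=True))
```

In this model only the variant with the fallback finds the page when it is given the URL-encoded name, which mirrors the behavior reported for #ifexist compared with the parser functions that retry with a decoded title.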

Comment Actions

I don't see why this is a bug. It doesn't recognise URL-encoded file names because the parameter is the file name, not the URL. I don't see the need to take MediaWiki apart and put it back together again when you could just fix your broken #ifexist calls.

Comment Actions

This is not only a problem with file names. I have added an example (see URL).

Comment Actions

Rereading this, I have to concur with comment 6. This must have blocked some use case of mine in the past so I CC'd but I don't remember it and neither do the other comments explain why it is a bug from their points of view. I propose closing as INVALID.

