Showing content from https://phabricator.wikimedia.org/T223772 below:

⚓ T223772 Extend #time parser function to display time in format specific to each language

Parser function #time is able to show dates in different languages by translating month names, etc. However, each language prefers dates to be displayed in a different format or order, and that part is not currently handled by the function. That niche is filled on Commons by Module:DateI18n, which is used on 50M pages and was transplanted to many other projects (see Q56528363). The capabilities of that module should be moved to the #time parser function (or to some new parser function, if more convenient), since this is basic functionality that should be handled by MediaWiki software uniformly on all wikis instead of by local clones of a module.

Event Timeline Comment Actions

Added I18n since this is an issue when localizing global messages (central notices, tech news, etc.), but I don’t think it is a real code internationalization issue since there is Language::userTimeAndDate() which solves it in code.

Comment Actions

Goals:

For performance, it's not ideal to give access to all languages, although I suppose it could be done if there were a use case for it.

Options:

  1. A family of parser functions supply formats
    • {{#time:{{#dateformat:}}|now}}, {{#time:{{#timeformat:}}|now}}, {{#time:{{#timeanddateformat:}}|now}}, {{#time:{{#timeanddateformat:user}}|now}}
  2. A single parser function to supply formats
    • {{#time:{{#dateformat:time}}|now}}, {{#time:{{#dateformat:date}}|now}}, {{#time:{{#dateformat:both|user}}|now}}
    • Need localisation for special parameter values
  3. A family of magic words supply formats
    • {{#time:{{DATEFORMAT}}|now}}, {{#time:{{USERDATEFORMAT}}|now}}, {{#time:{{USERTIMEANDDATEFORMAT}}|now}}
  4. Extend date formats to provide a symbolic replacement syntax
    • {{#time:%date%|now}}, {{#time:%both%|now}}, {{#time:%user-both%|now}}
    • Need localisation for the symbolic strings?
  5. A family of parser functions which format dates without a format parameter
    • {{#date:now}}, {{#timeanddate:now}}, {{#timeonly:now}}
  6. A single parser function to format dates with an optional special parameter, like option 2 but composed
    • {{#timef:now}} {{#timef:now|both}}, {{#timef:now|time|user}}

Let's explore option 6 in code.

Comment Actions

#time allows the user to specify an arbitrary language code, so I may as well extend that feature to the new function. The target keyword is not really necessary, since omitting the parameter would have the same effect. I can split the user keyword out to a separate commit.

T85581 was filed before the port to PHP and doesn't really reflect the current state of the codebase. Parsoid is given a ParserOptions which includes user language, and that same ParserOptions will be given to the preprocessor. We have ParsoidOutputAccess::getCachedParserOutput() which splits the cache for Parsoid using the same code that we use for the old parser. As always, there are aspirations to reduce the number of options that split the cache, but that's not linked to the migration to Parsoid anymore.

I'll let the Content Transform team comment on whether they'd rather have users doing {{#timef:now||{{int:lang}}}} or {{#timef:now||user}}.

Comment Actions

Note: any magic words that behave differently based on user language or interface languages need to check whether they have parser cache pollution issues. A simple check: view and purge a page using English, then view the page using (uselang) a different language. Although parser cache will be split if some function is called.

It's possible to split the parser cache by user language. All you have to do is call getUserLangObj() on the ParserOptions object. -- daniel
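To illustrate the idea of splitting the cache only when the user language is actually read, here is a toy model in Python. It is not MediaWiki code: the class ToyParserOptions, its methods and the key layout are all hypothetical, and the real ParserOptions behaviour may differ in detail.

```python
class ToyParserOptions:
    """Hypothetical stand-in for ParserOptions; not the MediaWiki API."""

    def __init__(self, user_lang):
        self._user_lang = user_lang
        self.used_options = set()

    def get_user_lang(self):
        # Reading the option records that the output depends on it.
        self.used_options.add("userlang")
        return self._user_lang

    def cache_key(self, page):
        # The key only varies by user language if the option was read.
        parts = [page]
        if "userlang" in self.used_options:
            parts.append("userlang=" + self._user_lang)
        return "!".join(parts)
```

In this model, a page that never asks for the user language shares one cache entry across all users, while a page that does ask gets one entry per language.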

Comment Actions

On Commons, the main use-case is that you are provided with a date in YYYY-MM-DD, YYYY-MM, YYYY and few other formats and a language code and need to display that date in that language. That is how it is used by c:Template:Information and other infoboxes. For last 11 years, c:Module:DateI18n (which is a rewrite of even older commons template) does that with preferred formats for each language stored at c:Data:DateI18n.tab. The code has to handle cases where format changes depending on a day (different format for 1st of each month, or sometimes 1st, 11th, 21st and 31st of each month), and some languages adding extra letters and punctuations to the date. The code uses English formatting as defaults (like YYYY for the year) and has to only store formats for languages that deviate from it.

The second use case, needed by Module:Complex_date, is to put the date in a different grammatical case for some languages, for example locative or instrumental case for Slavic languages or partitive case for Finnish. Translations needed by Complex_date are stored in MonthCases.tab. The #time parser function can handle the genitive case, but other cases are also needed.

On Commons, there is little need for "now" date or figuring out user's language as those are inputs.

Comment Actions

I learnt that the update is productive now, but user language did not work on BETA ever.

From the code I understood that a magic word is required, but even simulating MediaWiki:pfunc-user did not help.

Comment Actions

I learnt that the update is productive now,

If you mean, the base functionality is now live in Wikimedia production, that is correct (as of this week).

but user language did not work on BETA ever.

Yes, that part of the new functionality is not yet merged, as listed on this task.

From the code I understood that a magic word is required, but even simulating MediaWiki:pfunc-user did not help.

It's not yet announced in Tech/News, and the documentation will be added first before the announcement.

Comment Actions

The user magic might be just taken as a reminder.

Okay, then another issue:

Comment Actions

This sort of thing is out of scope. You can file a bug against core if you want changes to the formats themselves.

Comment Actions

It's not yet announced in Tech/News, and the documentation will be added first before the announcement.

From the task resolution, I guess this is ready to be announced now/soon (when?).
Please could someone suggest some wording for the Tech News entry, and specify which documentation it should link to? (And create that documentation if it's still to-be-done.)

I imagine it might be similar to the "{{#dir}} and {{#bcp47}}" entry that we had in (top-entry) https://meta.wikimedia.org/wiki/Tech/News/2024/32 - but I'm uncertain about how to tweak that for accuracy. Thanks.

Comment Actions

I am sorry if it was already explained above, but I am trying to understand what the new function does. Is there documentation for it somewhere?

The description of the request was to move work done by Module:DateI18n lua code to Mediawiki software. Module:DateI18n is ported to 97 different projects, so ideally new interface would be compatible with the existing one. Module:DateI18n main interface is "{{#invoke:DateI18n|Date|year=...|month=...|day=...|hour=...|minute=...|second=...|tzhour=...|tzmin=...|case=...|lang=...}}" function where most of the parameters are optional. On Commons (where it was developed) the frontend is the Template:Date template where there is more documentation. It would be nice to test it with the similar testcases as used in Module:DateI18n/testcases. So how do we interact with the new function?

Comment Actions

I began testing and by trial and error figured out some rules:

  1. {{#invoke:DateI18n|Date|year=2009|month=12|day=09|hour=13|minute=20|second=17|lang=en}} is equivalent to {{#timef:2009-12-09T13:20:17|both|en}} - we call it "YMDhms" format
  2. {{#invoke:DateI18n|Date|year=2024|month=09|day=01|lang=en}} is equivalent to {{#timef:2024-09-01|date|en}} - "YMD" format
  3. {{#invoke:DateI18n|Date|month=09|day=01|lang=en}} is equivalent to {{#timef:0000-09-01|pretty|en}} - "MD" format
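The three rules above can be sketched as a small Python helper that builds the matching #timef call from DateI18n-style parameters. The function name timef_equivalent is hypothetical, and the mapping only reflects the equivalences reported in this comment; any other combination returns None because no equivalent was found.

```python
def timef_equivalent(lang, year=None, month=None, day=None,
                     hour=None, minute=None, second=None):
    """Return the #timef wikitext for a DateI18n call, or None if unmapped."""
    if month is None or day is None:
        return None  # "Y" and "YM" have no reported equivalent
    has_time = None not in (hour, minute, second)
    no_time = hour is None and minute is None and second is None
    month_day = "%02d-%02d" % (month, day)
    if year is not None and has_time:
        stamp = "%04d-%sT%02d:%02d:%02d" % (year, month_day, hour, minute, second)
        fmt = "both"      # rule 1
    elif year is not None and no_time:
        stamp = "%04d-%s" % (year, month_day)
        fmt = "date"      # rule 2
    elif year is None and no_time:
        stamp = "0000-" + month_day
        fmt = "pretty"    # rule 3, with a placeholder year
    else:
        return None
    return "{{#timef:" + stamp + "|" + fmt + "|" + lang + "}}"
```

For example, timef_equivalent("en", 2024, 9, 1) gives {{#timef:2024-09-01|date|en}}.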

I did not see any way to ask for localized date in the following formats which are supported by DateI18n:

I wrote some unit testing comparing 3 available formats at c:Module:DateI18n/timef_test and got 42 out of 139 tests correct (30%). The formats we use on Commons are being tweaked, adjusted and argued over since 2009 when Template:Date was introduced with the purpose of localizing dates displayed by c:Template:Information template. They are all vetted by the native speakers and were quite stable last decade or so. Any chance of synching MediaWiki version with Commons and adding those 4 other formats, before we roll it out?

Comment Actions
  1. {{#invoke:DateI18n|Date|year=2009|month=12|day=09|hour=13|minute=20|second=17|lang=en}} is equivalent to {{#timef:2009-12-09T13:20:17|both|en}} - we call it "YMDhms" format

Actually, both is equivalent to the YMDhm format, i.e. without seconds, in almost all languages – the only exception I could find is Finnish, where users can choose a date format with seconds as well (but the default format is still one with a minute-level precision, and the parser function always uses the default format).

Comment Actions
  1. {{#invoke:DateI18n|Date|year=2009|month=12|day=09|hour=13|minute=20|second=17|lang=en}} is equivalent to {{#timef:2009-12-09T13:20:17|both|en}} - we call it "YMDhms" format

Actually, both is equivalent to the YMDhm format, i.e. without seconds, in almost all languages – the only exception I could find is Finnish, where users can choose a date format with seconds as well (but the default format is still one with a minute-level precision, and the parser function always uses the default format).

You are right, I corrected my unit testing page. So I guess we are missing support for "Y", "YM", "YMDhms" and "MDhms" date formats.

I forgot to mention that DateI18n also supports different grammatical cases for a handful of languages (as described in the Template:Date documentation). Without the "case" parameter, DateI18n returns the date in whichever case the given language dictates; with the "case" parameter, it returns the date in the specified case. Most of the time it is nominative or genitive case, but for some Slavic languages and for Finnish other cases are allowed. That functionality would also be needed if we are going to replace DateI18n.

Comment Actions

I just saw it in the Tech News. Thank you. Could you please write a manual? Because I've tried

{{#timef:2000-09-01|F|en}}

and get an error message. Actually, 49 of 53 standard tests failed, and the remaining four showed the wrong date.
UPD: I've tried to understand it by myself, and this is what I get:

  1. The first parameter is a regular datetime parameter from #time and #timel. The default value is "now"
  2. The second parameter is "both", "date", "time" or "pretty". The default value is "both".
  3. The third parameter is a language code or "user". The default value is "user".
  4. For the local time #timefl should be used instead.

I do not know if this is right, and if there is something else. Also, pretty does not work in a lot of languages.
UPD: And also, there is a total directionality problem. Until it is fixed, every usage of #timef should be wrapped by

<span dir="{{#dir:<third parameter>}}">...</span>

Until the bug is fixed, I've created this wrapper template.
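As a rough illustration of the wrapping described above, this Python sketch assembles the wikitext for a #timef call inside a span whose dir attribute comes from {{#dir}}. The helper name wrap_timef is hypothetical; it only builds the string and does not reproduce the actual wrapper template.

```python
def wrap_timef(date="now", fmt="both", lang="en"):
    """Build wikitext that wraps a #timef call in a direction-aware span."""
    call = "{{#timef:" + date + "|" + fmt + "|" + lang + "}}"
    # {{#dir:<lang>}} is evaluated by the wiki, not by this script.
    return '<span dir="{{#dir:' + lang + '}}">' + call + "</span>"
```

Calling wrap_timef("now", "date", "he") yields a span whose direction follows the Hebrew language code rather than the surrounding page.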

Comment Actions

I just saw it in the Tech News. Thank you. Could you please write a manual?

Done.

  1. The third parameter is a language code or "user". The default value is "user".

The "user" keyword was not implemented. Instead we added {{USERLANGUAGE}} (T4085). The default value is the page language, not the user language.
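Putting the guessed rules and this correction together, the argument handling might look roughly like the Python sketch below. It is an assumption-laden model, not the extension's code: the function name resolve_timef_args is hypothetical, the list of accepted formats is the one guessed in this thread, and treating an empty parameter as "use the default" is inferred from the {{#timef:now||...}} examples.

```python
VALID_FORMATS = ("both", "date", "time", "pretty")  # as guessed in this thread

def resolve_timef_args(args, page_lang):
    """Fill in defaults for (date, format, language) from positional args."""
    def pick(index, default):
        # A missing or empty parameter falls back to the default.
        return args[index] if len(args) > index and args[index] else default

    date = pick(0, "now")
    fmt = pick(1, "both")
    lang = pick(2, page_lang)  # page language, not user language
    if fmt not in VALID_FORMATS:
        raise ValueError("unsupported format keyword: " + fmt)
    return date, fmt, lang
```

Under this model, a #time-style format string such as "F" is rejected, which would be consistent with the error reported for {{#timef:2000-09-01|F|en}}.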

Comment Actions

I just saw it in the Tech News. Thank you. Could you please write a manual?

Done.

  1. The third parameter is a language code or "user". The default value is "user".

The "user" keyword was not implemented. Instead we added {{USERLANGUAGE}} (T4085). The default value is the page language, not the user language.

Great. Thanks. What about "pretty"? There are still enough languages that show not "pretty" and not "date", but just weird irrelevant text.
And what about the directionality?

Comment Actions

What about "pretty"? There are still enough languages that show not "pretty" and not "date", but just weird irrelevant text.

I think it’s because those languages have no pretty localization, so they fall back to the US English order (but with translated month – and year and day, if applicable – names), although mentioning a few specific examples could help better understanding the situation. (Of course, if it’s really the missing localization, the solution is adding it. Maybe a new task could be created in which people speaking different languages could collect the correct formats – I’d be happy to fix them as a developer, but I have no idea what the correct format is in Hebrew, Chinese or Czech.)

And what about the directionality?

I’d assume most of the time this parser function will be used in contexts that have the given language already set (a table that has table-level lang and dir attributes, running text in that language etc.), so the extra markup would just clutter the HTML code without any benefit. Where do you expect to use a free-standing date, without, for example, a label that tells what on that date happens?

Comment Actions

What about "pretty"? There are still enough languages that show not "pretty" and not "date", but just weird irrelevant text.

I think it’s because those languages have no pretty localization, so they fall back to the US English order (but with translated month – and year and day, if applicable – names), although mentioning a few specific examples could help better understanding the situation. (Of course, if it’s really the missing localization, the solution is adding it. Maybe a new task could be created in which people speaking different languages could collect the correct formats – I’d be happy to fix them as a developer, but I have no idea what the correct format is in Hebrew, Chinese or Czech.)

Yes, the English format is exactly the problem. A lot of languages have different formats, and the English one makes no sense. Is there a way to use the "date" format of a language as a fallback if it has no "pretty"? I could even suggest removing the year, but I don't, because it can be dangerous: one would need to locate and remove extra spaces, commas, semicolons and so on. In any case, creating such a subtask and inviting people who know different languages via Tech News to update the formats could be a good thing to do.

And what about the directionality?

I’d assume most of the time this parser function will be used in contexts that have the given language already set (a table that has table-level lang and dir attributes, running text in that language etc.), so the extra markup would just clutter the HTML code without any benefit. Where do you expect to use a free-standing date, without, for example, a label that tells what on that date happens?

I see. For example, if this date is everything that exists in some table cell.

Comment Actions

And what about the directionality?

I’d assume most of the time this parser function will be used in contexts that have the given language already set (a table that has table-level lang and dir attributes, running text in that language etc.), so the extra markup would just clutter the HTML code without any benefit. Where do you expect to use a free-standing date, without, for example, a label that tells what on that date happens?

I see. For example, if this date is everything that exists in some table cell.

If there is a table cell that requires a single date in foreign representation,

  1. the directionality inside a table cell does not matter,
  2. those who put some content into a table cell do know that the output is supposed to use a different scripting than the general page context, and it might be wrapped by templates or cell attributes.
Comment Actions

I can understand your 2., but I don't agree with 1., because wrong directionality shows something like "2024 in October, year 14".

Comment Actions

If the entire content of a table cell is that parser function, the UBA does not influence anything.

Comment Actions

I think we're talking about different things. Here you are, on an RTL page:

Comment Actions

I see. Digits before and after, UBA is confused. Perhaps improved one day; an entire block of three groups only (digits letters digits) must not adjust order. There are older and newer versions of UBA, and they might depend on browsers.

I guess it would be sufficient in such cases to add the dir to the table cell:

before ||dir="ltr"| {{#timef:now|date|en}} || next

This is much shorter than introducing a <span> element.

Remark: UBA shall modify if there are LettersLTR Digits LettersRTL Digits LettersLTR or the other way around LettersLTR Digits LettersLTR Digits LettersLTR as in your example. Then it is ambiguous to which fragment the digits belong and shall be rendered before or after. However, if there is no change of directionality since all letters in block are in the same direction no modification by UBA shall happen.
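To see why a digits-letters-digits date is sensitive to the surrounding direction, it can help to look at the bidirectional classes the UBA works with. The Python sketch below uses the standard unicodedata module to group a string into runs of the same class; the helper name bidi_runs is hypothetical, and the sketch only inspects classes, it does not run the reordering algorithm itself.

```python
import unicodedata
from itertools import groupby

def bidi_runs(text):
    """Group text into (bidi class, substring) runs, in logical order."""
    return [
        (bidi_class, "".join(chars))
        for bidi_class, chars in groupby(text, key=unicodedata.bidirectional)
    ]
```

For "14 October 2024" the digit groups come out as class EN (European number, a weak type) and the month name as L (strong left-to-right), so the placement of the two numbers depends on the base direction of the enclosing element.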

Comment Actions

I guess it would be sufficient in such cases to add the dir to the table cell:

before ||dir="ltr"| {{#timef:now|date|en}} || next

This is much shorter than introducing a <span> element.

Sure, I've started with it and changed to make it more visual to you.

Comment Actions

I see. But how about checking whether there is such a field in the corresponding PHP file?

Comment Actions

I'm arriving from this conversation, where pretty is discussed.

The documentation says:

Not all languages support this; if it is not supported, the "date" format is used.

My questions are:

Comment Actions

if it is not supported, the "date" format is used.

This is wrong. Many languages that have no "pretty" defined use "j F", which makes no sense.

Comment Actions

if it is not supported, the "date" format is used.

This is wrong. Many languages that have no "pretty" defined use "j F", which makes no sense.

Do you mean that date is not used when pretty is not defined? Or something else?

Comment Actions

Do you mean that date is not used when pretty is not defined? Or something else?

Exactly.

Is there a way to take instead for any language a "date" format as a fallback if there is no "pretty"?

I fear the only way to do so is copying and pasting the format in all languages. The format comes from rMW languages/messages/MessagesEn.php:157-189 (at 73bb50edb4ee), and the comment there says all subclasses, i.e. languages, automatically inherit it.
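The inheritance problem described here can be modelled with two small lookup functions in Python. Everything in this sketch is illustrative: the dictionaries, the function names and the sample format strings are assumptions, and the real lookup in MediaWiki's language classes is more involved.

```python
# Illustrative English defaults; "j F" is the pretty format mentioned above.
ENGLISH_FORMATS = {"date": "j F Y", "pretty": "j F"}

def resolve_inherited(lang_formats, key):
    """Current behaviour as described: missing keys inherit from English."""
    return lang_formats.get(key, ENGLISH_FORMATS[key])

def resolve_with_date_fallback(lang_formats, key):
    """Requested behaviour: a missing "pretty" falls back to the own "date"."""
    if key in lang_formats:
        return lang_formats[key]
    if key == "pretty" and "date" in lang_formats:
        return lang_formats["date"]
    return ENGLISH_FORMATS[key]
```

With a hypothetical language that defines only {"date": "Y. F j."}, the first function returns the English "j F" for "pretty", while the second returns the language's own "Y. F j.". Because the English value is always inherited, the "not supported" branch is never reached without an explicit check like the one in the second function.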

Comment Actions

Can we have a different name for this nondescript pretty?
If that's day+month call it daymonth or dm.
If people want month+year, introduce monthyear or my.
With pretty, you'll have one thing in a template for one language, and another thing in the template for another language (on Commons, for example), depending on what people decide is pretty in their language.

Would words other than date/time/both/pretty work? MessagesPl.php defines a monthonly option, which is month+year.

Also, since it's so hard (time consuming) to get anything changed in MessagesXX.php, I suggest you ask our most active communities to propose what these should be, and have someone upload and +2-approve the patches on gerrit.

