From Meta, a Wikimedia project coordination wiki
The language of a Wikimedia wiki can be found in the lang="..." and xml:lang="..." attributes of the <html> element of each page (or of other elements for specific subcontent in multilingual pages); these attributes are also used for styling with CSS language selectors. These language codes should generally be canonical language tags as defined by BCP 47.
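As a rough illustration of reading that attribute, the following sketch uses Python's standard html.parser module to pull lang and xml:lang from the first <html> element. The class name, the function name, and the sample markup are hypothetical and serve only as an example; they are not taken from any Wikimedia code.

```python
from html.parser import HTMLParser


class HtmlLangParser(HTMLParser):
    """Collects the lang and xml:lang attributes of the first <html> element."""

    def __init__(self):
        super().__init__()
        self.lang = None
        self.xml_lang = None
        self._seen_html = False

    def handle_starttag(self, tag, attrs):
        # Only the first <html> start tag is inspected.
        if tag == "html" and not self._seen_html:
            self._seen_html = True
            attributes = dict(attrs)
            self.lang = attributes.get("lang")
            self.xml_lang = attributes.get("xml:lang")


def page_language(html_text):
    """Return the page language tag, preferring lang over xml:lang."""
    parser = HtmlLangParser()
    parser.feed(html_text)
    return parser.lang or parser.xml_lang


# Hypothetical page markup, for illustration only.
sample = '<html lang="gsw" xml:lang="gsw" dir="ltr"><head></head><body></body></html>'
print(page_language(sample))
```

The function returns None when the markup has no <html> element or when neither attribute is present.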
In most cases, the subdomain names that we use for projects correspond to language codes, but there are some remaining exceptions. This usually occurs for historical reasons, where a valid ISO 639 code (or a registered and non-deprecated BCP 47 variant code) was not yet available when the project was created, but also because some former ISO 639 codes were deprecated or removed as they encompassed a group of languages that are now considered distinct.
Deprecated or removed ISO 639 codes are still considered valid in BCP 47 (where existing codes are not removed), most often as possible fallbacks for missing translations or to allow upward compatibility, even if they are no longer recommended for modern use and newly created content. Using these codes can create unsolvable disputes in Wikimedia unless they are distinguished with distinct translations using newer codes. In some cases, early distinctions in ISO 639 have also been removed: because they were introduced artificially for a limited time (sometimes for non-neutral political reasons) but were not well supported by users; because they unnecessarily complicated the task of translators; because they too frequently required the use of language fallbacks or automatic transliterators (when a reliable standard and orthographic conventions were adopted by most users of different script variants); or because the development of education led to better mutual understanding and acceptance of multiple variants in vernacular use.
Miscellaneous:
als
Local name: Alemannisch
Tracked in
Phabricator:
Alemannic has ISO 639-3 code gsw. ISO 639-3 code als is assigned to Tosk Albanian instead.
bat-smg
Local name: žemaitėška
Tracked in
Phabricator:
Samogitian has the ISO 639 code sgs.
cbk-zam
Local name: Chavacano de Zamboanga
Tracked in
Phabricator:
Chavacano de Zamboanga has no ISO 639 code as an individual language. ISO 639-3 code cbk is assigned to Chavacano, a superset of Chavacano de Zamboanga.
eml
Local name: emiliàn e rumagnòl
Tracked in
Phabricator:
ISO 639-3 code eml for Emilian-Romagnol is now retired and split into egl (Emilian) and rgn (Romagnol).
fiu-vro
Local name: võro
Tracked in
Phabricator:
Võro has ISO 639-3 code vro.
iu
Local name: ᐃᓄᒃᑎᑐᑦ / inuktitut
iu/iku is not a single language, but a macrolanguage comprising ike and ikt. MediaWiki agrees (see Phabricator), but it falls back to ike, which it calls ike-cans; adds ike-latn; and has no ikt support. CLDR considers Cans an aspirational script.
ksh
Local name: Ripoarisch
ksh is assigned to Kölsch, a subset of Ripuarian.
map-bms
Local name: Basa Banyumasan
jv/jav is assigned to Javanese, a superset of Banyumasan.
nds-nl
Local name: Nedersaksies
nds.
nrm
Local name: Nouormand
Tracked in
Phabricator:
Norman has no ISO 639 code as an individual language (however, two dialects of Norman, Guernésiais and Jèrriais, share the ISO 639-3 code nrf). ISO 639-3 code nrm is assigned to the Narom language instead. ISO 639-3 lumps Norman with French, as with most varieties of northern France.
roa-rup
Local name: armãneashti
Tracked in
Phabricator:
Aromanian has ISO 639-3 code rup.
roa-tara
Local name: tarandíne
sh
Local name: srpskohrvatski / српскохрватски
sh was originally the ISO 639-1 code for Serbo-Croatian, but it was deprecated in 2000. However, it remains a valid BCP 47 language tag. There is also the ISO 639-3 code hbs for Serbo-Croatian.
simple
Local name: Simple English
Tracked in
Phabricator:
Simple English has no ISO 639 code but has a registered IETF variant subtag, simple. However, although simple is valid as a standard subtag in BCP 47, it is registered only as a generic variant subtag that applies to various base languages, as in en-simple or fr-simple (using the now-standard variant subtag is preferable to using multiple subtags including an unregistered private extension, like "en-x-simple"). As a plain tag, "simple" means nothing in BCP 47 or ISO 639 (as it is not a language on its own).
Note that under ISO 639 rules, Simple English is a variant, dialect, or special orthography of English (so it can be registered as a variant subtag of English, like "formal" or "informal" used in German or Dutch), defined as a subset for some limited usage. The IANA database for IETF's BCP 47 already indicates this, and BCP 47-aware applications should have no problem identifying the language as being part of normal English, as long as it is properly tagged as "en-simple" and not just "simple".
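The difference between "en-simple", "en-x-simple", and a bare "simple" can be sketched with a few lines of Python. This is a deliberately naive illustration, not a full BCP 47 parser: it only splits on hyphens, performs no validation against the IANA registry, and the function names are hypothetical.

```python
def split_tag(tag):
    """Split a language tag into lowercase subtags (no validation)."""
    return tag.lower().split("-")


def primary_language(tag):
    """Return the first subtag, which BCP 47 treats as the primary language."""
    return split_tag(tag)[0]


def has_simple_variant(tag):
    """True if 'simple' follows a primary language subtag as an ordinary subtag."""
    for subtag in split_tag(tag)[1:]:
        if subtag == "x":
            # Everything after the "x" singleton is private use, not a variant.
            return False
        if subtag == "simple":
            return True
    return False


for tag in ("en-simple", "fr-simple", "en-x-simple", "simple"):
    print(tag, primary_language(tag), has_simple_variant(tag))
```

With a bare "simple" the first subtag is "simple" itself, which is not a language code, so an application following this logic has no base language to fall back to.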
zh-classical
Local name: 文言
Tracked in
Phabricator:
Classical Chinese has ISO 639-3 code lzh.
zh-min-nan
Local name: 閩南語 / Bân-lâm-gú
Tracked in
Phabricator:
Min Nan has ISO 639-3 code nan.
zh-yue
Local name: 粵語
Tracked in
Phabricator:
Cantonese has ISO 639-3 code yue.
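The one-to-one cases in the list above can be collected into a small lookup table. The sketch below is only an illustration built from the entries on this page; it is not MediaWiki's own mapping, and the names SUBDOMAIN_TO_ISO and iso_code_for_subdomain are hypothetical. Codes without a single replacement on this page (for example eml, which was split, or cbk-zam, which has no individual code) are deliberately left out.

```python
# Subdomain code -> ISO 639-3 code, copied from the entries listed on this page.
# Only entries with a single stated replacement are included.
SUBDOMAIN_TO_ISO = {
    "als": "gsw",           # Alemannic
    "bat-smg": "sgs",       # Samogitian
    "fiu-vro": "vro",       # Võro
    "roa-rup": "rup",       # Aromanian
    "zh-classical": "lzh",  # Classical Chinese
    "zh-min-nan": "nan",    # Min Nan
    "zh-yue": "yue",        # Cantonese
}


def iso_code_for_subdomain(subdomain):
    """Return the ISO 639-3 code listed for a subdomain, or None if not listed."""
    return SUBDOMAIN_TO_ISO.get(subdomain.lower())


print(iso_code_for_subdomain("als"))
print(iso_code_for_subdomain("eml"))
```

Returning None rather than echoing the input avoids suggesting that an unlisted subdomain code is itself a correct language code.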
Miscellaneous:
tokipona – defunct Wikipedia subdomain
ru-sib – defunct Wikipedia subdomain, hoax in fictional “Siberian” language
be-x-old – fixed and redirected to be-tarask Wikipedia subdomain (see phab:T11823)
ms
Local name: Bahasa Melayu
There are many individual languages under "ms"/"msa", including Indonesian ("id"/"ind"), Banjar ("bjn"), and Minang ("min"), three living languages with their own Wikimedia projects, as well as Malay as an individual language ("mly", deprecated in 2008; "zlm", Malay; or "zsm", Standard Malay / Malaysian Malay / Malaysian language).
Note that the Malay Wikipedia, Wikibooks, and Wiktionary all predate the change in the language code on 18 February 2008; the most recent of them, Malay Wikibooks, was created on 24 August 2004.
See also:
ak
Local name: ak
de-formal
Local name: Deutsch
nl-informal
Local name: Nederlands
The special language code qqx can be used to display the ids of all system messages used on a page.
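One common way to apply qqx is through MediaWiki's uselang URL parameter, which this page does not mention and which is brought in here as an assumption. The following sketch adds uselang=qqx to a page URL with Python's urllib.parse; the function name with_qqx and the example URL are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_qqx(url):
    """Return the URL with its uselang query parameter set to qqx."""
    parts = urlsplit(url)
    # Drop any existing uselang value, keep all other parameters in order.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "uselang"
    ]
    query.append(("uselang", "qqx"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Example URL chosen for illustration only.
print(with_qqx("https://meta.wikimedia.org/wiki/Special:RecentChanges"))
# https://meta.wikimedia.org/wiki/Special:RecentChanges?uselang=qqx
```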