This is an archive of past discussions about Template:Lang. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
List of most-transcluded nonexistent ISO 639 templates
Following up on the section above, here's the list of the most-transcluded ISO 639 templates. To see the full list with the links to articles, visit Special:WantedTemplates and search for "ISO 639". All of these templates were redlinked when I posted them here; the blue-linked ones have since been created.
{{ISO 639 name rtl}} (29 links) – this is the "right-to-left" parameter for {{lang}}: should be changed to rtl=yes in individual pages; fixed in mainspace
{{ISO 639 name syrc}} (29 links) – Should be Assyrian Neo-Aramaic ("aii-Syrc")
{{ISO 639 name ml-Mlym}} (7 links) – redundant script, should just be "ml"; fixed in mainspace
Some are typos. Some may need to be created. Others need to be fixed. An error category would make it easier for us to find and fix articles that are using these nonexistent templates. – Jonesey95 (talk) 04:00, 29 September 2016 (UTC)
Hey! I recently implemented a few hundred {{lang}}s, and I'm pleased to say that I don't think I'm responsible for much of the above. However, I probably did most or all of the ja-latns. My intention was to tag them as being in Japanese, but transliterated into Latin characters, not in the native script. I thought ja-latn was the way to show this. How should I have tagged these? Thanks. Phil wink (talk) 23:56, 30 September 2016 (UTC)
Thanks. I've fixed the 14/16 ja-latn which were mine, plus one more I found (Haedong, a disambiguation page which also contained a few other -latns), but I was unable to find the 16th case. Cheers. Phil wink (talk) 17:35, 2 October 2016 (UTC)
There is a link above to Special:WantedTemplates that will take you to a page with links to all transclusions of each nonexistent template. I followed it and found that only this page and a user sandbox page use the ja-latn template now. Good work! – Jonesey95 (talk) 18:50, 2 October 2016 (UTC)
Module solution
It seems odd to have so many templates just to return language names. It would be neat to have a module. Then all the codes and language names can easily be viewed and modified. I'm new to module coding, but it's such a simple thing that it shouldn't be too hard to make one. — Eru·tuon00:20, 1 October 2016 (UTC)
Agreed, this will be much better. However, writing a module to look up the codes might be trivial, but rewriting all the templates that depend on that definitely isn't. Uanfala (talk) 00:29, 1 October 2016 (UTC)
can get you a code's matching language name. It isn't perfect because it doesn't include all of the ISO 639 codes. For example, the above returns 'Arabic' but the valid ISO 639-2 code 'ara' returns an empty string.
That source appears to be only ISO 639-3 codes that are not part of 639-1 or 639-2. This data file, for example, does not list code 'ara' which is both 639-2 and 639-3.
@Trappist the monk: Thanks for the suggestion! I wasn't aware that was possible. I just implemented it, though I replaced mw.getContentLanguage():getCode() with 'en'. Doing that might save a little processing time. — Eru·tuon19:26, 1 October 2016 (UTC)
Now the module's Wiktionary function is pretty reliable, so {{wikt-lang}} now uses it. I'll see if I can make it perform the functions of {{lang}} next; as of yet it just adds language attributes. — Eru·tuon22:34, 2 October 2016 (UTC)
The {{COinS safe|n}} notice at the top is contradictory to the documentation text "To suppress this – e.g. when using {{lang}} within a wikilink or the title parameter of a citation – add the parameter |nocat=true.". Either "or the title parameter of a citation" needs to be removed, or the {{COinS safe|n}} notice needs to be relaxed accordingly. ~Tom.Reding (talk ⋅dgaf)15:43, 2 November 2016 (UTC)
Rendering Hebrew
Forgive me if this should be addressed somewhere else, but this was the most fitting place I could find. It puzzles me that Hebrew text renders differently (at least in my browser) depending on whether it's formatted with {{hebrew}} or {{lang|he}}. Should the latter not imply the former? Take for instance the first words of the shema':
In the first case, it's rendered nicely in a suitable Hebrew-friendly font (SBL Hebrew, I think), whereas the latter example renders in some ordinary standard-font not well suited to Hebrew. Often in a given Wikipedia article occurrences of Hebrew words will be templated differently (and thus rendered in different fonts), which gives a very haphazard and inconsistent look to the reader. This can hardly be intended? Am I missing something? ——Pinnerup (talk) 17:00, 16 February 2017 (UTC)
Short answer: {{hebrew|...}} specifies a list of fonts for your browser to pick from; {{lang|he|...}} doesn't, it merely declares the enclosed text as being in the Hebrew language, and leaves it up to your browser to pick a suitable font from those installed.
In the first of these, the class="script-hebrew" attribute doesn't do anything special; the dir="rtl" specifies right-to-left text; and in the style="..." attribute, the font-size: 115%; declaration is self-explanatory, whilst the font-family: Alef, 'SBL BibLit', 'SBL Hebrew', 'David CLM', 'Frenk Ruehl CLM', 'Hadasim CLM', Cardo, Shofar, David, 'Ezra SIL', 'Ezra SIL SR', 'Noto Sans Hebrew', FreeSerif, 'Times New Roman', FreeSans, Arial; declaration specifies a list of fonts which might be suitable for the enclosed text - they are listed in descending order of priority, so if your browser is like mine and doesn't have any of the first thirteen installed, but does have Times New Roman, it will use that font.
In the second of these, the lang="he" attribute specifies the Hebrew language; the xml:lang="he" does the same and is in fact redundant. Notice that no font is specified, nor even the script direction, which is implied by context. The purpose of the {{lang}} template is to declare a language for the enclosed text, nothing more.
Thank you for your very informative and comprehensive answer! That explains a lot. After reading it I found that with some fiddling I was able to configure my browser to display text marked with lang="he" in a suitable Hebrew font. I take it best practice is to only use {{lang|he}}? I notice, however, that {{lang-he-n}} calls {{hebrew}}. Should I avoid using that? —Pinnerup (talk) 13:54, 19 February 2017 (UTC)
Please change the way that articles are categorized by the "unknown language" check by changing [[Category:Articles containing unknown ISO 639 language template]] to [[Category:Articles containing unknown ISO 639 language template|{{{1}}}]].
I believe that this will sort pages in the tracking category by the value of the language code that is used. This will help me (and other gnomes) identify groups of articles that have a common error or a common language for which a template needs to be created. Thanks. – Jonesey95 (talk) 07:16, 3 March 2017 (UTC)
{{Lang-nl-BE}} doesn't work, but should, and should render as "Flemish (Belgian Dutch)" probably. Some linguists classify Flemish as a language (or even multiple languages), not a dialect or dialect continuum of Dutch (much the way Scots is considered a separate language from English, not a variant of it), but I think we're stuck with nl-BE for now. ISO defines separate codes for two forms of Flemish, West Flemish and Limburgish, but not the other two, nor Flemish as a whole. Many sources for this or that which we may need to mark up with a template are not specific and just say "Flemish", so it's going to be original research for a Wikipedian to try to use one of the more specific ISO labels. However, just using {{lang-nl}} is inaccurate and a disservice to readers. Ergo, we need {{lang-nl-BE}}. — SMcCandlish ☺☏¢ ≽ʌⱷ҅ᴥⱷʌ≼ 22:10, 21 August 2017 (UTC)
nl is ISO 639-1 language code for Dutch; BE is ISO 3166-1 country code for Belgium. The 639 table also lists Flemish as a code nl language. See also code nld @ sil.org.
Flemish contributes extensively to the size of Category:CS1 maint: Unrecognized language because it is-a-language-that's-not-a-language. For cs1|2, we could modify Module:Citation/CS1 to accept |language=Flemish but we also require that there be a code from which we can render a language: |language=de → (in German). Code nl will always render as Dutch so we could, in lieu of decision from recognized authorities make up our own. nl-BE would work if there is nothing better.
I have created two templates and a category to begin support for this language/dialect. Let me know if more are needed. – Jonesey95 (talk) 01:06, 22 August 2017 (UTC)
Is there any guidance as to whether this template should be used for Fraternities? For example, should it simply be ΦΒΚ or should it be ΦΒΚ (from {{lang|el|ΦΒΚ}})? Naraht (talk) 08:56, 23 September 2017 (UTC)
Completely incorrect advice
Presently there is some text that says:
Do not use quotation marks in your user style sheet; they may be misinterpreted as wikitext. While they are recommended in CSS, they are only required for font families containing generic-family keywords ('inherit', 'serif', 'sans-serif', 'monospace', 'fantasy', and 'cursive'). See the W3C for more details.
This is wrong on either every detail or almost every detail.
Quotation marks are not needed around generic family keywords. They're only needed around actual font names that contain spaces or other non-alphanum characters, which is a lot of them, or start with digits. These quotation marks are not optional in such a case, though many browsers gracefully decline to choke to death if you leave them out. They're usually double-quotes not single-quotes, unless one is using inline CSS inside HTML, e.g. in <span style="font-family: 'Lucida Sans Unicode', sans-serif;">...</span> or whatever.
If it actually is true that "quotation marks in your user style sheet ... may be misinterpreted as wikitext", this is a very severe MediaWiki bug which needs to be addressed immediately. If this were the case, I think we would have heard about it by now. — SMcCandlish ☺☏¢ ≽ʌⱷ҅ᴥⱷʌ≼ 03:48, 29 September 2017 (UTC)
It goes back twelve years, to this edit at 09:04, 27 January 2005 (UTC) by Mzajac (talk·contribs). In those days, template documentation was on the talk page - we didn't use /doc subpages. Subsequent relevant edits include:
Maybe back in 2005 the MediaWiki software behaved differently to now. Or perhaps it was for browsers with an incomplete or improper implementation of CSS 2.1, such as Internet Exploder 6. CSS 2.1 is still largely current: the relevant document in CSS 3 is CSS Fonts Module Level 3 (3 October 2013), which being a W3C Candidate Recommendation is not yet a full W3C Recommendation. The section concerned is 3.1 Font family: the font-family property. --Redrose64 🌹 (talk) 11:15, 29 September 2017 (UTC)
Redrose64 has indicated the latest advice from W3C. A summary of that advice is:
There are two types of font names: family and generic.
Generic names are serif, sans-serif, cursive, fantasy, and monospace – these fonts are supplied by the user agent (browser) itself.
It is recommended that the list of fonts supplied to font-family has a generic name as the last entry to allow a guaranteed fallback should none of the named family fonts be available to the user agent. The generic font name at the end of the list must not be quoted.
It is recommended that family font names that contain spaces, digits or punctuation (other than hyphens) are quoted.
Font family names that contain the following words must be quoted: inherit, serif, sans-serif, monospace, fantasy, cursive, initial and default.
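As a rough illustration of the summary above, here is a minimal Python sketch that builds a font-family value following those quoting rules. The helper names quote_family and font_family are hypothetical, and the rules encoded are only the ones listed in this summary, not a full reading of the W3C text.

```python
import re

# Generic family keywords from the summary above; as the final fallback
# they must stay unquoted.
GENERIC_FAMILIES = {"serif", "sans-serif", "cursive", "fantasy", "monospace"}

# Words that, per the summary, force quoting when they occur in a family name.
RESERVED_WORDS = GENERIC_FAMILIES | {"inherit", "initial", "default"}


def quote_family(name):
    """Quote a named font family where the summary requires or recommends it."""
    words = name.lower().split()
    if any(word in RESERVED_WORDS for word in words):
        return f"'{name}'"
    # Spaces, digits, or punctuation other than hyphens: quoting is recommended.
    if re.search(r"[^A-Za-z-]", name):
        return f"'{name}'"
    return name


def font_family(families, generic="sans-serif"):
    """Join named families and append an unquoted generic fallback."""
    if generic not in GENERIC_FAMILIES:
        raise ValueError(f"not a generic family: {generic}")
    return ", ".join([quote_family(f) for f in families] + [generic])


print(font_family(["Alef", "SBL Hebrew", "Times New Roman"], "serif"))
```

Single quotes are used here because, as noted earlier in this thread, they are the convenient choice when the value ends up inside a double-quoted style="..." attribute.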
How to display a Japanese ellipsis without resorting to ・・・?
Hello, I would like to display … as wikipedia does here. Wikipedia uses a template with "span lang", so I've tried to do it the same way, but the result is this. Is there any way to display … as in the aforementioned article? Seelentau (talk) 20:30, 1 July 2017 (UTC)
@Redrose64: I would like to use it in a wikia-wiki, where the lang-template doesn't exist. And … is displayed at the bottom of a letter height, whereas on said wikipedia page, the ellipsis is displayed in a similar fashion as ・・・ (which is just three ・). I would like to avoid using ・・・, but simply using … comes up as … for me. Seelentau (talk) 20:55, 1 July 2017 (UTC)
Yup, was just about to write that. :) By changing it to sans-serif, it works: <span lang="ja"><font face="sans-serif">…</font></span> produces …Seelentau (talk) 21:56, 1 July 2017 (UTC)
You can contract that - <span lang="ja" style="font-family:sans-serif">…</span> produces … - another reason for doing so is that the font element is obsolete. --Redrose64 🌹 (talk) 22:03, 1 July 2017 (UTC)
Ah, okay, I will do that, thank you! :) But one more problem is that I can't use it in article titles. For example, the song name "Unknown…Despair…a Lost" has to be titled "Unknown…Despair…a Lost". Is there no character that is actually … and not a modified …? Seelentau (talk) 22:15, 1 July 2017 (UTC)
Set the article title without attempting to style it. Then, at the top of the article, use {{DISPLAYTITLE:Unknown<span lang="ja" style="font-family:sans-serif">…</span>Despair<span lang="ja" style="font-family:sans-serif">…</span>a Lost}} - see mw:Help:Magic words#Technical metadata. --Redrose64 🌹 (talk) 22:28, 1 July 2017 (UTC)
Oh yes, displaytitle exist, completely forgot^^ I will do that, but do you know why the ellipsis is actually displayed this way? The background of all of this is Japanese, by the way, and their ellipsis is always displayed as …, but when I copy it (for example, from here), it's simply …. Then again, for some, … is displayed as … from the start... does Firefox not support that? Seelentau (talk) 22:36, 1 July 2017 (UTC)
@Seelentau: It looks like this has to do with the font that the browser chooses. In my browser (Chrome), the ellipsis character is displayed in the font Meiryo when marked as Japanese (…), but in the font Arial otherwise (…). To avoid this inconsistency, you might be able to use the "midline horizontal ellipsis" character (U+22EF, ⋯) instead of the horizontal ellipsis (U+2026, …), which displays the correct way even in Arial. But note that that is probably technically incorrect because U+22EF is in the mathematical operators block and categorized as a symbol rather than a punctuation character (FileFormat.info page). — Eru·tuon01:17, 11 October 2017 (UTC)
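For anyone who wants to check the symbol-versus-punctuation point themselves, a small Python sketch using the standard unicodedata module reports the name and general category of both characters. The describe helper is just an illustrative name.

```python
import unicodedata


def describe(ch):
    """Return (code point, Unicode name, general category) for one character."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch)


# "Po" is punctuation (other); "Sm" is symbol (math).
for ch in ("\u2026", "\u22ef"):
    print(*describe(ch))
```

This says nothing about which font a browser will pick; it only shows how the two characters are classified.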
Recent change
WOSlinker has recently changed some (or all?) lang templates to use html for italics. Has that change been discussed anywhere? Does it improve anything? Because it has caused some problems: forms such as {{langx|it|'Livorno'}} now display as 'Livorno' instead of Livorno. Could someone please fix this (or, if the change is an important improvement, give some hint as to how the various affected pages could be tracked down and fixed)?
What we really need is a font style parameter for this template (yes, I know that bold is technically a font weight); while italics are commonly used for words in other languages, they are not used for proper names – and the templates are often used for proper names, which sometimes need to be bold-faced. Is there any reason why this couldn't or shouldn't be implemented? Justlettersandnumbers (talk) 10:21, 16 October 2017 (UTC)
Hmm, it looks as if this should have been discussed before it was implemented. May I suggest that, pending the outcome of such a discussion, someone with smart rollback and template editor permissions roll back these edits (which I think are these (about 369), plus one a little earlier and three the previous evening, two of them not to {{lang}}-foo templates)? That'd fix the errors for now, without prejudice to doing this properly if there's consensus that it is what's wanted. Justlettersandnumbers (talk) 13:27, 16 October 2017 (UTC)
I would do it, but I have been brought to ANI for unbreaking templates in the past, and there I was accosted by administrators who would not read, could not read, or both. I learned my lesson from that experience, which was "It's better to be happy than right". I support reverting these changes, however. I would like to hear back from WOSlinker, though, whose editing knowledge and skills I respect greatly. – Jonesey95 (talk) 14:07, 16 October 2017 (UTC)
I've changed all my edits on the lang templates back to the wiki style italics. There are two lang templates still using the html style but I've never edited those. -- WOSlinker (talk) 13:01, 17 October 2017 (UTC)
The subtag registry identifies yue as a language code. See Yue language where yue is identified as the ISO 639-3 language code.
{{lang|yue|係}} → 係
It may once have been true that the correct code was zh-yue (language subtag with an extlang subtag). According to the subtag registry, that is no longer true and the preferred subtag is yue. {{lang}} and Module:lang do not currently (may never) support extlang subtags.
Thanks, I see that it’s deprecated now. I can’t remember ever using it, but recalled it from a previous browse of the registry and so thought it OK. I guess with the template previously passing through anything, but now actually checking what’s passed to it, there will be quite a few like this to fix.--JohnBlackburnewordsdeeds16:12, 19 November 2017 (UTC)
I've been keeping an eye on Category:Lang and lang-xx template errors looking for indications that the new module is doing the wrong things. I haven't seen any zh-yue errors. But, I have seen plenty of zh-han and zh-t errors. grc-gre is fairly common as is jp.
{{langnf}} is calling {{lang}} without providing a valid ISO 639 language code. {{lang}} requires (has always required) a language code so that it knows how to correctly supply html markup for the text. In this template, the first positional parameter, {{{1}}}, the language code, is empty:
{{langnf||Hebrew|"The Hope"}}
Many might leap to the conclusion then that they should add the language code that matches the language name. They would be wrong. In this case, the correct code is en because "The Hope" is English.
It appears that the documentation for {{langnf}} is inadequate. I can also imagine, though have not given it sufficient thought to recommend, that in {{langnf}} this line:
}} for {{Lang|{{{1|}}}|{{{3}}}|rtl={{{rtl|}}}}}<noinclude>
might be changed to:
}} for {{Lang|{{{1|en}}}|{{{3}}}|rtl={{{rtl|}}}}}<noinclude>
if {{{3}}} is usually an English translation. If {{{3}}} is always an English translation then there is no need for {{lang}} in {{langnf}}
Perhaps Editor Hyacinth, the original author of both {{langnf}} and its documentation, can be persuaded to revisit that template.
It seems that it used to work in the past without a language code in the first parameter, though. If you go to Template:Language with name/for today, you can see (or at least I can see) that the examples in the documentation do not produce errors. If you null-edit the template, they will produce errors. I have not yet looked at the old {{lang}} code to puzzle out this apparent effect. – Jonesey95 (talk) 23:51, 19 November 2017 (UTC)
I put the old lang template code in Template:lang/sandbox2 and used that template in the sandbox for {{langnf}}. It appears that leaving the language blank did not produce an error in the past:
{{lang/sandbox2||Foo}} → {{lang/sandbox2||Foo}}
{{Language with name/for/sandbox||2=German|3=[[Thuringia]]}} → Error: {{language with name/for}}: missing language tag or language name (help)
{{Language with name/for/sandbox|en|2=German|3=[[Thuringia]]}} → German (English for 'Thuringia')
{{Language with name/for||2=German|3=[[Thuringia]]}} → Error: {{language with name/for}}: missing language tag or language name (help)
{{Language with name/for|en|2=German|3=[[Thuringia]]}} → German (English for 'Thuringia')
It appears to me that even though the language name in {{lang}} was listed as a Required parameter, there may not have been code that enforced that requirement. Still researching. – Jonesey95 (talk) 00:06, 20 November 2017 (UTC)
seems that it used to work. The output of this particular example when rendered by the previous version of {{lang}} looks like this:
[[Hebrew language|Hebrew]] for <span lang="" >"The Hope"</span>
The purpose of {{lang}} is to indicate that the text belongs to a language. If the language is going to be English there isn't much sense in calling {{lang}}, no need to wrap the text in <span>...</span> tags. Because {{lang}} expects to have a language code so that it can do its job, the module whines and complains when that important piece is missing. {{lang}} cannot know that the template that's calling it doesn't really need its services. So it complains.
But, that won't work because the value assigned to {{{1}}} is not an ISO 639 language code. Module:lang rejects 'Arabic' because it expects a code, not a language name so instead of rendering bogus html it emits an error message.
The quick fix? There are at least two and probably more.
|lang2=ar
|lang2_content=مسجد داليان<br/>(''Masjid Dālyān'') – were it me, I would remove the <br /> and the transliteration because that is left-to-right and Latn script.
For the time being, because there are a lot of {{lang-??}} templates that call {{lang}} and a lot of them impose italics on the 'text', the italic detection and associated error messages are disabled; in future the error message will be back.
A long-term fix to properly support the transliteration of the Arabic is needed and will require modifications to {{Infobox Chinese}} and {{Infobox Chinese/Blank}}.
{{Infobox Chinese/Blank}}. Ugh. I keep intending to have a go at rewriting Infobox Chinese to use Lua, but every time I’ve looked at it I’ve been put off by things like that. It’s not even clear that belongs in the template, which seems to have grown to do too much over the years. People don’t notice or object as most fields in the template default to hidden, but if they are hidden so no-one sees them they probably aren’t all needed.--JohnBlackburnewordsdeeds13:30, 20 November 2017 (UTC)
The code grc-gre is not a valid IETF language tag (see my comments at "Module:Language/data/wp languages" ?) so Module:lang emits an error message because it cannot make sense of the 'code'. There is a related template, {{lang-grc-gre}}, which has documentation that, to this reader, is far from clarifying. That template does not emit an error because it drops the -gre thing and calls {{lang}} with only the IANA/ISO 639-3 language code.
It does not appear that grc-gre is a Linguist List code so I would guess that someone here at en.wiki created it.
It might make sense to treat any foo-bar as just foo any time the foo-bar combo doesn't resolve. I'm skeptical we can prevent people adding -bar instances that don't resolve to something in our table, since they're introduced (albeit slowly) all the time, e.g. in linguistics papers. PS: grc-gre was previously discussed in an older thread, above: #zh-yue. — SMcCandlish☏¢ >ʌⱷ҅ᴥⱷʌ< 22:27, 20 November 2017 (UTC)
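The "treat foo-bar as just foo" fallback suggested above could look roughly like this in Python. It is only a sketch: KNOWN is a hypothetical stand-in for the module's language table, with two illustrative entries, and resolve is an invented name.

```python
# Hypothetical stand-in for the language table; not the real module data.
KNOWN = {"grc": "Ancient Greek", "sco": "Scots"}


def resolve(tag, table=KNOWN):
    """Return (code_used, name); fall back from foo-bar to foo when needed."""
    tag = tag.lower()
    if tag in table:
        return tag, table[tag]
    base = tag.split("-", 1)[0]  # keep only the part before the first hyphen
    if base in table:
        return base, table[base]
    return None, None  # nothing resolves, so the caller should report an error


print(resolve("grc-gre"))
print(resolve("xx-yy"))
```

One cost of this design is that a mistyped suffix is silently dropped rather than reported, which may or may not be what is wanted.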
Clearly we differ on what constitutes a 'discussion'. At #zh-yue, I merely mentioned grc-gre as a common cause of error messages displayed by Module:lang.
For templates that truly don't use IETF tags, I think there is nothing to 'fix'.
There is one, {{lang-ca-valencia}} that has prompted me to tweak Module:lang/sandbox so that when the IETF language tag includes a variant, the module fetches the language name from the variants table:
I'm having second thoughts about this sandbox tweak. Consider:
{{#invoke:lang/sandbox|lang_xx_italic|code=pt-ao1990|text=some pt text}} → Portuguese: some pt text
[[Portuguese language|Portuguese]]: <i lang="pt-ao1990">some pt text</i>[[Category:Pages using Lang-xx templates]]
Not what we really want. Perhaps an alternate language parameter that when concatenated with ' language' can be used to replace the default language name (from the data tables) so that we link to the variant language name article.
From my previous work with the ISO 639 name templates, my experience is that every language and dialect has an article or redirect at "XXXX language". The templates and categories depended on that construction. I messed with hundreds of those templates, and I do not recall encountering any missing articles or redirects. See, for example, Middle Scots language, which would be the destination for "sco-smi" below if we could put in some sort of override.
I think you meant Module:lang/data and in particular the override table.
You added sco-smi which looks like a valid IETF language tag but is not. Were it valid, smi would be listed as an extlang in the IANA language-subtag-registry file. At present, there are no plans to support extlangs because there are preferred language codes for all of the existing extlangs.
Because Module:lang expects a valid IETF language tag, it emits an error when it disassembles sco-smi into its separate parts, the code sco and this other thing smi which doesn't match the required patterns for script (4 letters), region (2 letters or 3 digits), or variant subtags (4 digits or 5–8 alphanumeric characters).
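To make the disassembly step concrete, here is a minimal Python sketch that splits a tag using the subtag patterns as described above. It is an assumption-laden illustration, not the code in Module:lang: split_tag is a hypothetical name, extlang and private-use subtags are ignored, and the variant pattern follows the wording above (BCP 47 itself also allows a digit followed by three alphanumerics).

```python
import re

# Subtag patterns as described above (assumed, not copied from Module:lang).
LANGUAGE = re.compile(r"[a-z]{2,3}")
SCRIPT = re.compile(r"[a-z]{4}")
REGION = re.compile(r"[a-z]{2}|[0-9]{3}")
VARIANT = re.compile(r"[0-9]{4}|[a-z0-9]{5,8}")


def split_tag(tag):
    """Split a tag into language, script, region, variant; reject leftovers."""
    parts = tag.lower().split("-")
    if not LANGUAGE.fullmatch(parts[0]):
        raise ValueError(f"unrecognized language code: {parts[0]}")
    result = {"language": parts[0], "script": None, "region": None, "variant": None}
    rest = parts[1:]
    # Subtags must appear in this order; each one is optional.
    for key, pattern in (("script", SCRIPT), ("region", REGION), ("variant", VARIANT)):
        if rest and pattern.fullmatch(rest[0]):
            result[key] = rest.pop(0)
    if rest:
        raise ValueError(f"unrecognized subtag: {rest[0]}")
    return result


print(split_tag("pt-ao1990"))
try:
    split_tag("sco-smi")
except ValueError as err:
    print(err)  # smi matches none of the script, region, or variant patterns
```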
It may be that we will want to create a table that specifically holds Linguist List codes so that we can handle them. The question that I have about any of these codes that are not in language-subtag-registry file is: What to put in the lang="" attribute of the enclosing <span>...</span>? Browsers and screen readers probably don't know about (aren't required to know about) 'private' language codes that aren't in the registry.
Sorry, I should have linked to the documentation in question. I meant Module:Lang/doc, which refers to files that apparently do not work. Should those references be removed in order to avoid confusion? – Jonesey95 (talk) 16:38, 20 November 2017 (UTC)
What do you mean by files that apparently do not work?
The only place where code 1ca is defined for Module:lang is in Module:lang/data in the override table. That code works as it should (there is no extlang tacked onto it):
{{#invoke:lang|lang_xx_inherit|code=1ca|text=كَیکاوس|rtl=yes|italic=no}} → [كَیکاوس] Error: {{Lang-xx}}: unrecognized language tag: 1ca (help)
Adding sco-smi to the override table doesn't work because the module has extracted the language code (sco) from it and cannot find that in the override table and there is, at present, no mechanism to make the module search for the (invalid) extlang either alone (smi) or in combination with the language code (sco-smi).
Actually, it does. Module:Language/name/data creates a single table from the /wp languages, /ISO 639-3, and /iana languages modules. The first module read is /wp languages. First language 'code' into the composite table wins so when code exists in all three of the data modules, only the code and data in /wp languages is used. For example, code gem is present in both /wp languages and in /iana languages. Module:Language/name/data reads /wp languages first so its value for code gem (Proto-Germanic) is the value used by Module:lang; not the 'official' Germanic languages:
{{#invoke:lang|lang_xx_italic|code=gem|text=Example text}} → Germanic languages: Example text
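The "first code into the composite table wins" behaviour described above can be sketched in Python as follows. The three dictionaries are tiny hypothetical stand-ins for the /wp languages, /ISO 639-3, and /iana languages data, and build_composite is an invented name.

```python
def build_composite(*sources):
    """Merge tables in order; the first source that defines a code wins."""
    composite = {}
    for source in sources:
        for code, name in source.items():
            # setdefault keeps an existing entry, so later sources cannot override it.
            composite.setdefault(code.lower(), name)
    return composite


# Hypothetical stand-ins for the three data modules, read in this order.
wp_languages = {"gem": "Proto-Germanic"}
iso_639_3 = {"aii": "Assyrian Neo-Aramaic"}
iana_languages = {"gem": "Germanic languages", "nl": "Dutch"}

names = build_composite(wp_languages, iso_639_3, iana_languages)
print(names["gem"])
```

Swapping the order of the arguments is all it takes to make the IANA name win instead, which is why the read order matters.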
The 'codes' that do not work in /wp languages are the hyphenated codes.
I am writing this mostly as a note-to-self before I jump on the plane to Elsewhere.
It occurs to me that we can make use of the IETF language tag's support for private-use subtags. So, subtags that we have invented, like code grc-gre, might be renamed grc-x-gre. When the module sees the -x-subtag, it knows that subtag is non-standard and will look for that code in a special wp_private_subtags table for the language name. Someone else apparently had a similar idea because be-x-old exists in Module:Language/data/wp languages.
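A rough Python sketch of the private-use idea above, under stated assumptions: the table name follows the wp_private_subtags suggestion in the note, its contents are placeholders rather than real data, and private_name is a hypothetical helper.

```python
# Hypothetical table; the names are placeholders, not real module data.
WP_PRIVATE_SUBTAGS = {
    "grc-x-gre": "placeholder name for grc-x-gre",
    "be-x-old": "placeholder name for be-x-old",
}


def private_name(tag, table=WP_PRIVATE_SUBTAGS):
    """Return (language, private_subtag, name) for a language-x-private tag.

    Returns None when the tag has no -x- section, so the caller can handle it
    as an ordinary tag.
    """
    tag = tag.lower()
    language, sep, private = tag.partition("-x-")
    if not sep or not private:
        return None
    name = table.get(tag)
    if name is None:
        raise KeyError(f"unknown private-use tag: {tag}")
    return language, private, name


print(private_name("grc-x-gre"))
print(private_name("grc"))
```

Because the -x- form is syntactically valid, the whole tag could presumably go into the lang="" attribute, though whether browsers and screen readers do anything useful with the private part is a separate question.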
Another thing we might do, if and when we add support for label control (see #Wish list for future enhancement), is to overload that parameter so that |label=none hides all labels (language name, transliteration and translation static-text) and |label=name displays a name different from the name usually associated with the code. I'm not sure if there are any real benefits to this particular idea.
Possibly the usefulness for the |label= idea: the label provided by the current {{lang-de-CH}} is 'Swiss German' but the language name retrieved from the module's language tables is 'German' so in that template we might write: {{#invoke:lang|lang_xx_italic|code=de|label=Swiss German}}.
This would also be useful for suppressing the appearance of the same word in successive instances of the template (Foo, Bar, and Baz Quuxian, instead of Foo Quuxian, Bar Quuxian, and Baz Quuxian). Also useful for cases where one ethnic or national group uses one name for the language and a neighboring one does different. — SMcCandlish☏¢ >ʌⱷ҅ᴥⱷʌ< 18:15, 21 November 2017 (UTC)
Good day! Why do we use italics in this template, but not in others? Take a look here. However, when I tried using the template here, it didn't italicize: Uzbek: Oʻzbek gimnaziyasi/Ўзбек гимназияси; Russian: Узбекская гимназия; Kyrgyz: Өзбек гимназиясы. But the uz template does italicize its content in articles. Can you change it so that it doesn't? Shouldn't these templates be uniform? Nataevtalk15:25, 28 November 2017 (UTC)
I changed the template a couple of hours ago. If you click Edit on an article and then save it without making any changes (this is called a "null edit"), the italics should go away. – Jonesey95 (talk) 15:51, 28 November 2017 (UTC)
There is more to it than that. The example holds text in two scripts when it should not (Uzbek can be / has been written in three):
{{langx|uz|Oʻzbek gimnaziyasi/Ўзбек гимназияси}}
The first part of that text (left of the solidus) uses Latin script, the second part uses Cyrillic script. The Latin should be italicized but the Cyrillic should not be. {{Lang}} and the {{lang-??}} templates do not support more than one script simultaneously (there is an expectation that in future, multiple scripts will be supported; see #Wish list for future enhancement). The third Uzbek script is Arab which, unlike Latn and Cyrl, is written right-to-left so requires special handling.
We are in a transition so there is a mix of old and new. For the time being, I would write the example:
Order reversed because of Editor Jonesey95's template edit. I would note here that that edit doesn't really let lang module manage italics because {{lang-uz}}'s call of {{lang}} (from inside {{language with name}}) doesn't give the module any direction on how italics are to be managed.
This is something that I had intended to do but was leaving to later. I have replaced the non-working variant validation/consolidation code at the bottom of get_ietf_parts() with the working code from the live module. This new snippet also includes more helpful error messaging. Apparently there is something wrong with how Lang/sandbox or Lang/codes/testcases evaluates/renders actual column results. In the three failures, where is the fourth table element?
{{lang-??}} templates using the module are allowed to use |script=, |region=, |variant= to supplement the IETF language tag provided by the template (overriding is not currently permitted but is contemplated). The values supplied by these parameters are validated in get_ietf_parts(), which is why script, region, and variant are each individually made lowercase rather than all at once as you have done in /sandbox.
I don't know what's happening to make the tables have only three elements, but I'll look into it.
I don't understand how what you are saying relates to letter case. Could you clarify? At which point is letter case significant in the function get_ietf_parts? I do notice now that language code is never lowercased, so perhaps letter case for it should be preserved, and an error be triggered for something like {{lang|GrC|...}}. — Eru·tuon20:36, 30 November 2017 (UTC)
In that call, args.script, args.region, and args.variant come from the template parameters |script=, |region=, and |variant= respectively and can be any case (IETF language tags have no standardized case; there is a 'common' way of writing the various subtags that, by some sort of convention, uses particular case – we mimic that in format_ietf_tag())
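To illustrate the 'common' case convention mentioned above, here is a minimal Python sketch. The real format_ietf_tag() is Lua in Module:Lang; the signature and behaviour shown here are assumptions made for illustration only.

```python
def format_ietf_tag(code, script=None, region=None, variant=None):
    """Assemble a tag using the conventional case for each subtag.

    Hypothetical re-creation for illustration; not the module's code.
    """
    parts = [code.lower()]                 # language: lowercase (uz)
    if script:
        parts.append(script.capitalize())  # script: title case (Latn)
    if region:
        parts.append(region.upper())       # region: uppercase (UZ)
    if variant:
        parts.append(variant.lower())      # variant: lowercase
    return "-".join(parts)
```

With this sketch, any input case produces the same display form, for example format_ietf_tag("UZ", "latn", "uz") gives uz-Latn-UZ.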
In get_ietf_parts() (Module:Lang) the parsing (that part you are rewriting) is case insensitive. Once parsed, we look to see if any of the template parameters (|script=, |region=, and |variant=) is set. If any of these is set, and there is no matching subtag in source, then we assign the template parameter's value to the appropriate local variable (lines: 194, 209, and 224). Then we validate. Before we can do that, we down-case the content of the variable in question because the data tables are all indexed with lowercase keys (because of __preprocess() in Module:Language/name/data).
In Module:Lang/sandbox you down-case only the value in source. That works fine for {{lang}} which doesn't support the subtag parameters but won't work for the {{lang-??}} which do/will support them.
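The sequence described above (parse, supplement from the template parameters, down-case, validate) can be sketched in Python as follows. This is only an illustration: the tables SCRIPTS and REGIONS are hypothetical stand-ins for the module's data, variants are left out for brevity, and the parsing is far simpler than what the Lua function does.

```python
# Hypothetical stand-ins for the module's data tables (lowercase keys).
SCRIPTS = {"latn": "Latin", "cyrl": "Cyrillic", "arab": "Arabic"}
REGIONS = {"uz": "Uzbekistan", "rs": "Serbia"}


def get_ietf_parts(source, script=None, region=None):
    """Parse a simple language[-script][-region] tag and supplement it
    with template parameters, validating case-insensitively."""
    subtags = source.split("-")
    code = subtags[0].lower()
    tag_script = tag_region = None
    for subtag in subtags[1:]:
        if len(subtag) == 4 and subtag.isalpha():
            tag_script = subtag
        elif len(subtag) == 2 and subtag.isalpha():
            tag_region = subtag

    # A parameter only supplements the tag; it never overrides a subtag.
    tag_script = tag_script or script
    tag_region = tag_region or region

    # Lowercase each value individually, wherever it came from,
    # because the lookup tables use lowercase keys.
    if tag_script is not None:
        tag_script = tag_script.lower()
        if tag_script not in SCRIPTS:
            raise ValueError(f"unrecognized script: {tag_script}")
    if tag_region is not None:
        tag_region = tag_region.lower()
        if tag_region not in REGIONS:
            raise ValueError(f"unrecognized region: {tag_region}")
    return code, tag_script, tag_region
```

If only source were lowercased, a call such as get_ietf_parts("uz", script="Cyrl") would look up "Cyrl" in a table keyed by "cyrl" and fail, which is the problem described for the {{lang-??}} templates.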
I don't think my changes have fixed anything, though. I used to understand how these templates worked, but with the module-ization still in progress, I'm a bit at sea. – Jonesey95 (talk) 23:26, 1 December 2017 (UTC)
sem is an ISO 639-2 collective. See sem @ sil.org and their definition of Collections of languages – which definition, to me anyway, is rather obtuse. The correct name for sem is 'Semitic languages' and this is the name that {{lang}} was getting from the IANA data table and the name that Module:Lang used when creating the category link for that code. For the time being, I have created an entry in the override table in Module:Lang/data. This will work until someone creates a {{lang-sem}} template by which time we may have figured out how to handle collectives. I'll add notes and TODOs in the appropriate places.
Not sure that this is the venue to discuss the ISO 639 name templates. {{Lang}} and the {{lang-??}} templates are abandoning all of the ISO 639 name templates along with the templates that might have called them.
One issue is that some of these "collections" are language families with a proto-language (for instance, the Semitic languages with Proto-Semitic), and in that case the code for the language family is sometimes used for the proto-language. For example, in {{proto}}, sem is used as the code for Proto-Semitic. Wiktionary distinguishes the proto-language by appending -pro (sem → sem-pro). This is because language and codes must be distinct, as they are both used in etymology templates and there are distinct categories pertaining to each (for example, "Terms derived from Semitic languages" and "Terms derived from Proto-Semitic"). But I don't know if on Wikipedia this polysemy (a code being used for both language family and proto-language) will cause similar problems or what solution would be appropriate. — Eru·tuon00:35, 2 December 2017 (UTC)
This makes my brain hurt. I'm not sure that we care what is done at {{proto}}. There, the code just feeds a {{#switch:}} that chooses a wikilinked article name to precede the 'text'. The template takes no care to properly identify the language in metadata as {{lang}} does.
IANA apparently doesn't recognize proto (that word isn't in the registry). The only 'fit' would be as a variant (5-character length) but because proto isn't registered with IANA, shouldn't the proper form be a private use subtag: sem-x-proto? We should not / must not redefine tags that are already defined by international standards organizations.
cel – IANA name: Celtic languages; WP name: Proto-Celtic, a redirect to Proto-Celtic language (ISO 639-2 collective)
gem – IANA name: Germanic languages; WP name: Proto-Germanic, a redirect to Proto-Germanic language (ISO 639-2 collective)
pgl – IANA name: Primitive Irish; WP name: Proto-Irish, a redirect to Primitive Irish (ISO 639-3 individual)
Of those, pgl should probably be deleted from the WP languages table because it is an ISO 639-3 individual language so we should be displaying 'Primitive Irish' with {{lang-pgl}}. The other two inappropriately redefine the international standards organizations' code/name assignments so if we are to keep them as 'Proto-something' then we should create correct private use subtags cel-x-proto and gem-x-proto.
Just to clarify, I'm not proposing that Wikipedia use the same convention (-pro) as Wiktionary. Wikipedia wants to follow external standards, while Wiktionary is perfectly comfortable with creating its own idiosyncratic hyphen-containing language codes that have nothing to do with IETF subtags. So Wikipedia and Wiktionary are incompatible here. Your idea of a private use subtag sounds more consistent with Wikipedia's preferences. — Eru·tuon06:05, 2 December 2017 (UTC)
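For reference, the private use form discussed above can be sketched in Python. This assumes the BCP 47 rule that a private use sequence starts with the singleton x and that each following subtag is 1 to 8 ASCII letters or digits; the function name private_use_tag is hypothetical and nothing like it is known to exist in Module:Lang.

```python
def private_use_tag(code, label):
    """Build a tag such as sem-x-proto from a language code and a label.

    Hypothetical helper for illustration only.
    """
    # Private use subtags are assumed to be 1-8 ASCII alphanumerics.
    if not (1 <= len(label) <= 8 and label.isascii() and label.isalnum()):
        raise ValueError(f"not a valid private use subtag: {label!r}")
    return f"{code.lower()}-x-{label.lower()}"
```

So private_use_tag("sem", "proto") yields sem-x-proto, and the same call with cel or gem yields the two tags proposed above, leaving the registered meanings of sem, cel, and gem untouched.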
Hi. Can you have a look at Poor Dionis? Two blocks of text, both of which have italics for just one sentence, have vanished, and, looking over the potential fixes, I could find nothing to address the specific problem. Dahn (talk) 12:17, 9 December 2017 (UTC)
I think I've fixed it now (converting on the way from {{lang-xx}} to simply {{lang}} as I don't think it's necessary to add language labels, but feel free to reinstate them using template-external text). Now, the problem (visible in this old revision) was that lang-xx templates were used for an entire paragraph of text, and this text contained within it italicised phrases. The templates assume that the markup is meant for the whole text and see it as an error, but that's a legitimate use. Is there any way to fix that? – Uanfala (talk) 12:44, 9 December 2017 (UTC)
Thank you, that's a very good solution. As for the rest: I'm sure the problem can easily pop up in other templates where the same was used, so maybe it's a good idea to add that to the list of potential script errors? Dahn (talk) 12:52, 9 December 2017 (UTC)
Question re: bolding and ' ' marks
G'day, I use template:lang-sh-Latn a fair bit, and I've noticed it is now rendering with the sh word like this 'bold'. See the Background section of Yugoslav coup d'état for examples. Is there a recent change that has caused this? It doesn't comply with MOS:BOLD etc. Thanks, Peacemaker67 (click to talk to me) 08:25, 27 November 2017 (UTC)
Yes. We're in a transition period where things don't always work as expected. The problem with {{lang-hbs-Latn}} is that its language code, hbs-Latn, includes a script subtag, Latn, that specifies italics and, internally, the template also includes italic markup: ''{{{1}}}'' which together created: ''''{{{1}}}''''. I've converted {{lang-hbs-Latn}} to use the module:
@Trappist the monk: Here's a weirdness, with {{lang|arc-Latn}} and presumably some others:
On this page, this italicizes, but in mainspace the link markup is broken and it's non-italic: [[Frahang-i Pahlavig|{{lang|arc-Latn|hozwārishn}}]] → hozwārishn
On this page, this italicizes, but in mainspace the link markup is broken and it's italic: ''[[Frahang-i Pahlavig|{{lang|arc-Latn|hozwārishn}}]]'' → hozwārishn
On this page, this italicizes, and it renders (italic) in mainspace: {{lang|arc-Latn|[[Frahang-i Pahlavig|hozwārishn]]}} → hozwārishn
On this page, this does not italicize, and it renders (non-italic) in mainspace: ''{{lang|arc-Latn|[[Frahang-i Pahlavig|hozwārishn]]}}'' → hozwārishn
On this page, this italicizes, and it renders (italic) in mainspace: {{lang|arc-Latn|[[Frahang-i Pahlavig|''hozwārishn'']]}} → [[[Frahang-i Pahlavig|hozwārishn]]] Error: {{Lang}}: text has italic markup (help)
On this page, this does bold and 'single quotes' (non-italic), and it renders (the same way) in mainspace: {{lang|arc-Latn|''[[Frahang-i Pahlavig|hozwārishn]]''}} → [hozwārishn] Error: {{Lang}}: text has italic markup (help)
Bold and 'single': {{lang|ar-Latn|''shamia''}} → [shamia] Error: {{Lang}}: text has italic markup (help)
Non-italic: ''{{lang|ar-Latn|shamia}}'' → shamia
None of the {{lang|foo-Latn}} instances should auto-italicize, since {{lang|es}}, etc., do not; only the {{lang-foo}} templates emit italics around Latin script by default.
I think the latter problem is the same as the one reported by Peacemaker67 above, but the namespace problem may be something new. PS: I assume the "bold and 'single'" problem is fixed in the Lua by doing italics directly instead of by wiki ''...'' markup. — SMcCandlish☏¢ >ʌⱷ҅ᴥⱷʌ< 05:29, 28 November 2017 (UTC)
I took a look at the source code. ''[[Frahang-i Pahlavig|{{lang|arc-Latn|hozwārishn}}]]'' results in ''[[Frahang-i Pahlavig|<span lang="arc-Latn">''hozwārishn''</span>[[Category:Articles containing Aramaic-language text]]]]'' in mainspace. Outside of mainspace, the category isn't included, so the link works, and oddly the italics don't cancel out. (Maybe because they have no displayed text between them.) Three ideas: a |nocat= parameter to remove the category, or a |link= parameter to provide the article that the text should link to, or require that the link be placed inside {{lang}}: ''{{lang|arc-Latn|[[Frahang-i Pahlavig|hozwārishn]]}}''. — Eru·tuon08:19, 28 November 2017 (UTC)
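A small Python sketch of why the link breaks, modelled on the output quoted above. The function names render_lang and wrap_in_link are hypothetical, and the mainspace flag is an assumption standing in for whatever namespace check the template performs.

```python
ITALIC = "''"  # wiki italic markup


def render_lang(tag, text, language_name, mainspace):
    """Imitate the quoted template output: an italicized span, plus a
    category link that is only added in mainspace."""
    out = f'<span lang="{tag}">{ITALIC}{text}{ITALIC}</span>'
    if mainspace:
        out += f"[[Category:Articles containing {language_name}-language text]]"
    return out


def wrap_in_link(target, label):
    """Place already-rendered output in the label position of a wikilink."""
    return f"[[{target}|{label}]]"
```

In mainspace the label ends up containing a second [[...]] pair, so the outer wikilink is no longer well formed. Placing the link inside {{lang}}, as in the third idea above, keeps the category outside the link.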
I have taken the liberty of changing the list markup above from unordered to ordered.
Items 1 & 2 render as they do for the reason given by Editor Erutuon: a category link inside a wikilink does not work.
Item 3 renders in italic font because the language code specifies a Latin script.
Item 4 renders in upright font because the Latn script italics are reversed by the external wiki markup.
Item 5 renders in italic font because, I suspect, while there is something between the italic markup in the wikilink and the italic markup provided by the template, that 'something' is not displayable text so the markup is not displayed
Item 6 renders as it does because the template applies italic markup (from the Latn script) to displayable text already wrapped in italic markup
From the very beginning of this experiment, Module:Lang has supported |italic= so you can write:
|italic= always overrides any automatic italic setting.
{{lang}} only auto italicizes when the IETF language tag tells it to italicize with a Latn (case insensitive) subtag.
Those {{lang-??}} templates that have been converted to use Module:Lang, emit error messages when the 'text' to be rendered includes italic wiki markup (presumably to override the wiki markup included in the wikitext templates). That same error message code is available to {{lang}} but is disabled for now because all of the unconverted {{lang-??}} templates call {{lang}} and many of them have hard-coded italic wiki markup which would cause an error message for each of the rendered {{lang-??}} that has hard-coded italic wiki markup.
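The rules in this list can be summarized with a short Python sketch. It is an illustration under assumptions, not the module's logic: the accepted |italic= values are reduced to yes/no, and the apostrophe-run test is a guess at how italic markup might be detected.

```python
import re


def wants_italic(tag, italic=None):
    """Decide whether the text should be rendered in italics."""
    if italic is not None:
        # |italic= always overrides any automatic setting.
        return italic == "yes"
    # Otherwise italicize only when the tag carries a Latn script subtag
    # (compared case-insensitively).
    return "latn" in (subtag.lower() for subtag in tag.split("-")[1:])


def has_italic_markup(text):
    """Return True if the text contains wiki italic markup.

    Assumed rule: a run of exactly 2 apostrophes (italic) or 5 (bold
    italic) counts; a run of 3 is bold only.
    """
    return any(len(m.group()) in (2, 5) for m in re.finditer(r"'+", text))
```

Under these assumptions, wants_italic("arc-Latn") is true, wants_italic("es") is false, and wants_italic("arc-Latn", italic="no") is false. A has_italic_markup() check of this kind is what would let a template report an error instead of stacking its own '' on top of the editor's markup.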
Expected behavior of {{lang}} versus {{lang-xx}} is that the former will always produce non-italic output; it's too often used for proper names which are not italicized in most contexts. Having it switch to producing italicized output when particular language codes are used will confuse and result in wrong output; pretty much no one will remember that one particular kind of case is going to produce different output. We do expect {{lang-xx}} to italicize unless it's non-Latin material, so this auto-handling of -Latn would make sense.
Another thing: the prescribed method of representing italics within material already italicized is to turn the italics off for nested-italics. The obvious way to do this is ''Blah Blah ''Foo'' Yadda Yadda''; MoS probably actually advises this somewhere. But that's not going to work with the Lua version if it spits an error instead. What to do about this?
On the other matter: The purpose of these templates is proper markup and formatting of text. If insertion of an optional and possibly redundant category by some of them is causing these central purposes of the markup to fail, then there needs to be a parameter for adding a category not for removing one - it should not be done by default. Who actually uses "Category:Articles containing Foo-language text" and for what? That's a maintenance/tracking category type, not a category for readers. It could be made invisible or even (old-school style) be moved to the article's talk page. In the end it would make more sense for a bot to detect when a particular {{lang|foo}} etc. has been used in article, using a list of templates and parameters that do this sort of stuff, then add the categories. We don't really need the templates to do it at all. — SMcCandlish☏¢ >ʌⱷ҅ᴥⱷʌ< 00:43, 30 November 2017 (UTC)
{{lang}}, in most contexts, will not produce italic rendering. Only in the special case, where the editor has taken the trouble to explicitly specify a Latin script by including the Latn IETF subtag, will the module apply italic markup.
Yes, the purpose of these templates is proper markup and formatting of text, which statement I would clarify: text supplied to the template; the template cannot know what exists outside the opening {{ and closing }}. The insertion of categories does break the wikilinks for your examples 1 & 2 above. This breakage was also true for the old, non-module version (I recreated Test page with your examples, selected this version of {{lang}}, clicked edit, typed Test page into the Preview page with this template text box, and clicked the adjacent Show preview button). From this experiment, I believe that the Module version of {{lang}} breakage of examples 1 & 2 is not new and existed in the old template. I do not know why the templates add hidden categories, nor do I know if no one uses them or if a lot of editors use them. Clearly, someone thought it important. If you wish to do away with these categories, this is probably not the correct venue.
Oh, I don't care if the categories exist, they just shouldn't be added by these templates if that breaks the main purpose of the templates [sometimes]; a bot can do the categorizing instead. — SMcCandlish☏¢ >ʌⱷ҅ᴥⱷʌ< 07:56, 10 December 2017 (UTC)
Formatting of first line of multiline text
I ran across this problem at Jacques Dutronc#Discography. When there are multiple lines of foreign language text, the wiki syntax of the first line is not properly displayed.
The template seems to have been used without change on the above page for many years, so I assume that something's changed with the template, rather than the article having been wrong for all that time.
Here's a simple example, where the first bullet-point is shown as a standard asterisk, not as a list item:
een
twee
drie
Is there a problem with the template, or is this now the expected behaviour? (I'm not sure it's really being used properly on that page anyway, but maybe that's a different matter.) --David Edgar (talk) 00:28, 22 November 2017 (UTC)
This is an issue with all templates. The fix is to do this {{lang|nl|<nowiki />. However, the template shouldn't be used this way, but should be used for the exact content in the other language: *{{lang|nl|een}}, etc. — SMcCandlish☏¢ >ʌⱷ҅ᴥⱷʌ< 00:49, 22 November 2017 (UTC)
I use {{Lang|xx|...}} often for multi-line text, and this report gave me a fright. Turns out, the behaviour as described seems to be triggered by * (and similar), and surprisingly goes away when <poem>...</poem> is used:
The problem is twofold: one part is that the MediaWiki parser treats the "*" character as invoking a list only if it occurs in certain defined positions, such as the start of a line; the other part is that the module underlying this template strips leading (and trailing) whitespace, which includes newlines. So you need an initial newline, but also need to prevent that from being stripped as whitespace - all that you need is to hide that newline using a character which doesn't test as whitespace, so won't be stripped off; yet one which won't be visible when rendered by the browser. The entity for a space is ideal:
It might be possible to add a hack to fix this in Lua. Say, if the text starts with an asterisk and contains a newline (or newline plus asterisk), assume it's a bulleted list and add a newline at the beginning. That might result in unintuitive behavior in certain cases, though. — Eru·tuon22:02, 27 November 2017 (UTC)
Doesn't have anything to do with * in particular, though. This affects all wikimarkup the effect of which is triggered only by newline followed directly by a special character (#, ;, :, probably others), or by newline then HTML comment then special character. — SMcCandlish☏¢ >ʌⱷ҅ᴥⱷʌ< 00:15, 28 November 2017 (UTC)
That's right. The logic could easily be modified to accommodate #, ;, : along with *. (I think that's all of the list-ish special characters.) It would also be possible, but more costly, to accommodate HTML comments after the newline. I'll test the idea in Module:Lang/sandbox. — Eru·tuon01:03, 28 November 2017 (UTC)
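A rough Python sketch of the idea being discussed, for anyone following along. It assumes the module strips leading and trailing whitespace as described above; the function name prepare_text and the exact test are hypothetical, and HTML comments after the newline are not handled.

```python
LIST_CHARS = "*#;:"  # characters that start list-like wiki markup


def prepare_text(text):
    """Trim the text, then restore a leading newline for list markup."""
    text = text.strip()  # mimics the module's whitespace stripping
    # If the trimmed text begins with a list character and spans several
    # lines, assume it is a list and put a newline back in front so the
    # first item starts at the beginning of a line.
    if text and text[0] in LIST_CHARS and "\n" in text:
        text = "\n" + text
    return text
```

Single-line text such as "*een" is left alone here, which is one of the cases where the behavior might be unintuitive.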
As an FYI about this thread, there is a secondary issue with the above invocation of this template: <span><ul><li></li><li></li></ul></span> is bad HTML and will not render to HTML correctly on MediaWiki at some point in the future, defeating the purpose of the template (it adds items in the misnested HTML5 or the general misnesting lint errors). Since this template presently generates a span, "detecting the issue and fixing it for these limited cases" doesn't fix a secondary issue of the invocation above (and additionally is inconsistent with every other template that does something like the proposed change). I would suggest that Template:Lang should be able to output a <div></div> rather than a span (while the majority of cases are inline, I've seen quite a few where a block lang template would be helpful--poetry is one). --Izno (talk) 20:58, 28 November 2017 (UTC)
If a block lang version of this template (or some sort of switch that flips this template from default inline to block) is required, then you should add that to the feature request list. Now, while basic functionality is still at issue, new features should not be getting in the way.
Can be replaced with {{lang-xx|italic=yes|All the text inside is italicised}}
Or sometimes {{lang-xx|italic=no|All the text inside is unitalicised}} when "xx" is written in Latin script in the first place
{{lang-xx|Тэкст виФ транскрипцион ''Text with transcription''}}
Can be replaced with {{lang-xx|Тэкст виФ транскрипцион}} {{lang|xx-Latn|italic=yes|Text with transcription}}
Could also use "translit" parameter, though that introduces extra WP:LEADCLUTTER which could be undesirable in some cases
{{lang-xx|Name1 ''or'' Name2 ''or'' Name3}}, where the "italics" markup on "or" is actually intended to de-italicise
Could be replaced with {{lang-xx|Name1}} or {{lang|xx|italic=yes|Name2}} or {{lang|xx|italic=yes|Name3}}
But I suspect bot regex replacement wouldn't be safe, probably there's similar-looking cases where something else is intended
Other stuff which will require manual intervention
How shall we go about clearing this backlog? Should I go to Wikipedia:Bot requests, or is someone handling this already? (I think at least the first two cases can be handled by bots, allowing human effort to be focused on more difficult cases). Cheers, 59.149.124.29 (talk) 05:12, 11 December 2017 (UTC)
Thanks for fixing what you've fixed. Unfortunately it isn't quite as you describe.
{{lang-??}} templates do not all default to italic rendering so the italic markup might have been used to negate the default italic or been used to force italics.
it is not always clear that the 'text with transcription' you refer to is a transliteration or is a restatement of the text written in the language's 'other' (often Latin) script (Serbian uses both Cyrillic and Latin, for example; there are quite a few others). When the italicized text is a transliteration, and the static text provided by the {{lang-??}} template is not desired, perhaps a better choice is to use the more semantically correct {{transl}} template.
yeah, I think this is the correct solution assuming that the script used for the non-English text is supposed to be italicized. The rather larger issue with your example is that the original template mixes the English 'or' with the non-English text which is counter to the underlying html markup that identifies all text in {{{1}}} as the non-English language.
I suspect then that fixes, rather than being general, are perhaps easier if done on a per-template basis. Recently I fixed several hundred instances of {{lang-so}} which by default rendered text in an upright font even though Somali is written primarily using a Latin script (this is a case where the template should have been fixed long ago, rather than editors 'fixing' each instance to italicize). I fixed the template to italicize and then used a simple search and replace regex in AWB:
I suspect that something similar may work for a lot of other templates. Of course, hundreds of editors each fixing the templates on pages that they care about could go a long way to clearing the error category. I know, that's being overly optimistic.
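The regex actually used in AWB is not preserved in this archive. As a purely hypothetical illustration of the per-template approach, the following Python sketch removes italic markup that wraps the whole text of a {{lang-so}} call and deliberately leaves anything more complicated for manual review. The pattern and the function name are assumptions, not the original search and replace.

```python
import re

# Matches {{lang-so|''text''}} where the italic markup wraps the entire
# text and the text contains no apostrophes, braces, or pipes.
PATTERN = re.compile(
    r"\{\{\s*lang-so\s*\|\s*''(?!')([^'{}|]+)''\s*\}\}",
    re.IGNORECASE,
)


def strip_redundant_italics(wikitext):
    """Drop italic markup made redundant once the template italicizes."""
    return PATTERN.sub(r"{{lang-so|\1}}", wikitext)
```

For example, {{lang-so|''Soomaaliya''}} becomes {{lang-so|Soomaaliya}}, while {{lang-so|Name1 ''or'' Name2}} is not matched, since cases like that need a human decision.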
I just leave it as an experimental feature for test purposes so that anyone can try it. Since you asked me to explain now, I decided to make a test case for this.
For a while that error may have been there, for all users, not just logged out users – that particular kind of error does not discriminate against logged in vs logged out. But, {{lang|is|striped lady cake}} and the other {{langx|is|Vienna cake}} are improper uses of the templates. Both 'striped lady cake' and 'Vienna cake' are clearly English, not Icelandic, so should not be marked up as such.