<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>blog.i18n.ro &#187; standards</title>
	<atom:link href="http://blog.i18n.ro/tag/standards/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.i18n.ro</link>
	<description>Sorin&#039;s personal blog and website</description>
	<lastBuildDate>Fri, 06 Jan 2012 18:05:02 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Using the simple language and locale codes</title>
		<link>http://blog.i18n.ro/simplified-locale-codes/</link>
		<comments>http://blog.i18n.ro/simplified-locale-codes/#comments</comments>
		<pubDate>Fri, 09 Apr 2010 15:06:42 +0000</pubDate>
		<dc:creator>sorin</dc:creator>
				<category><![CDATA[i18n]]></category>
		<category><![CDATA[language]]></category>
		<category><![CDATA[standards]]></category>

		<guid isPermaLink="false">http://blog.i18n.ro/using-the-proper-language-codes/</guid>
		<description><![CDATA[How to choose the proper language and locale codes when localizing? If you localize only for the macro-language use the macro language code. If you have only one English use just &#8220;en&#8221; code but if you have more than one &#8230; <a href="http://blog.i18n.ro/simplified-locale-codes/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<h2>How to choose the proper language and locale codes when localizing?</h2>
<p>If you localize only for the macro-language, use the macro-language code. If you ship a single English translation, use just &#8220;en&#8221;. If you ship more than one, &#8220;en&#8221; will mean &#8220;en-US&#8221; and the other variants need more specific codes such as &#8220;en-CA&#8221;. The Unicode website gives more details in its <a href="http://cldr.unicode.org/index/cldr-spec/picking-the-right-language-code">picking the right language code</a> article.<span id="more-331"></span></p>
<p>The table below shows the language codes that I recommend. The principle is simple: use the simplest code that does not cause confusion. For languages not listed here, consult <a href="http://cldr.unicode.org/index/cldr-spec/picking-the-right-language-code">this article</a>. If you have questions, please email me; I will investigate and extend the table below with other cases.</p>
<div>
<table style="border-collapse: collapse;" border="0">
<colgroup>
<col style="width: 133px;"></col>
<col style="width: 127px;"></col>
<col style="width: 221px;"></col>
<col style="width: 161px;"></col>
</colgroup>
<tbody>
<tr>
<td style="padding-left: 7px; padding-right: 7px; border-top: solid #4f81bd 1.0pt; border-left: none; border-bottom: solid #4f81bd 1.0pt; border-right: none;"><span style="color: #376092;"><strong>Language</strong></span></td>
<td style="padding-left: 7px; padding-right: 7px; border-top: solid #4f81bd 1.0pt; border-left: none; border-bottom: solid #4f81bd 1.0pt; border-right: none;"><span style="color: #376092;"><strong>Recommended code<br />
&#8220;canonical&#8221;</strong></span></td>
<td style="padding-left: 7px; padding-right: 7px; border-top: solid #4f81bd 1.0pt; border-left: none; border-bottom: solid #4f81bd 1.0pt; border-right: none;"><span style="color: #376092;"><strong>Explanation</strong></span></td>
<td style="padding-left: 7px; padding-right: 7px; border-top: solid #4f81bd 1.0pt; border-left: none; border-bottom: solid #4f81bd 1.0pt; border-right: none;"><span style="color: #376092;"><strong>Other valid codes</strong></span></td>
</tr>
<tr style="background: #d3e0ef;">
<td style="padding-left: 7px; padding-right: 7px; border-left: none; border-right: none;"><span style="color: #376092;"><strong>Chinese (Simplified)<br />
<em>generic</em></strong></span></td>
<td style="padding-left: 7px; padding-right: 7px; border-left: none; border-right: none;"><span style="color: #376092;">zh</span></td>
<td style="padding-left: 7px; padding-right: 7px; border-left: none; border-right: none;"><span style="color: #376092;">This is a macrolanguage. It maps to Chinese (Simplified) (sometimes referred to as Mandarin) because this is the predominant language. [<a href="http://meta.wikimedia.org/wiki/Automatic_conversion_between_simplified_and_traditional_Chinese">ref1</a>]</span></td>
<td style="padding-left: 7px; padding-right: 7px; border-left: none; border-right: none;"><span style="color: #376092;">zh-hans<br />
zh-cn<br />
zh-sg<br />
zh-hans-cn<br />
zh-hans-sg</span></td>
</tr>
<tr>
<td style="padding-left: 7px; padding-right: 7px;"><span style="color: #376092;"><strong>Chinese (Traditional)</strong></span></td>
<td style="padding-left: 7px; padding-right: 7px;"><span style="color: #376092;">zh-hant</span></td>
<td style="padding-left: 7px; padding-right: 7px;"><span style="color: #376092;">It is a good idea not to specify the region because there are several regions where Traditional Chinese is present so specifying only the script is better.</span></td>
<td style="padding-left: 7px; padding-right: 7px;"><span style="color: #376092;">zh-hant<br />
zh-tw<br />
zh-hk<br />
zh-mo<br />
zh-hant-tw<br />
zh-hant-hk<br />
zh-hant-mo</span></td>
</tr>
<tr style="background: #d3e0ef;">
<td style="padding-left: 7px; padding-right: 7px; border-left: none; border-right: none;"><span style="color: #376092;"><strong>English (US)<br />
<em>generic</em></strong></span></td>
<td style="padding-left: 7px; padding-right: 7px; border-left: none; border-right: none;"><span style="color: #376092;">en</span></td>
<td style="padding-left: 7px; padding-right: 7px; border-left: none; border-right: none;"><span style="color: #376092;">Use this code if you have only one English translation or if the translation is American English. American English is the predominant variant, so the &#8220;en&#8221; code will auto-map to en-us.</span></td>
<td style="padding-left: 7px; padding-right: 7px; border-left: none; border-right: none;"><span style="color: #376092;">en-us</span></td>
</tr>
<tr>
<td style="padding-left: 7px; padding-right: 7px;"><span style="color: #376092;"><strong>English (UK)</strong></span></td>
<td style="padding-left: 7px; padding-right: 7px;"><span style="color: #376092;">en-gb</span></td>
<td style="padding-left: 7px; padding-right: 7px;"></td>
<td style="padding-left: 7px; padding-right: 7px;"><span style="color: #376092;">en-uk (not a valid code; accept it only for compatibility)</span></td>
</tr>
<tr style="background: #d3e0ef;">
<td style="padding-left: 7px; padding-right: 7px; border-left: none; border-right: none;"><span style="color: #376092;"><strong>Portuguese (Brazilian)<br />
<em>generic</em></strong></span></td>
<td style="padding-left: 7px; padding-right: 7px; border-left: none; border-right: none;"><span style="color: #376092;">pt</span></td>
<td style="padding-left: 7px; padding-right: 7px; border-left: none; border-right: none;"><span style="color: #376092;">Use pt instead of pt-br to enable easy fallback when you do not have a <strong>pt-pt</strong> translation.</span></td>
<td style="padding-left: 7px; padding-right: 7px; border-left: none; border-right: none;"><span style="color: #376092;">pt-br</span></td>
</tr>
<tr>
<td style="padding-left: 7px; padding-right: 7px;"><span style="color: #376092;"><strong>Portuguese (Portugal)</strong></span></td>
<td style="padding-left: 7px; padding-right: 7px;"><span style="color: #376092;">pt-pt</span></td>
<td style="padding-left: 7px; padding-right: 7px;"><span style="color: #376092;">Do not use just &#8220;pt&#8221; if you have translations for both Brazilian Portuguese and Portugal Portuguese.</span></td>
<td style="padding-left: 7px; padding-right: 7px;"></td>
</tr>
<tr style="background: #d3e0ef;">
<td style="padding-left: 7px; padding-right: 7px; border-left: none; border-right: none;"><span style="color: #376092;"><strong>Romanian</strong></span></td>
<td style="padding-left: 7px; padding-right: 7px; border-left: none; border-right: none;"><span style="color: #376092;">ro</span></td>
<td style="padding-left: 7px; padding-right: 7px; border-left: none; border-right: none;"><span style="color: #376092;">Romanian has an ISO 639-1 code, so there is no need to use a more complex code like the ones specified in ISO 639-2 or ISO 639-3.</span></td>
<td style="padding-left: 7px; padding-right: 7px; border-left: none; border-right: none;"><span style="color: #376092;">ro-ro<br />
ro-latn-ro</span></td>
</tr>
<tr>
<td style="padding-left: 7px; padding-right: 7px;"><span style="color: #376092;"><strong>Spanish (Spain)</strong></span></td>
<td style="padding-left: 7px; padding-right: 7px;"><span style="color: #376092;">es</span></td>
<td style="padding-left: 7px; padding-right: 7px;"><span style="color: #376092;">es-es is considered the predominant variant.</span></td>
<td style="padding-left: 7px; padding-right: 7px;"><span style="color: #376092;">es-es</span></td>
</tr>
<tr style="background: #d3e0ef;">
<td style="padding-left: 7px; padding-right: 7px; border-left: none; border-bottom: solid #4f81bd 1.0pt; border-right: none;"><span style="color: #376092;"><strong>French (France)</strong></span></td>
<td style="padding-left: 7px; padding-right: 7px; border-left: none; border-bottom: solid #4f81bd 1.0pt; border-right: none;"><span style="color: #376092;">fr</span></td>
<td style="padding-left: 7px; padding-right: 7px; border-left: none; border-bottom: solid #4f81bd 1.0pt; border-right: none;"><span style="color: #376092;">fr-fr is considered the predominant variant.</span></td>
<td style="padding-left: 7px; padding-right: 7px; border-left: none; border-bottom: solid #4f81bd 1.0pt; border-right: none;"><span style="color: #376092;">fr-fr</span></td>
</tr>
</tbody>
</table>
</div>
<p>For most languages it is safe to use the two-letter code. This works without problems for Arabic (<strong>ar</strong>), Czech (<strong>cs</strong>), Danish (<strong>da</strong>), German (<strong>de</strong>), Greek (<strong>el</strong>), Finnish (<strong>fi</strong>), Hebrew (<strong>he</strong>), Hungarian (<strong>hu</strong>), Italian (<strong>it</strong>), Japanese (<strong>ja</strong>), Korean (<strong>ko</strong>), Norwegian (<strong>nb</strong>), Dutch (<strong>nl</strong>), Polish (<strong>pl</strong>), Romanian (<strong>ro</strong>), Russian (<strong>ru</strong>), Swedish (<strong>sv</strong>), Turkish (<strong>tr</strong>), Ukrainian (<strong>uk</strong>).</p>
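<p>As a rough illustration, the recommendations in the table can be written down as a lookup from accepted variants to the recommended code. This is only a sketch: the names <code>CANONICAL</code> and <code>canonical_code</code> are made up for this example, and the Simplified Chinese script-plus-region entries are written with the <code>hans</code> script subtag.</p>

```python
# Hypothetical lookup built from the table: each accepted variant (lowercased)
# points at the recommended "canonical" code.
CANONICAL = {
    # Chinese (Simplified): script and/or region variants collapse to "zh"
    "zh-hans": "zh", "zh-cn": "zh", "zh-sg": "zh",
    "zh-hans-cn": "zh", "zh-hans-sg": "zh",
    # Chinese (Traditional): keep the script, drop the region
    "zh-tw": "zh-hant", "zh-hk": "zh-hant", "zh-mo": "zh-hant",
    "zh-hant-tw": "zh-hant", "zh-hant-hk": "zh-hant", "zh-hant-mo": "zh-hant",
    "en-us": "en",
    "en-uk": "en-gb",  # accepted for compatibility only
    "pt-br": "pt",
    "ro-ro": "ro", "ro-latn-ro": "ro",
    "es-es": "es",
    "fr-fr": "fr",
}


def canonical_code(code):
    """Return the recommended code for a tag, or the normalized tag itself."""
    # Accept both "zh_CN" and "zh-CN" spellings, in any letter case.
    normalized = code.strip().replace("_", "-").lower()
    return CANONICAL.get(normalized, normalized)


print(canonical_code("zh_CN"))  # zh
print(canonical_code("pt-BR"))  # pt
print(canonical_code("de"))     # de (not in the table, returned unchanged)
```

<p>Codes that are not in the lookup are returned unchanged, which matches the advice that the plain two-letter code is enough for most languages.</p>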
<h2>Matching language codes</h2>
<p>Your application is localized in a number of languages, and the system (OS or browser) reports one or more language codes that do not exactly match your list. How do you make the best selection?</p>
<p>I think it is wise to reuse the <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html">language tags</a> defined by the HTTP specification.</p>
<p>If you are building a browser application, you get more detailed information (<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html">see RFC2616</a>) about the user&#8217;s language preferences: &#8220;en, es, de, ja, zh-TW&#8221;, or even a list with preference factors (0.0-1.0): &#8220;en, es;q=0.8, de;q=0.7, ja;q=0.3, zh-TW;q=0.1&#8221;.</p>
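<p>To make the header format concrete, here is a minimal sketch that turns such a value into a list of tags ordered by preference. The function name <code>parse_accept_language</code> is hypothetical, and treating a malformed <code>q</code> value as &#8220;not wanted&#8221; is my own assumption, not something the RFC prescribes.</p>

```python
def parse_accept_language(header):
    """Return the language tags of an Accept-Language value, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = [part.strip() for part in item.split(";")]
        tag = parts[0].lower()
        if not tag:
            continue
        quality = 1.0  # RFC 2616: a missing q parameter means q=1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # assumption: a malformed q means "not wanted"
        if quality == 0.0:
            continue  # q=0 marks a language as not acceptable
        # Sort by descending quality; keep header order for equal qualities.
        weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


print(parse_accept_language("en, es;q=0.8, de;q=0.7, ja;q=0.3, zh-TW;q=0.1"))
# ['en', 'es', 'de', 'ja', 'zh-tw']
```

<p>The tags are lowercased on the way out so that they can be compared directly against a lowercase list of available translations.</p>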
<p>So the only remaining problem is to match what you have against what the system reports. The matching is not trivial because you usually do not know the exact form reported by the system. Codes like &#8220;zh-TW&#8221; should map to &#8220;zh-hant&#8221;, and &#8220;zh-CN&#8221; or &#8220;zh-hans&#8221; should map to &#8220;zh&#8221; (Simplified Chinese). Mapping &#8220;zh-TW&#8221; to &#8220;zh&#8221; is not allowed, even if you have only one Chinese translation available.</p>
<p>Soon, I will complete this article with a matching algorithm implemented in Python so anyone can port it to their own language.</p>
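<p>In the meantime, here is one possible sketch of the matching rules described above. It is an illustration only, not the complete algorithm: the names <code>ALIASES</code>, <code>NO_FALLBACK</code>, <code>candidates</code> and <code>best_match</code> are hypothetical, and the list of available translations is assumed to already use the recommended codes (for example &#8220;zh&#8221; and &#8220;zh-hant&#8221;, not &#8220;zh-cn&#8221;).</p>

```python
# Assumed aliases: Chinese tags rewritten to the recommended codes.
ALIASES = {
    "zh-cn": "zh", "zh-sg": "zh", "zh-hans": "zh",
    "zh-tw": "zh-hant", "zh-hk": "zh-hant", "zh-mo": "zh-hant",
}

# Tags that must never be shortened: "zh" means Simplified Chinese here,
# so a Traditional Chinese request must not fall back to it.
NO_FALLBACK = {"zh-hant"}


def candidates(tag):
    """Yield the normalized tag, then progressively shorter forms."""
    tag = tag.strip().replace("_", "-").lower()
    tag = ALIASES.get(tag, tag)
    yield tag
    while "-" in tag and tag not in NO_FALLBACK:
        tag = tag.rsplit("-", 1)[0]  # drop the last subtag: pt-pt becomes pt
        tag = ALIASES.get(tag, tag)
        yield tag


def best_match(requested, available, default=None):
    """Return the first available code that satisfies the requested tags."""
    supported = {code.lower() for code in available}
    for tag in requested:  # requested tags are assumed to be ordered best first
        for candidate in candidates(tag):
            if candidate in supported:
                return candidate
    return default


print(best_match(["zh-TW", "en"], ["en", "zh"]))  # en
print(best_match(["pt-PT"], ["pt", "en"]))        # pt
```

<p>In the first call the user prefers Traditional Chinese, but only &#8220;zh&#8221; (Simplified) is available, so the sketch skips it and picks the next preference instead.</p>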
<h2>Resources</h2>
<ul>
<li><a href="http://en.wikipedia.org/wiki/IETF_language_tag">IETF language tags – defined by BCP-47 (used by HTTP, HTML, XML,…)</a></li>
<li><a href="http://www.w3.org/International/articles/language-tags/Overview.en.php">Language tags in HTML and XML (W3C)</a></li>
<li><a href="http://rishida.net/utils/subtags/">Language Subtag Lookup (W3C)</a></li>
<li><a href="http://www.langtag.net/registries.html">IANA listings of codes for languages, scripts, regions, variants and redundant codes in text, xml and SQL format (langtag.net)</a></li>
<li><a href="http://www.thefutureoftheweb.com/blog/use-accept-language-header">PHP code that detects and selects proper languages based on Accept-Language header</a></li>
<li><a href="http://www.ethnologue.com/web.asp">Ethnologue.com</a> – encyclopedic reference of languages.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://blog.i18n.ro/simplified-locale-codes/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>i18n mistake #1: Using images for representing languages</title>
		<link>http://blog.i18n.ro/i18n-mistake-1-using-images-for-representing-languages/</link>
		<comments>http://blog.i18n.ro/i18n-mistake-1-using-images-for-representing-languages/#comments</comments>
		<pubDate>Sat, 17 Jan 2009 01:32:53 +0000</pubDate>
		<dc:creator>sorin</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[mistakes]]></category>
		<category><![CDATA[standards]]></category>

		<guid isPermaLink="false">http://blog.i18n.ro/?p=145</guid>
		<description><![CDATA[Clearly adding images for representing languages is not the most important internationalization issue someone can make. In fact I added #1 because this is the first internationalization mistake I decided to blog about and I want to document many more &#8230; <a href="http://blog.i18n.ro/i18n-mistake-1-using-images-for-representing-languages/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Clearly, using images to represent languages is not the most serious internationalization mistake someone can make. In fact, I added #1 because this is the first internationalization mistake I decided to blog about, and I want to document many more mistakes in the future.<span id="more-145"></span>A quick solution would be to use images with <a href="http://en.wikipedia.org/wiki/ISO_639-1">ISO 639-1</a> language codes like: en, fr, de, es, ja. But be aware that not all languages can be represented using ISO 639-1, and you may need to use ISO 639-3 or, more simply, add a two-letter country subcode <a class="normref" rel="biblioentry" href="http://www.w3.org/TR/html401/references.html#ref-ISO3166">[ISO3166]</a>, like en-us or en-gb.</p>
<p>Also check the list every time you add a language because it&#8217;s <strong>another common mistake to assume that two letter domain names or country codes are the same as the language codes</strong>.</p>
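<p>A few well-known pairs show how easily the two kinds of codes get mixed up. The sketch below is only an illustration; the names <code>COUNTRY_TO_LANGUAGE</code> and <code>language_for_country</code> are made up for this example, and the table is far from complete.</p>

```python
# Country codes (ISO 3166-1 alpha-2) whose main language has a different
# ISO 639-1 code. Illustrative, incomplete list.
COUNTRY_TO_LANGUAGE = {
    "jp": "ja",  # Japan / Japanese
    "cz": "cs",  # Czech Republic / Czech
    "dk": "da",  # Denmark / Danish
    "gr": "el",  # Greece / Greek
    "se": "sv",  # Sweden / Swedish ("se" as a language code is Northern Sami)
    "ua": "uk",  # Ukraine / Ukrainian ("uk" is a language, not the United Kingdom)
}


def language_for_country(country_code):
    """Return the language code for a country code, or None if unknown."""
    return COUNTRY_TO_LANGUAGE.get(country_code.strip().lower())


print(language_for_country("JP"))  # ja
print(language_for_country("ua"))  # uk
```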
<p>An even better idea could be to not use images at all <img src='http://blog.i18n.ro/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /> </p>
<p>You may find additional resources at: <a href="http://www.sil.org/iso639-3/codes.asp?order=639_1">sil.org</a> or at <a href="http://www.sil.org/iso639-3/codes.asp?order=639_1">W3C &#8211; using lang</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.i18n.ro/i18n-mistake-1-using-images-for-representing-languages/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

