Google has just announced more improvements to their free translation services. To celebrate, we are releasing the best way to introduce Google Automated Translation to your website, using a self-updating drop-down list with a few featured languages.
Google now offers free translation services in 51 languages, which together cover over 98% of all users. That is simply magnificent! The newest improvements, detailed in the blog entry, include text-to-speech (hear the translation spoken to you) and what they call “romanization” (i.e., a phonetic transcription of the word). Google Translate was also integrated into Google Docs a while back.
If you speak a language other than English, there’s a good chance that you can use Google Translation to translate virtually any webpage or document into your language. Yet less than 2 years ago, the state of automated translation was rather sad. Most systems had been licensed from Systran and were based on fixed rules. Google turned this microcosm upside down by introducing statistics-based translation on a large scale. Though the results are still mostly erroneous, they will improve in time as they are corrected.
Most state-of-the-art, commercial machine-translation systems in use today have been developed using a rule-based approach, and require a lot of work to define vocabularies and grammars. Our system takes a different approach: we feed the computer billions of words of text, both monolingual text in the target language, and aligned text consisting of examples of human translations between the languages. We then apply statistical learning techniques to build a translation model. (1)
As soon as Google introduced statistical translation, Microsoft dropped Systran and adopted the Google model (2). Windows Live Translator explains the limitations of automated translation quite well (3):
automatic translation enables you to understand the gist of foreign language text, but is no substitute for a professional human translator if fluency is required
Here’s the video featured in today’s announcement:
People who are not fluent in English form the majority in this world, and they are learning to use the Internet at a faster pace than they learn English. Google offers a small and unobtrusive widget, in the form of a drop-down list, that you can use on a website. The problem with it is that it is text-only: no flags are shown. Though some people (6 Hietaniemi) have argued that flags are not a good idea, my view is that one can feature a few languages by flags, while still giving the user the opportunity to translate into even more languages using the Google Translation widget. While a non-native speaker might overlook the Google Translate drop-down list, the flags should be more visible and instantly recognizable.
Here is the code for the widget that can be seen on this page. You are free to use the code on your page. I will also explain how to further customize it. Unlike many other similar widgets, this one has its flags hosted by Google. This means that the flags will not slow down your webpage and, as more people implement this widget, the flags are more likely to be cached already and load faster.
In the code above, I am featuring several languages, based on where my visitors come from. You may have a different demographic, and as such, may choose to use different countries. The basic repeating flag unit is as follows:
<input onclick="this.form.langpair.value=this.value" title="Local / Language" value="en|Code" type="image" height="20" src="http://www.google.com/images/flags/Flag_flag.gif" width="30" name="langpair"/>
The substitutions are straightforward; just make sure you do not confuse the Language Code with the Flag Code: they are sometimes the same but mostly different. Also, you can sometimes use different flags for the same language. For instance, Portuguese is spoken in both Brazil and Portugal. The same is true for Romanian, which is spoken in Romania and the Republic of Moldova, and so on. It seems that Google is not hosting a flag for Catalan, even though it offers an interface and translation into that language, so I would use the Spanish flag, or a locally hosted flag if that is important to you.
|Language|Local|Flag Code|Language Code|
|English|jolly cup o’tea|uk (us)|en|
For example, if you would like to feature Vietnamese, the substitution gives the following result:
<input onclick="this.form.langpair.value=this.value" title="tiếng Việt / Vietnamese" value="en|vi" type="image" height="20" src="http://www.google.com/images/flags/vn_flag.gif" width="30" name="langpair"/>
You could further customize it by changing the size of the flag, currently 30x20. Other good width-by-height pairs are 50x34 (original and larger), 75x51 (very large), 25x17 (smaller), and 15x10 (barely visible).
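If you feature more than a handful of languages, writing each flag unit by hand gets tedious. Below is a minimal sketch of a helper that fills in the template for you. The attribute layout and the flag URL pattern are taken from the snippet above; the function name flagInput and its parameter names are my own and purely illustrative.

```javascript
// Build one flag button for the widget.
// The markup mirrors the repeating flag unit shown above; the function and
// parameter names are hypothetical and not part of any Google API.
function flagInput({ source = "en", target, flag, title, width = 30, height = 20 }) {
  const src = `http://www.google.com/images/flags/${flag}_flag.gif`;
  return (
    `<input onclick="this.form.langpair.value=this.value" title="${title}" ` +
    `value="${source}|${target}" type="image" height="${height}" ` +
    `src="${src}" width="${width}" name="langpair"/>`
  );
}

// Vietnamese: language code "vi", flag code "vn" (note that they differ).
const vietnamese = flagInput({
  target: "vi",
  flag: "vn",
  title: "tiếng Việt / Vietnamese",
});

// The same flag at the larger 50x34 size.
const vietnameseLarge = flagInput({
  target: "vi",
  flag: "vn",
  title: "tiếng Việt / Vietnamese",
  width: 50,
  height: 34,
});
```

With the defaults, the Vietnamese call reproduces the line shown above. The helper does no escaping, so a title containing a double quote would need to be encoded first.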
If your website is, unlike this one, in a language other than English, you might want to consider featuring only English with a flag and leaving the rest to the regular Google widget. This is because translation between languages other than English is a 2-step process, routing through English. For example, to translate from German to Chinese, Google will first translate from German to English, then from English to Chinese. Since automated translation is usually riddled with errors, the 2-step process compounds them, increasing the chances of a poor translation.
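To get a feel for why two steps are worse than one, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that each step keeps a sentence intact with some fixed probability and that the two steps fail independently; real translation errors do not behave this neatly, and the 0.8 figure is made up rather than measured.

```javascript
// Illustrative only: assumes the two translation steps fail independently.
// The probabilities used below are hypothetical, not measurements.
function pivotAccuracy(sourceToEnglish, englishToTarget) {
  return sourceToEnglish * englishToTarget;
}

// If each step kept 8 sentences out of 10 intact, routing through English
// would keep roughly 6 out of 10.
const oneStep = 0.8;
const twoSteps = pivotAccuracy(0.8, 0.8);
console.log(oneStep.toFixed(2), twoSteps.toFixed(2));
```

Under these assumptions the two-step figure is always lower than either single step, which is the point of featuring only English when your source language is something else.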
Even with English as one of the “end languages”, translation is not all that accurate, and this is perfectly illustrated by Translation Party (9). This site will translate some text back and forth until it no longer changes – i.e., it has been successively dumbed down until it no longer makes sense and can no longer be “stupified”. This state of grace is called “translation equilibrium”.
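The back-and-forth idea is easy to sketch as a loop that stops once a round trip no longer changes the text. This is only my reading of the description above, not Translation Party's actual code. The two translator functions are hypothetical stand-ins passed in by the caller; no real translation service is called.

```javascript
// Translate text back and forth until it stops changing.
// `there` and `back` are hypothetical stand-ins for real translation calls.
function findEquilibrium(text, there, back, maxRounds = 20) {
  let current = text;
  for (let round = 1; round <= maxRounds; round++) {
    const next = back(there(current));
    if (next === current) {
      // A full round trip changed nothing: equilibrium reached.
      return { text: current, rounds: round - 1, stable: true };
    }
    current = next;
  }
  // Gave up before the text settled.
  return { text: current, rounds: maxRounds, stable: false };
}

// Toy "translators" that lose information on each pass, just to exercise
// the loop: the outbound step drops the last word of longer phrases.
const there = (s) => {
  const words = s.split(" ");
  return words.length > 2 ? words.slice(0, -1).join(" ") : s;
};
const back = (s) => s;

const result = findEquilibrium("a jolly good cup of tea", there, back);
console.log(result.text, result.rounds, result.stable);
```

The maxRounds cap is there because nothing guarantees that a real pair of translators ever settles; without it the loop could bounce between two variants forever.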
To adapt the above widget for your own website, you first have to change <input value="en" name="hl" type="hidden"/>, which can be found at the top, right underneath the first document.write. Change “en” to the language code in the table above corresponding to your source language. You will also have to replace en with the same code in every language pair. The quickest way to accomplish this is to load the text in Notepad, then do a Search & Replace for en; review each match, since a blind replace would also alter words such as hidden. Below, you may find an example of what you get when you substitute “en” with “ro”:
Don’t forget to include the Google drop-down list code, replacing “ro” with your website’s source language (you may also change the height and width as you see fit):
Alternatively, you can further customize the drop-down code by going to (7-google-lang-tools) in Sources / More info.
Like the Romanian version, this code is licensed with a Creative Commons Attribution license – which means that you can modify it as you see fit as long as you leave the invisible comments inside unmodified. A visible link to this blog is appreciated, but not necessary.
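If you would rather script the source-language substitution described earlier than do it by hand, here is a minimal sketch. It rewrites only the two places where the source code appears as an attribute value, value="en" and value="en|xx", and leaves everything else alone. The function name retargetSource and the two-line sample markup are made up for the illustration; run it against your own copy of the widget.

```javascript
// Swap the source language in widget markup without touching unrelated text.
// Only value="en" (the hidden hl field) and value="en|xx" (language pairs)
// are rewritten. The function name is hypothetical.
function retargetSource(markup, from, to) {
  const pattern = new RegExp(`value="${from}(\\||")`, "g");
  return markup.replace(pattern, `value="${to}$1`);
}

// A shortened stand-in for the widget markup.
const sample =
  '<input value="en" name="hl" type="hidden"/>' +
  '<input value="en|vi" type="image" name="langpair"/>';

// What a blind Search & Replace of "en" does: it also mangles "hidden".
const naive = sample.split("en").join("ro");

// The targeted version only touches the language codes.
const careful = retargetSource(sample, "en", "ro");
console.log(naive);
console.log(careful);
```

The blind version turns type="hidden" into type="hiddro", which is exactly the kind of slip that is easy to miss in Notepad.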
If you are concerned that your secrets are vulnerable to the onslaught of translation bots, you can enclose them in an element with class="notranslate", e.g., <span class='notranslate'>Whatever you don’t want translated</span>. You can even prevent Google from translating an entire page by placing <meta name="google" value="notranslate"> in the page’s head section.
The more people use Google Translate and the more documents are fed into it, the better it becomes, so spread the news and the knowledge!
Sources / More info: 0-traduceri-zamolxis, 1-google-faq-trans, 2-msdn-ms-trans, 3-WL-trans, 4-gs-more-lang, 5-lang-tools, 6-hietaniemi-no-flags, 7-google-lang-tools, 8-google-blog-new, 9-translation-party, 10-five-common-gt-problems