Google Working with Wikipedia to Translate 'Smaller Languages'
Google said July 14 that it is working with Wikipedia contributors, translators and "Wikipedians" (I assume these are users) across India, the Middle East and Africa to translate more than 16 million words for Wikipedia into Arabic, Gujarati, Hindi, Kannada, Swahili, Tamil and Telugu. While Wikipedia has so-called common languages such as English, German and French covered with millions of articles, there is a paucity of pieces translated into the aforementioned smaller languages.
The motive is obvious: Google and Wikipedia have a shared interest in organizing information and content and making it easily consumable for Web users all over the world. The method is less obvious. Google has made great progress with Hindi, arguably one of the larger smaller languages, and it apparently uses Google Trends to pinpoint content and then the Translator Toolkit to translate it for Wikipedia. Google Product Manager Michael Galvez explained the company's method for picking which Hindi Wikipedia articles get translated:
Google then washed, rinsed and repeated for other smaller languages to bring its total number of words translated to 16 million. Pretty impressive, right? Check out the graph for the number of non-stub Wikipedia articles by Internet users:
See more info here, from the presentation Galvez gave at Wikimania in Poland last week. There was a time when information barriers were inherent and assumed thanks to language gaps. That time is gradually coming to a close, thanks to Google's translation efforts. There's something exciting and a little scary about the universalization of Google and Wikipedia. Of course, Google has a long way to go because machine translation is an imprecise practice and a tough nut to crack.

Comments (1)
Uhh, why did google trends help in any way?
stats.grok.se would be more useful.
Also computer translation generally sucks, and wikipedia tends to self-adapt to high traffic articles anyway.
Posted by Anony Moose | July 16, 2010 7:59 AM