The Star Malaysia

Google wants more native language content

- By GABEY GOH bytz@thestar.com.my

KUALA LUMPUR: Google is urging Malaysians to publish more content in their native languages to help improve Google Translate, its free service for translating a section of text, a document or a webpage into another language.

Google research scientist Ashish Venugopal admitted that the system is not perfect and said that one way people can help improve it is to publish more high-quality content in their native languages, giving the translation models more data to work with.

“It got to a point where it worked so well on average that our expectations changed and we expect it to be perfect all the time. But by putting more content online, especially bilingual documents, it enriches the source data and allows the system to learn,” he said.

The service relies on statistical machine translation, which involves pattern matching across documents that have existing translations hosted online.

Ashish added that because the process is automated, machines cannot discern differences in the severity of errors that would be glaring to humans.

The challenges in ensuring accurate translations include the complexity of the language, how much material is available and how easy it is to find corresponding words between documents.

Ashish said that traditional translation methods to date haven’t been able to scale up to the explosion of content and languages on the Internet, a problem Google Translate wishes to address via automated translations.

“It’s not a question of which is better, one just scales up better than the other,” he said, adding that the service currently supports 64 languages.

While not intended to be a replacement for quality translations done by people, the mission for Google Translate is to “break down the language barriers” online.

“We want to be able to translate the easy stuff so that professionals can focus on the high value translations. There is a big difference in translating what was said and being able to convey nuances such as sentiment and emotion,” he said, adding that a toolkit has also been released for professional translators and linguists.

When asked how explosive the growth of non-English content was, Ashish said that in 2010, Chinese-language content accounted for 22% of web content with a growth of 277%, while Arabic accounted for 3.3% of web content but had experienced an explosive 2,000% growth.

MULTI-LINGUAL: Google wants Malaysians to publish more content in their local languages.
