Translation the Wiki Way

The need for multilingual wiki content is growing. The means to maintain translations across multiple languages and ever-changing content, however, are left to humans and clumsy interfaces. A project by Alain Desilets, Lucas Gonzalez, Sebastien Paquet and Marta Stojanovic at the Institute for Information Technology (IIT) of the National Research Council (NRC) recognizes that not everyone is fluent in English and sets out to implement and evaluate lightweight tools and processes for translating collaborative content.

Traditionally, translation occurs in one of three ways:

  1. Sequential Translation — The text is written in English first and edited until it is 100% accurate. Only then is it translated. This is the most popular approach, particularly with large corporations needing multilingual versions of technical manuals. The process is very controlled, but it is too regimented and slow for a wiki. Also, not all contributors can be expected to be fluent in English.
  2. Parallel Authoring — This is the model used by Wikipedia, in which the translators are all briefed at the same time but write independently. The plus is that the content is written quickly, for a particular audience, from the start. However, there is very little synergy between languages. Each new version essentially reinvents the same wheel.
  3. Incremental Just-In-Time Translation — In this model, requests for translation are issued automatically as changes occur (see the sketch after this list). There is real synchronization between languages, even with fast-changing content. But this, too, assumes that a “master” language exists (e.g., English). Machine translation today is weak, so the quality of the translation is also called into question.

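To make the third model concrete, here is a minimal sketch of a change-triggered workflow, assuming a single master language. The Page, TranslationRequest and JustInTimeWiki names are illustrative assumptions, not the actual TTWW or LizzyWiki design.

```python
# Minimal sketch of incremental just-in-time translation: saving a page in
# the master language queues a translation request for every other language.
# All names here are hypothetical, not taken from the paper.

from dataclasses import dataclass


@dataclass
class Page:
    title: str
    language: str
    body: str
    revision: int = 0


@dataclass
class TranslationRequest:
    title: str
    target_language: str
    source_revision: int


class JustInTimeWiki:
    def __init__(self, master_language: str, languages: list[str]):
        self.master_language = master_language
        self.languages = languages
        self.pending: list[TranslationRequest] = []

    def save(self, page: Page, new_body: str) -> None:
        """Record an edit; edits to the master language automatically
        issue translation requests for every other language."""
        page.body = new_body
        page.revision += 1
        if page.language == self.master_language:
            for lang in self.languages:
                if lang != self.master_language:
                    self.pending.append(
                        TranslationRequest(page.title, lang, page.revision))


wiki = JustInTimeWiki("en", ["en", "fr"])
home = Page("HomePage", "en", "Welcome")
wiki.save(home, "Welcome to the wiki")
print(wiki.pending)  # one pending request: HomePage rev 1 into fr
```
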
The Translation the Wiki Way (TTWW) team concludes that a new process is needed, one built from the end-user perspective. An early prototype uses LizzyWiki to deal with bilingual content. (The NRC-IIT previously did studies on storytelling in primary schools and on wiki usability.) LizzyWiki is based on the original Perl wiki engine written by Ward Cunningham. The project examines the tool from the vantage points of Visitors (those interested primarily in reading), Authors (those interested in creating and changing content), and Translators (those interested in translating existing content).

In this system, a warning message is displayed when the page being viewed has a more recent version in the other language. Readers can use this as a guide to the state of the content, and translators can also treat it as a request to translate a recent change. A new page can be written first in whichever language the author is most comfortable with. The tool will even check the other language to see if the content is already available there (for example, by listing pages whose parent has a translation but which have none themselves). When translating, the two languages appear in side-by-side text boxes, so the translator always has a handy reference to the original text.
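
As a rough illustration, that freshness warning could be driven by nothing more than comparing last-modified times between the two language versions. This sketch assumes that simple model; it is not LizzyWiki's actual code.

```python
# Sketch of the staleness warning: if the counterpart page in the other
# language changed more recently, flag this page as possibly out of date.
# The timestamp-comparison model is an assumption, not LizzyWiki's internals.

from datetime import datetime
from typing import Optional


def staleness_warning(this_modified: datetime,
                      other_modified: datetime,
                      other_language: str) -> Optional[str]:
    """Return a reader-facing warning (and implicit translation request)
    when the other-language version is newer, else None."""
    if other_modified > this_modified:
        return (f"The {other_language} version of this page has changed "
                f"more recently; this content may be out of date.")
    return None


# Example: the French page was edited after the English one.
msg = staleness_warning(datetime(2006, 8, 1), datetime(2006, 8, 15), "French")
print(msg)
```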

One shortcoming is that TTWW doesn’t scale well beyond two languages. Machine translations (“crappy at the moment,” in the researchers’ words) can at least help show what has changed since the last translation. Even if what the computer generates isn’t perfect, it may be enough to convey the differences and help the translator proceed. The tool distances itself from the notion of English as a master language, but the researchers concede that a “pivot” language (probably English) could still be needed to avoid a game of telephone, in which meaning gets distorted as translations of translations accumulate.
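
One way to read that suggestion: machine-translate only the lines that changed since the last translation, so even rough output tells the translator what needs redoing. This is a hedged sketch of that idea; machine_translate() is a placeholder stub rather than a real service, and the diff-based approach is an interpretation, not the paper's implementation.

```python
# Sketch of using rough machine translation to surface what changed since
# the last translation. machine_translate() is a placeholder stub, not a
# real MT service; difflib is from the Python standard library.

import difflib


def machine_translate(text: str, target_language: str) -> str:
    # Stand-in for an imperfect MT service; the output only needs to be
    # good enough to convey what changed.
    return f"[{target_language} MT] {text}"


def changed_lines(old_source: str, new_source: str) -> list[str]:
    """Lines added or altered in the source page since it was last translated."""
    diff = difflib.unified_diff(old_source.splitlines(),
                                new_source.splitlines(), lineterm="")
    return [line[1:] for line in diff
            if line.startswith("+") and not line.startswith("+++")]


def translation_hints(old_source: str, new_source: str, lang: str) -> list[str]:
    # Rough-translate only the changed lines; the human translator then
    # rewrites them properly in the target-language page.
    return [machine_translate(line, lang)
            for line in changed_lines(old_source, new_source)]


old = "Welcome to the wiki.\nEdit any page."
new = "Welcome to the wiki.\nEdit any page.\nRecent changes are listed below."
print(translation_hints(old, new, "fr"))
# ['[fr MT] Recent changes are listed below.']
```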

For more information, see the WikiSym abstract or download the paper.

By Kevin Makice

A Ph.D student in informatics at Indiana University, Kevin is rich in spirit. He wrestles and reads with his kids, does a hilarious Christian Slater imitation and lights up his wife's days. He thinks deeply about many things, including but not limited to basketball, politics, microblogging, parenting, online communities, complex systems and design theory. He didn't, however, think up this profile.