These are notes from a breakout session at the AMTA 2010 Workshop on Collaborative Translation and Crowdsourcing, Denver, Colorado, October 31st 2010.
Session moderator: Yakov Kronov.
This breakout session focused on a cluster of issues concerning the use of monolingual users in translation crowdsourcing contexts. The following issues in this cluster were raised during the brainstorming part of the workshop:
- What are the boundaries between monolingual post-editing and bilingual post-editing?
- How can translation crowdsourcing leverage second language learners, who may be beginners in a given source or target language?
- How can translation crowdsourcing leverage native monolingual speakers of the source or target language?
The University of Maryland has developed two systems, MonoTrans and ParaTrans, which employ monolinguals on either side of an MT system. Target language monolinguals flag error spans in the translated text, and these spans are mapped back to the corresponding passages in the source text. Source language monolinguals rephrase the problematic passages, which the MT system then retranslates. This process repeats until the target language monolingual is satisfied with the translation.
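The repair cycle described above can be sketched as a simple loop. Note that this is only an illustration of the general protocol, not the actual MonoTrans/ParaTrans implementation; the `mt_translate`, `flag_error_spans`, and `rephrase_source` functions are hypothetical placeholders.

```python
def monolingual_repair_loop(source, mt_translate, flag_error_spans,
                            rephrase_source, max_rounds=5):
    """Sketch of a MonoTrans-style repair loop (hypothetical API).

    mt_translate(text)           -> machine translation of text
    flag_error_spans(target)     -> spans the target language monolingual
                                    marks as unclear (empty = satisfied)
    rephrase_source(text, spans) -> source text rephrased by a source
                                    language monolingual around those spans
    """
    target = mt_translate(source)
    for _ in range(max_rounds):
        spans = flag_error_spans(target)
        if not spans:               # target monolingual is satisfied
            return target
        source = rephrase_source(source, spans)
        target = mt_translate(source)
    return target                   # give up after max_rounds
```

The `max_rounds` cap reflects the practical need to stop iterating even when no fully satisfactory translation is reached.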
A MITRE English-Korean chat translator ended up using a similar approach: if the recipient could not understand a translated message, they could ask the originator to rephrase it. In this fashion, effective dialog across the language boundary was achieved.
At Symantec, text is sent through a hybrid translation memory + machine translation system. Currently, bilingual post-editors are responsible for improving the translation while maintaining fidelity to the original. Symantec is considering using target language monolinguals for post-editing, mostly for translation of user-generated content, but has not yet started doing so.
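The core of such a hybrid pipeline is a translation memory lookup with MT as a fallback. The following is a minimal sketch of that routing decision, assuming an exact-match memory; a production system would also use fuzzy matching, which is omitted here.

```python
def hybrid_translate(segment, tm, mt_translate):
    """Translation memory lookup with MT fallback (sketch).

    tm           -- dict mapping source segments to stored translations
    mt_translate -- fallback function for segments with no TM match
    Returns the translation and which component produced it.
    """
    if segment in tm:                     # exact TM match: reuse it
        return tm[segment], "tm"
    return mt_translate(segment), "mt"    # otherwise fall back to MT
```

Segments routed through the "mt" path are the ones most likely to need post-editing, which is where the monolingual-versus-bilingual question arises.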
What can target language monolinguals do if they can’t understand the translation?
The basic idea of retranslating paraphrases can also work without a human source language assistant. Problem passages can be round-tripped through a third language, one that is neither the source nor the target, and back again; the resulting back translations can be treated as paraphrases of the original problematic passage. These paraphrases are then translated into the target language, giving the target language monolingual additional renderings to help make sense of the original translation.
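A minimal sketch of this pivot round-trip, assuming a generic `translate(text, src, tgt)` MT function (hypothetical, not a specific system's API) and English as the source language:

```python
def pivot_paraphrases(passage, translate, pivots=("fr", "de", "es")):
    """Generate paraphrase candidates of a problem passage by
    round-tripping it through pivot languages (sketch).

    translate(text, src, tgt) -- hypothetical MT function
    Returns distinct round-trip results, which can then be translated
    into the target language for the monolingual reader.
    """
    candidates = []
    for pivot in pivots:
        forward = translate(passage, "en", pivot)
        back = translate(forward, pivot, "en")
        # keep only genuinely different, non-duplicate paraphrases
        if back != passage and back not in candidates:
            candidates.append(back)
    return candidates
```

Each pivot language tends to force different lexical and structural choices, which is what makes the round trips useful as paraphrases.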
How do you evaluate the quality of the MT in this kind of context?
Traditional measures used by MT researchers include BLEU, TER, TERp, and METEOR. Task-based analysis, which evaluates how well end users were actually able to use or understand the material, is an alternative that more directly measures the ultimate goal.
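To illustrate what an n-gram overlap metric of this kind measures, here is a toy sentence-level BLEU using only unigrams and bigrams. Real BLEU uses n-grams up to length 4, multiple references, and smoothing, so this is a simplified sketch rather than the standard metric.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def simple_bleu(candidate, reference, max_n=2):
    """Toy BLEU: geometric mean of modified n-gram precisions
    times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # clip candidate counts by reference counts ("modified" precision)
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        precisions.append(overlap / max(sum(cand_counts.values()), 1))
    if min(precisions) == 0:
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # brevity penalty punishes candidates shorter than the reference
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * geo_mean
```

The contrast with task-based evaluation is visible even here: the metric rewards surface n-gram overlap with a reference, not whether a reader could actually use the output.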
How can source language monolinguals help?
First, they can use tools like Acrolinx to enforce style rules and convert variant source language sentences into a standardized form, thus increasing leverage both in the source document and in translation. Second, a skilled editor can head off issues that cause problems in translation, such as inconsistent terminology, awkward phrasing, and generally poor writing.
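A crude version of this kind of source-side normalization is a rule-based substitution pass. The rule table below is invented purely for illustration and does not reflect any actual Acrolinx rules; the point is that collapsing variant phrasings onto one preferred form makes repeated sentences identical and so improves translation memory leverage.

```python
import re

# Hypothetical style rules: map discouraged variants to a preferred form.
STYLE_RULES = {
    r"\bclick on\b": "click",
    r"\be-?mail\b": "email",
    r"\bin order to\b": "to",
}

def normalize_source(sentence):
    """Apply each style rule as a case-insensitive regex substitution."""
    for pattern, preferred in STYLE_RULES.items():
        sentence = re.sub(pattern, preferred, sentence, flags=re.IGNORECASE)
    return sentence
```

After normalization, two sentences that differed only in these variants hash to the same TM entry, which is where the leveraging gain comes from.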